TWI684956B - Object recognition and tracking system and method thereof - Google Patents


Info

Publication number
TWI684956B
TWI684956B
Authority
TW
Taiwan
Prior art keywords
tracking
object recognition
mobile device
module
template
Prior art date
Application number
TW107143429A
Other languages
Chinese (zh)
Other versions
TW202022803A (en)
Inventor
黃聖筑
林奕成
黃偉倫
盧奕丞
劉郁昌
劉旭航
林家煌
Original Assignee
中華電信股份有限公司
Priority date
Filing date
Publication date
Application filed by 中華電信股份有限公司 filed Critical 中華電信股份有限公司
Priority to TW107143429A priority Critical patent/TWI684956B/en
Priority to CN201811626054.3A priority patent/CN111275734B/en
Application granted granted Critical
Publication of TWI684956B publication Critical patent/TWI684956B/en
Publication of TW202022803A publication Critical patent/TW202022803A/en


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an object recognition and tracking system and a method thereof, wherein the system includes a server and a mobile device. A template construction module of the server constructs multiple templates with different viewing angles by projecting a three-dimensional model of the object. A feature extraction module of the server extracts the template features of these multiple templates. An object recognition and tracking module of the mobile device compares the template feature data to identify the object and its viewing angle, and tracks the viewing angle of the object using an iterative closest point (ICP) algorithm together with a hidden-surface removal method and a two-way correspondence check method. While the ICP algorithm executes, the hidden-surface removal method removes or ignores template features that cannot be observed from the current viewing angle of the object. When the ICP algorithm searches for the closest data for a template feature, the two-way correspondence check verifies in both directions whether two data points of the template feature are each other's closest data.

Description

Object recognition and tracking system and method thereof

The present invention relates to object recognition and tracking technology, and in particular to an object recognition and tracking system and method thereof.

One prior art proposes a moving-object tracking method and electronic device that receives multiple video streams from multiple cameras and derives an object's position and movement path by comparing different frames. However, this prior art can only track the translational position of an object in the image; it cannot recognize and track the object itself or determine its viewing angle.

Another prior art proposes a multi-tracker object tracking system that integrates several trackers (such as a contour tracker and an optical tracker) working together to obtain stable object tracking. However, this prior art has difficulty reducing the amount of computation required for object tracking.

Therefore, overcoming the above shortcomings of the prior art, namely recognizing and tracking an object and determining its viewing angle, or reducing the amount of computation required for object tracking, has become a major issue for those skilled in the art.

The present invention provides an object recognition and tracking system and method thereof, which can recognize and track an object and determine its viewing angle, or reduce the amount of computation required for object tracking.

The object recognition and tracking system of the present invention includes: a server having a template construction module and a feature extraction module, wherein the template construction module constructs multiple templates with different viewing angles by projecting a three-dimensional model of the object, and the feature extraction module extracts, analyzes, or condenses the template feature data of those templates; and a mobile device that obtains or downloads the template feature data from the server. The mobile device has an object recognition and tracking module that compares the template feature data to identify the object and its viewing angle, and that tracks the object's viewing angle using the iterative closest point (ICP) algorithm together with a hidden-surface removal method and a two-way correspondence check method. While executing the ICP algorithm, the object recognition and tracking module uses the hidden-surface removal method to remove or ignore template features that cannot be observed from the object's viewing angle; when the ICP algorithm searches for the closest data of a template feature, the module uses the two-way correspondence check method to verify in both directions whether two data points of the template feature are each other's closest data.

The object recognition and tracking method of the present invention includes: constructing, by a template construction module of a server, multiple templates with different viewing angles by projecting a three-dimensional model of the object, and extracting, analyzing, or condensing, by a feature extraction module of the server, the template feature data of those templates; and obtaining or downloading, by a mobile device, the template feature data from the server, and comparing, by an object recognition and tracking module of the mobile device, the template feature data to identify the object and its viewing angle. The object recognition and tracking module tracks the object's viewing angle using the iterative closest point (ICP) algorithm together with a hidden-surface removal method and a two-way correspondence check method, wherein, while executing the ICP algorithm, the module uses the hidden-surface removal method to remove or ignore template features that cannot be observed from the object's viewing angle, and when the ICP algorithm searches for the closest data of a template feature, the module uses the two-way correspondence check method to verify in both directions whether two data points of the template feature are each other's closest data.

To make the above features and advantages of the present invention more comprehensible, embodiments are described in detail below with reference to the accompanying drawings. Additional features and advantages of the invention will be set forth in part in the following description, will be apparent in part from that description, or may be learned by practice of the invention. The features and advantages of the invention are realized and attained by means of the elements and combinations particularly pointed out in the claims. It should be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the claimed scope of the invention.

1‧‧‧Object recognition and tracking system

10‧‧‧Mobile device

11‧‧‧Color camera

12‧‧‧Depth sensor

13‧‧‧Foreground segmentation module

14‧‧‧Object recognition and tracking module

141‧‧‧Iterative closest point algorithm

142‧‧‧Hidden-surface removal method

143‧‧‧Two-way correspondence check method

144‧‧‧Device motion tracking method

145‧‧‧Pose measurement method

15‧‧‧Display module

20‧‧‧Server

21‧‧‧Three-dimensional model reconstruction module

22‧‧‧Template construction module

23‧‧‧Feature extraction module

A‧‧‧Object

B‧‧‧Three-dimensional model

C‧‧‧Template

D‧‧‧Template feature

F1‧‧‧Recognition stage

F2‧‧‧Tracking stage

T'‧‧‧Template matrix

S11 to S14, S21 to S25‧‧‧Steps

S31 to S33, S41 to S45‧‧‧Steps

FIG. 1 is a schematic diagram of the architecture of the object recognition and tracking system of the present invention; FIG. 2 is a simplified schematic diagram of the usage flow of the object recognition and tracking system and method of the present invention; FIG. 3A and FIG. 3B are schematic diagrams of constructing multi-view templates by graphical projection according to the present invention; FIG. 4 is a schematic diagram of multiple templates rotated about the optical axis according to the present invention; FIG. 5 is a schematic diagram of composing all template vectors into a template matrix according to the present invention; FIG. 6 is a schematic flowchart of the interactive operation of the mobile device of the present invention; and FIG. 7 is a schematic diagram of the dynamic switching flow of the mobile device of the present invention in the tracking stage.

The following describes embodiments of the present invention by way of specific examples. Those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification, and the invention may also be implemented or applied through other different embodiments.

Markerless and marker-based object recognition and tracking are key technologies for expanding augmented reality (AR) applications. The present invention proposes an object recognition and tracking system and method thereof, for example a markerless object recognition and tracking system and method, in which the color camera and depth sensor of a mobile device photograph or scan an object (the target object), which is then recognized and tracked to support subsequent augmented reality (AR) applications.

Based on computer vision techniques, the present invention develops an object recognition and tracking system and method in which a color camera and a depth sensor of a mobile device photograph or scan the target object, and an object recognition and tracking module analyzes the object's color features and depth information to identify its state and viewing angle. In addition, using the motion sensing information built into the mobile device, when the device moves only slightly within a short time interval, the device automatically switches to estimating the motion from that sensing information, so that the three-dimensional (3D) dynamics of the target object are tracked at lower computational cost. Meanwhile, the present invention lets the server condense the template data to be recognized in advance, reducing the computation and data volume required for real-time template recognition.

FIG. 1 shows the object recognition and tracking system 1 of the present invention, which includes a mobile device 10 and a server 20. The mobile device 10 may be, for example, a smartphone or a tablet computer; the server 20 may be, for example, a remote server, a cloud server, a web server, or a back-end server, but is not limited thereto.

The server 20 may have a template construction module 22 and a feature extraction module 23. The template construction module 22 constructs multiple templates C with different viewing angles by projecting the three-dimensional model B of the object A, and the feature extraction module 23 extracts, analyzes, or condenses the template feature D data of those templates C. The mobile device 10 can obtain or download the template feature D data from the server 20, and has an object recognition and tracking module 14 that compares the data to identify the object A and its viewing angle. The object recognition and tracking module 14 tracks the viewing angle of the object A using the iterative closest point (ICP) algorithm 141, the hidden-surface removal method 142, and the two-way correspondence check method 143. While executing the ICP algorithm 141, the module 14 uses the hidden-surface removal method 142 to remove or ignore template features D that cannot be observed from the viewing angle of the object A; when the ICP algorithm 141 searches for the closest data of a template feature D, the module 14 uses the two-way correspondence check method 143 to verify in both directions whether two data points of the template feature D are each other's closest data.

The operation of the object recognition and tracking system 1 can be divided into two parts: a pre-processing stage and an interactive operation stage. The first part, the pre-processing stage, mainly includes: the template construction module 22 of the server 20 takes the three-dimensional model B of the object A and constructs multiple templates C with different viewing angles from it, and the feature extraction module 23 of the server 20 extracts those templates C to produce the corresponding template features D. The second part, the interactive operation stage, mainly includes: the object recognition and tracking module 14 of the mobile device 10 performs recognition and orientation tracking of the object A.

In the pre-processing stage of the object recognition and tracking system 1, the user can photograph or scan the actual object A (the target object) with the mobile device 10, or input a three-dimensional model B of the object A (which can also serve as the target object), so that the server 20 can build multiple templates C with different viewing angles and their template features D from the three-dimensional model B. For example, the user can move the mobile device 10 around the object A while photographing or scanning it, uploading the color images and three-dimensional (3D) point cloud of the object A to the server 20, where the three-dimensional model reconstruction module 21 builds the three-dimensional model B; alternatively, the user can directly input or upload the three-dimensional model B of the object A to the server 20 via the mobile device 10 or any other electronic device. The template construction module 22 of the server 20 then constructs multiple templates C with different viewing angles by projecting the three-dimensional model B of the object A, and the feature extraction module 23 of the server 20 extracts, analyzes, or condenses the template feature D data of those templates for subsequent comparison.

In the interactive operation stage of the object recognition and tracking system 1, the user can recognize and track the object A through the object recognition and tracking module 14 of the mobile device 10 by the following procedures P11 to P14.

Procedure P11: the object recognition and tracking module 14 of the mobile device 10 compares the template features D of multiple templates C with different viewing angles to recognize the object A and its viewing angle. For example, after the mobile device 10 obtains or downloads the template feature D data from the server 20, the object recognition and tracking module 14 can compare the color images and depth information of the template features D to identify the object A and its (rough) viewing angle.

Procedure P12: the object recognition and tracking module 14 of the mobile device 10 tracks the viewing angle of the object A using the iterative closest point (ICP) algorithm. For example, starting from the rough viewing angle of the object A obtained during recognition, the module 14 can combine the hidden-surface removal method 142 and the two-way correspondence check method 143 proposed by the present invention to strengthen the angle-tracking performance of the conventional ICP (iterative approximation) algorithm on the object A.
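By way of illustration only, and not as the claimed implementation, the three ingredients of procedure P12 can be sketched as follows: a visibility test standing in for hidden-surface removal (assuming per-point normals and a fixed viewing direction), a brute-force mutual nearest-neighbour search as the two-way correspondence check, and the standard SVD (Kabsch) rigid-transform solution used inside each ICP iteration. All function names and the viewing-direction convention are assumptions of this sketch.

```python
import numpy as np

def visible_mask(points, normals, view_dir=np.array([0.0, 0.0, -1.0])):
    # Hidden-surface removal (approximation): keep only template points
    # whose normals oppose the viewing direction, i.e. features that the
    # current viewing angle cannot observe are ignored.
    return (normals @ view_dir) < 0.0

def mutual_nearest_pairs(src, dst):
    # Two-way correspondence check: keep pair (i, j) only when src[i] and
    # dst[j] are each other's nearest neighbours (brute force, small clouds).
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    fwd = d2.argmin(axis=1)          # nearest dst index for each src point
    bwd = d2.argmin(axis=0)          # nearest src index for each dst point
    i = np.arange(len(src))
    keep = bwd[fwd[i]] == i          # mutual agreement in both directions
    return i[keep], fwd[i][keep]

def best_rigid_transform(src, dst):
    # Kabsch/SVD solution for R, t minimising ||R @ src + t - dst||,
    # applied once per ICP iteration to the surviving correspondences.
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:         # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs
```

One ICP iteration would then filter the template with `visible_mask`, form correspondences with `mutual_nearest_pairs`, and update the pose with `best_rigid_transform`.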

Procedure P13: when the mobile device 10 undergoes only small movements within a short time interval, the object recognition and tracking module 14 can automatically switch to the device motion tracking method 144 to track the viewing angle of the object A. For example, when the module 14 determines that the mobile device 10 has moved only slightly within a short interval, it automatically switches to estimating the relative viewing-angle motion of the object A from the dynamic sensing information of the inertial measurement unit (IMU) of the mobile device 10. Accordingly, the present invention can reduce the heavier comparison computation for the relative viewing-angle motion of the object A, increase the system response rate, or reduce computational energy consumption.
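The switching rule of procedure P13 might look like the following sketch. The patent does not specify thresholds or the exact motion measure; the bounds, window handling, and function name here are illustrative assumptions only.

```python
import numpy as np

def choose_tracker(gyro, accel, dt, rot_thresh=0.05, trans_thresh=0.02):
    """Decide between IMU dead-reckoning and full ICP tracking.

    gyro:  (N, 3) angular rates over the recent window [rad/s]
    accel: (N, 3) gravity-compensated accelerations [m/s^2]
    dt:    sample period [s]
    Thresholds are illustrative, not taken from the patent.
    """
    window = dt * len(gyro)
    # Crude upper bounds on rotation and travel over the window.
    rot = np.linalg.norm(gyro, axis=1).max() * window
    trans = 0.5 * np.linalg.norm(accel, axis=1).max() * window ** 2
    # Small motion -> estimate pose change from the IMU; otherwise ICP.
    return "imu" if rot < rot_thresh and trans < trans_thresh else "icp"
```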

Procedure P14: the object recognition and tracking module 14 of the mobile device 10 automatically determines whether it needs to switch back to full viewing-angle tracking or object recognition. For example, the module 14 can compare the result of device motion tracking for the object A against the scene in which the object A is photographed; when the difference between the two exceeds a threshold, the module 14 switches back to the full viewing-angle tracking computation, or performs object viewing-angle recognition again.

The five modules above, namely the foreground segmentation module 13, the object recognition and tracking module 14, the three-dimensional model reconstruction module 21, the template construction module 22, and the feature extraction module 23, can be constructed, composed, or realized in hardware, firmware, or software. For example, the five modules can be built as a single hardware chip or multiple chips. Alternatively, the foreground segmentation module 13 can be foreground segmentation software or a program, the object recognition and tracking module 14 can be object recognition and tracking software or a program, the three-dimensional model reconstruction module 21 can be three-dimensional model reconstruction software or a program, the template construction module 22 can be template construction software or a program, and the feature extraction module 23 can be feature extraction software or a program. However, the present invention is not limited thereto.

FIG. 2 is a simplified schematic diagram of the usage flow of the object recognition and tracking system 1 and its method; please refer to FIG. 1 as well. Before the triggering procedure, the user selects the object A to be recognized and tracked, such as a toy car or toy airplane, through the object selection interface F of the mobile device 10 (see FIG. 1; step S11 of FIG. 2). If the data of the object A does not exist on the mobile device 10, the mobile device 10 obtains or downloads the data package of the object A from the server 20 (step S12 of FIG. 2). The data package of the object A includes multi-view template pose information, color template data, depth template data, and weight values, and is stored in the memory (such as a hard disk or memory card) of the user's mobile device 10.

The triggering procedure starts once the object A has been selected and its data verified to exist. The object A is first placed near the center of the screen of the mobile device 10 so that the mobile device 10 can photograph it (step S13 of FIG. 2). The foreground segmentation module 13 of the mobile device 10 (see FIG. 1) then automatically performs foreground segmentation, viewing-angle recognition, and tracking of the object A in the background, and renders the resulting pose of the object A as a three-dimensional (3D) point cloud at the corresponding position of the object on the screen of the mobile device 10, so that the display module 15 shows the 3D point cloud result on the screen of the mobile device 10 (step S14 of FIG. 2), or presents other augmented reality (AR) auxiliary information on the screen.

FIG. 3A and FIG. 3B are schematic diagrams of constructing multi-view templates C for the object A by graphical projection according to the present invention; please refer to FIG. 1 as well. FIG. 3A concerns an object A of general shape, for which projections are taken over a hemisphere or at finer angular steps. FIG. 3B concerns an object A of symmetric shape: because similar projection images occur around the symmetry axis of the object A, only a semicircular set of viewing-angle projections over one cross-section is needed.
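The two viewpoint layouts of FIG. 3A and FIG. 3B can be sketched as follows. The sample counts, radius, and the choice of axes are illustrative assumptions; each returned position would serve as one camera pose from which a template C is projected.

```python
import numpy as np

def hemisphere_viewpoints(n_azim=12, n_elev=4, radius=1.0):
    # General objects (FIG. 3A): camera positions spread over the upper
    # hemisphere around the object; each viewpoint yields one template.
    views = []
    for elev in np.linspace(0, np.pi / 2, n_elev, endpoint=False):
        for azim in np.linspace(0, 2 * np.pi, n_azim, endpoint=False):
            views.append(radius * np.array([
                np.cos(elev) * np.cos(azim),
                np.cos(elev) * np.sin(azim),
                np.sin(elev)]))
    return np.array(views)

def semicircle_viewpoints(n_elev=16, radius=1.0):
    # Rotationally symmetric objects (FIG. 3B): only one cross-section is
    # needed, so the viewpoints lie on a semicircle in the x-z plane.
    elev = np.linspace(0, np.pi, n_elev)
    return radius * np.stack(
        [np.cos(elev), np.zeros_like(elev), np.sin(elev)], axis=1)
```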

As shown in FIG. 3A, FIG. 3B, and FIG. 1, in the pre-processing stage, after the mobile device 10 photographs the object A (the target object), it can transmit the color images and depth information of the object A to the server 20, where the three-dimensional model reconstruction module 21 models the object A to produce the three-dimensional model B; alternatively, the three-dimensional model B of the object A can be input directly to the server 20 via the mobile device 10 or any other electronic device. The server 20 then constructs multi-view templates C from the three-dimensional model B of the object A by graphical projection, so that the feature extraction module 23 of the server 20 can analyze the multi-view templates C to obtain the template feature D information.

FIG. 4 is a schematic diagram of multiple templates C rotated about the optical axis according to the present invention. To handle quickly the case where an object rotates about the optical axis at a given viewpoint, the present invention also precomputes multiple templates C rotated about the optical axis; such rotation is called in-plane rotation.
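Precomputing in-plane rotations amounts to rotating each template image about its center by a set of angles. A minimal sketch, assuming nearest-neighbour resampling and zero fill for corners (real template generation would likely re-render or interpolate):

```python
import numpy as np

def inplane_rotations(template, angles_deg):
    # Precompute copies of a template image rotated about the optical axis.
    # Uses inverse mapping with nearest-neighbour sampling; pixels that fall
    # outside the source image are filled with 0.
    h, w = template.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    out = []
    for a in np.deg2rad(angles_deg):
        # For each output pixel, find the source pixel it came from.
        sy = cy + (ys - cy) * np.cos(a) - (xs - cx) * np.sin(a)
        sx = cx + (ys - cy) * np.sin(a) + (xs - cx) * np.cos(a)
        syi, sxi = np.rint(sy).astype(int), np.rint(sx).astype(int)
        valid = (syi >= 0) & (syi < h) & (sxi >= 0) & (sxi < w)
        img = np.zeros_like(template)
        img[valid] = template[syi[valid], sxi[valid]]
        out.append(img)
    return out
```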

FIG. 5 is a schematic diagram of composing all template vectors into a template matrix T' according to the present invention. On the right, T1, T2 to Tn denote the original template images; in the middle, t1', t2' to tn' denote the corresponding result images after LoG filtering, where LoG denotes the Laplacian of Gaussian operator. T' is the template matrix, assembled from the vectorized template data.

Because template comparison is easily disturbed by lighting changes, shadows, and noise, and full-image comparison of the templates C requires a very large amount of computation, the mobile device 10 of the present invention, in order to increase recognition accuracy and robustness to interference, reassembles the information of each template C after LoG (Laplacian of Gaussian) filtering and normalization into a single vector, composes the vectors of all templates C into the template matrix T', and uses cross-correlation or a similar measure to compare the feature vectors.
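The LoG-plus-cross-correlation pipeline above can be sketched as follows. This is an illustrative single-channel version with an assumed sigma and a plain dense convolution; it is not the patent's implementation, and `best_template` simply takes the row of T' with the highest correlation score.

```python
import numpy as np

def _conv2_same(img, k):
    # Dense 'same'-size 2-D filtering with zero padding (small kernels only;
    # the kernels used below are symmetric, so correlation == convolution).
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    p = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img, dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += k[dy, dx] * p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def log_feature_vector(img, sigma=1.0):
    # LoG response (separable Gaussian smoothing, then a 4-neighbour
    # Laplacian), flattened and normalised to a unit descriptor vector.
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    g = np.exp(-x ** 2 / (2 * sigma ** 2))
    g /= g.sum()
    smooth = _conv2_same(_conv2_same(img.astype(float), g[None, :]), g[:, None])
    lap = _conv2_same(smooth, np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], float))
    v = lap.ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def best_template(query_img, template_matrix):
    # Cross-correlate the query vector against the rows of T'
    # (each row is the LoG-normalised vector of one template).
    scores = template_matrix @ log_feature_vector(query_img)
    return int(scores.argmax()), float(scores.max())
```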

In addition, the present invention can use singular value decomposition (SVD) to reduce the amount of data required on the mobile device 10, i.e., to reduce the dimensionality of the template matrix T'. The present invention retains enough dimensions to represent the original data, reducing the amount of data used without unduly lowering comparison accuracy, thereby improving efficiency. The template feature D data produced on the server 20 are then packaged as a data set for the mobile device 10 to download and compare.
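One plausible way to realize this SVD reduction is a truncated SVD of T', keeping just enough singular vectors to retain a chosen fraction of the spectrum. The energy criterion and the function shapes are assumptions of this sketch; scoring then happens in the reduced space.

```python
import numpy as np

def compress_templates(T, energy=0.95):
    # Truncated SVD of the template matrix T': keep the smallest number of
    # right singular vectors whose squared singular values cover `energy`
    # of the total, shrinking what the mobile device must download.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    k = int(np.searchsorted(np.cumsum(s ** 2) / (s ** 2).sum(), energy)) + 1
    basis = Vt[:k]                  # (k, dim) projection basis
    coeffs = T @ basis.T            # (n_templates, k) compressed templates
    return basis, coeffs

def match_compressed(query_vec, basis, coeffs):
    # Correlation scores computed in the reduced space:
    # coeffs @ (basis @ q) approximates T @ q.
    return coeffs @ (basis @ query_vec)
```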

FIG. 6 is a schematic flowchart of the interactive operation of the mobile device 10 of the present invention; please refer to FIG. 1 as well. The present invention can photograph or scan the scene containing the object A (the target object) with the color camera 11 and the depth sensor 12 of the mobile device 10 of FIG. 1, and the foreground segmentation module 13 performs foreground segmentation using techniques such as plane cutting to obtain the contour region of the object A (the target object).

Meanwhile, the object recognition and tracking method of the present invention can include the first stage (recognition stage F1) and the second stage (tracking stage F2) of FIG. 6.

In the first stage of FIG. 6 (recognition stage F1), the object recognition and tracking module 14 of FIG. 1 first analyzes the foreground-region features of the object A and compares them with the pre-generated template feature D data to identify the state and viewing angle of the object A (the target object). After obtaining the object in the foreground region, the object recognition and tracking module 14 normalizes and scales the foreground region to a specified size, applies LoG filtering, normalization, and vectorization to the foreground color and depth images in the same way as when the templates C were built, and then performs a cross-correlation with the pre-generated template matrix T' to compute the similarity of each template C. The template C with the highest cross-correlation score is the most similar one, and its pose is taken as the initial estimated pose of the object A. The module then uses quaternions to check whether the rotation-angle difference between the current result and the previous frame is too large, to avoid wrong results caused by front and back shapes that are too similar. To ensure the credibility of the matched pose, a template C is adopted only when its similarity exceeds a threshold, and the first adopted pose is set as the initial matched pose.
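The quaternion-based jump check can be sketched as follows; the (w, x, y, z) convention and the threshold value are illustrative assumptions, not taken from the patent.

```python
import numpy as np

def quat_angle_deg(q1, q2):
    # Angle of the relative rotation between two unit quaternions (w, x, y, z).
    # The abs() makes q and -q equivalent, as they encode the same rotation.
    d = abs(float(np.dot(q1, q2)))
    return 2.0 * np.degrees(np.arccos(min(1.0, d)))

def accept_pose(q_new, q_prev, max_jump_deg=45.0):
    # Reject a matched template whose pose jumps too far from the previous
    # frame, e.g. a front/back confusion between similar-looking views.
    return quat_angle_deg(q_new, q_prev) <= max_jump_deg
```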

For example, in the first stage of FIG. 6 (recognition stage F1), the object recognition and tracking module 14 may compare the multiple templates C in step S21 and perform a flip check on them in step S22. If none of the templates C yields an angle below the threshold, step S23 increments, for example, Cmiss (detection failures) by 1, and if Cmiss exceeds, for example, 5, step S24 resets the initial matched pose. Conversely, if some template C yields an angle below the threshold, step S25 sets the matched pose.
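The rotation-angle check underlying the flip test can be computed from the quaternions of the current and previous poses. A minimal sketch (the quaternion convention [w, x, y, z] and the threshold are assumptions, not given in the text):

```python
import numpy as np

def rotation_angle_between(q1, q2):
    """Angle (radians) between two unit quaternions in [w, x, y, z] order."""
    d = abs(float(np.dot(q1, q2)))  # |dot| folds q and -q together
    d = min(d, 1.0)                 # guard the acos domain against round-off
    return 2.0 * np.arccos(d)

def accept_pose(q_now, q_prev, max_angle_rad):
    """Reject a recognition result that jumps too far from the previous frame."""
    return rotation_angle_between(q_now, q_prev) <= max_angle_rad
```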

In the second stage of FIG. 6 (tracking stage F2), the object recognition and tracking module 14 performs the ICP (iterative closest point) tracking or device motion tracking of step S31 based on the matched pose set in step S25. If tracking fails, the flow returns to recognition stage F1 (the template comparison of step S21). Conversely, if tracking succeeds, the object recognition and tracking module 14 performs the pose smoothing of step S32 and the matched-pose update of step S33 in order, and then returns to the ICP tracking or device motion tracking of step S31.

The pose smoothing of step S32 is needed because factors such as the downsampling of the iterative closest point algorithm (ICP) 141 and the shaking of the user's hand-held mobile device 10 may make the tracked pose jitter so much that the displayed result is not smooth. When tracking succeeds, the object recognition and tracking module 14 records the pose and smooths the current pose together with the poses of the previous two frames using a Gaussian filter, making the displayed sequence smoother.
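The Gaussian smoothing over the current pose and the previous two frames might look like the following sketch for the translation part. The kernel weights are illustrative (the text does not give the filter parameters), and rotations would additionally need quaternion interpolation:

```python
import numpy as np

def smooth_pose(translations, weights=(0.25, 0.5, 0.25)):
    """Weighted (Gaussian-like) average of the last three frame translations.

    translations: the previous two and the current 3-vector positions,
    oldest first. The weights approximate a small Gaussian kernel; the
    exact kernel used by the module is not specified in the text.
    """
    t = np.asarray(translations, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    return (w[:, None] * t).sum(axis=0)
```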

The first stage above (recognition stage F1) estimates only a coarse viewing direction for object A; the second stage (tracking stage F2) must obtain a more accurate tracking viewpoint. Traditionally, the tracked viewpoint is obtained solely through the iterative closest point (ICP) algorithm, whose goal is to find the rotation matrix R and translation matrix t that best align two point sets. Suppose there is an input point set P (e.g. P = {p_i}, i = 1, ..., N_P) and a target point set Q (e.g. Q = {q_j}, j = 1, ..., N_Q) in space, where p_i, q_j ∈ ℝ³. Traditional ICP takes the closest point as the correspondence, so the corresponding point set Q̂ = {q̂_i} is given by formula (1) below, where P, Q and Q̂ are point sets, p_i and q_j are points, i, j, N_P and N_Q are positive integers, and x, y and z denote the values along the x-, y- and z-axes:

$$\hat{q}_i = \operatorname*{argmin}_{q_j \in Q} \left\| p_i - q_j \right\|, \quad i = 1, \ldots, N_P \tag{1}$$

The search for the best rotation matrix R and translation matrix t above can be written as an objective function, converting the problem into finding the smallest E(R, t) in formula (2) below, i.e. finding the pair of rotation matrix R and translation matrix t that brings the two point sets closest together, where E(R, t) is the total error between the point set computed from R and t and the actual point set.

$$E(R, t) = \sum_{i=1}^{N_P} \left\| (R\,p_i + t) - \hat{q}_i \right\|^2 \tag{2}$$

From the above, the rotation matrix R and translation matrix t between the actually captured viewpoint of object A and the coarse viewpoint can be estimated, and hence the relative motion of object A can be known. However, the traditional iterative closest point algorithm (ICP) tends to get trapped in local minima, so the present invention adds (1) the hidden surface removal method 142 and (2) the bidirectional correspondence check method 143 to the traditional ICP in order to obtain a more accurate tracking viewpoint of object A.
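A minimal sketch of one ICP building block under formulas (1) and (2): the closest-point correspondence of formula (1), and the closed-form (R, t) minimizing formula (2) for fixed correspondences via the standard SVD (Kabsch) solution. The patent does not state which solver is used, so this is one common choice:

```python
import numpy as np

def closest_points(P, Q):
    """Formula (1): for each p_i, the nearest q_j in Q (brute force)."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(axis=2)
    return Q[np.argmin(d2, axis=1)]

def best_rigid_transform(P, Q_hat):
    """One ICP update: the (R, t) minimizing formula (2) for fixed
    correspondences (P[i] <-> Q_hat[i]), via the Kabsch/SVD solution."""
    cp, cq = P.mean(axis=0), Q_hat.mean(axis=0)
    H = (P - cp).T @ (Q_hat - cq)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # avoid reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

A full ICP loop would alternate these two steps until E(R, t) stops decreasing.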

(1) Hidden surface removal method 142: traditional ICP compares the entire point set, which is time-consuming and prone to instability. Since the present invention already has a coarse viewpoint of object A, the hidden surface removal method 142 can discard the points that are invisible from that viewpoint and run the comparison only on the visible (remaining) points, reducing ambiguity during matching and jitter of the tracked trajectory across consecutive frames.
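One plausible realization of this visibility filter is back-face culling against the coarse viewing direction. The patent does not specify the exact visibility test, so the normal-based rule below is an assumption:

```python
import numpy as np

def visible_points(points, normals, view_dir, eps=1e-6):
    """Back-face culling as a simple stand-in for hidden surface removal.

    A point is treated as visible when its surface normal faces the camera,
    i.e. the normal has a negative dot product with the viewing direction.
    points, normals: (N, 3) arrays; view_dir: direction the camera looks along.
    """
    view_dir = np.asarray(view_dir, dtype=np.float64)
    view_dir = view_dir / np.linalg.norm(view_dir)
    facing = (normals @ view_dir) < -eps
    return points[facing]
```

Only the points this filter keeps would then be fed to the closest-point search, shrinking the correspondence problem.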

(2) Bidirectional correspondence check method 143: for each input point p_i ∈ P, traditional ICP searches in only one direction for the corresponding point q̂_i ∈ Q. The bidirectional correspondence check method 143 of the present invention considers not only searching for the point q_j ∈ Q closest to p_i, but also searching for the point in P closest to q_j. When p_i and q_j are each other's closest points, they are said to be in bidirectional correspondence, and points with bidirectional correspondence are more representative.
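The mutual-nearest-neighbour rule described above can be sketched as a brute-force check (illustrative only; a real implementation would likely use a spatial index such as a k-d tree):

```python
import numpy as np

def bidirectional_pairs(P, Q):
    """Indices (i, j) such that p_i and q_j are mutual nearest neighbours."""
    d2 = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(axis=2)
    p_to_q = np.argmin(d2, axis=1)  # nearest q_j for each p_i
    q_to_p = np.argmin(d2, axis=0)  # nearest p_i for each q_j
    # Keep only the pairs that agree in both directions.
    return [(i, int(j)) for i, j in enumerate(p_to_q) if q_to_p[j] == i]
```

Points without a bidirectional partner are simply dropped from the ICP correspondence set.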

Furthermore, since the computing power of the mobile device 10 is weaker than that of the server 20, excessive data computation by the application on the mobile device 10 would slow the device and quickly drain its remaining battery life. In many mobile applications (such as augmented reality), the relative viewpoint between object A (the target object) and the mobile device 10 changes little over a short time interval, and what change there is comes mainly from the movement of the mobile device 10 itself. Therefore, for the short interval after the state and angle of object A have been recognized, the present invention proposes the device motion tracking method 144, which may use dynamic sensing information obtained from the inertial measurement unit (IMU) of the mobile device 10 as a motion conversion reference, achieving high responsiveness and a low computational load for recognizing and tracking object A (the target object) on the mobile device 10.
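Under the stated assumption that the object is static and only the device moves, the object's camera-frame pose can be propagated from the IMU-derived device motion alone. The 4×4 homogeneous-matrix convention below is an assumption; the text does not fix one:

```python
import numpy as np

def propagate_object_pose(T_obj_in_cam, T_cam_delta):
    """Update the object's camera-frame pose from device motion alone.

    T_obj_in_cam: 4x4 pose of the object in the old camera frame.
    T_cam_delta:  4x4 motion of the camera expressed in its old frame
                  (as integrated from IMU readings). Since the object is
                  assumed static in the world, its new camera-frame pose
                  is inv(T_cam_delta) @ T_obj_in_cam.
    """
    return np.linalg.inv(T_cam_delta) @ T_obj_in_cam
```

This replaces a full ICP alignment with a single matrix product per frame, which is why the switch saves computation and battery.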

FIG. 7 is a schematic diagram of the dynamic switching flow of the mobile device 10 in the tracking stage of the present invention (please also refer to FIG. 1); the flow of FIG. 7 is carried out mainly by the iterative closest point algorithm (ICP) 141, the device motion tracking method 144 and the pose measurement method 145 working together.

In step S41 of FIG. 7, after the mobile device 10 has recognized the coarse pose of object A, the object recognition and tracking module 14 first fine-tunes the viewpoint of object A with the iterative closest point algorithm (ICP) 141. Then, in step S42 of FIG. 7, the object recognition and tracking module 14 uses the pose measurement method 145 to compare the differences between the silhouette of object A and the depth image so as to compute the error of the viewpoint of object A.

In step S43 of FIG. 7, if the viewpoint error of object A exceeds a predetermined threshold, the estimated direction is wrong (i.e. tracking has failed) and the flow returns to the recognition stage (the object state recognition step). Conversely, in step S44 of FIG. 7, if the viewpoint error does not exceed the predetermined threshold, the result is acceptable (i.e. tracking has succeeded), and the object recognition and tracking module 14 switches to inferring the current viewpoint of object A from the device motion information of the device motion tracking method 144.

In step S45 of FIG. 7, at regular intervals (e.g. every 100 frames), the object recognition and tracking module 14 applies the pose measurement method 145 to the current foreground object and the inferred object viewpoint to obtain a pose measurement value. If the pose measurement value is below the predetermined threshold (i.e. tracking has succeeded), the module 14 keeps running the device motion tracking method 144 of step S44. Conversely, if the pose measurement value is not below the threshold (i.e. tracking has failed), the viewpoint is re-adjusted with the iterative closest point algorithm (ICP) 141 of step S41 and measured again with the pose measurement method 145 of step S42; if the pose measurement value still exceeds the threshold (i.e. tracking has failed), the flow returns to the recognition stage of step S43 (the object state recognition step) to re-estimate the viewpoint of object A.
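The switching among steps S41-S45 can be summarized as a small state machine. The states, threshold, and 100-frame recheck interval below mirror the description, but the function itself is an illustrative sketch, not the module's actual code:

```python
def tracking_step(state, pose_error, error_threshold, frame_idx, recheck_every=100):
    """One tick of the FIG. 7 switching logic.

    state: 'icp' or 'imu'; pose_error: output of the pose measurement
    method for the current frame. Returns the next state, where
    'recognize' means falling back to recognition stage F1.
    """
    if state == 'icp':
        # S41 -> S42/S43: ICP refinement, then validate by pose measurement.
        return 'imu' if pose_error <= error_threshold else 'recognize'
    if state == 'imu':
        # S44 -> S45: stay on cheap device-motion tracking, but re-measure
        # the pose periodically and fall back to ICP if it has drifted.
        if frame_idx % recheck_every == 0 and pose_error > error_threshold:
            return 'icp'
        return 'imu'
    return 'recognize'
```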

As set out in FIGS. 1 to 7 above, the object recognition and tracking method of the present invention mainly comprises: constructing, by a template construction module 22 of a server 20, a plurality of templates C of different viewing angles by projecting the three-dimensional model B of object A, and extracting, analyzing or condensing, by the feature extraction module 23 of the server 20, the data of the template features D of the templates C of different viewing angles. Meanwhile, a mobile device 10 obtains or downloads the data of the template features D from the server 20, and an object recognition and tracking module 14 of the mobile device 10 compares the data of the template features D to recognize object A and its viewing angle; the object recognition and tracking module 14 then tracks the viewing angle of object A using the iterative closest point algorithm 141, the hidden surface removal method 142 and the bidirectional correspondence check method 143 together. When executing the iterative closest point algorithm 141, the object recognition and tracking module 14 uses the hidden surface removal method 142 to remove or ignore the template features D that cannot be observed from the viewing angle of object A, and when the iterative closest point algorithm 141 searches for the closest data of a template feature D, the object recognition and tracking module 14 uses the bidirectional correspondence check method 143 to check or search in both directions whether two data of the template feature D are each other's closest data.

Specifically, the object recognition and tracking method of the present invention may proceed, for example, as in procedures P21 to P26 below; the remaining technical content is as detailed above for FIGS. 1 to 7 and is not repeated here.

Procedure P21: the mobile device 10 photographs or scans the actual object A, or inputs a three-dimensional model B of object A, so that the server 20 can create or obtain the three-dimensional model B.

Procedure P22: the template construction module 22 of the server 20 constructs a plurality of templates C of different viewing angles from the three-dimensional model B by projection, and the feature extraction module 23 of the server 20 extracts the plurality of templates C of different viewing angles to produce the corresponding template features D.
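Producing templates at "different viewing angles" requires choosing camera directions around the model; one common (assumed, not specified in the text) scheme is a roughly uniform Fibonacci sampling of the view sphere:

```python
import numpy as np

def sample_viewpoints(n):
    """Roughly uniform viewing directions on the unit sphere (Fibonacci lattice).

    Each returned unit vector is one candidate direction from which the
    3D model B could be projected into a template C.
    """
    i = np.arange(n, dtype=np.float64)
    phi = np.pi * (3.0 - np.sqrt(5.0)) * i          # golden-angle increments
    z = 1.0 - 2.0 * (i + 0.5) / n                   # evenly spaced heights
    r = np.sqrt(np.maximum(0.0, 1.0 - z * z))
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)
```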

Procedure P23: the object recognition and tracking module 14 of the mobile device 10 compares object A with the template features D of the plurality of templates C of different viewing angles to recognize object A and its coarse viewpoint.

Procedure P24: starting from the coarse viewpoint of object A, the object recognition and tracking module 14 of the mobile device 10 tracks the viewpoint of object A with the iterative closest point algorithm 141 (iterative approximation) to obtain a more accurate viewpoint.

Procedure P25: when the mobile device 10 undergoes only small movements over a period of time, the object recognition and tracking module 14 of the mobile device 10 automatically switches to the device motion tracking method 144 to track the viewpoint of object A.

Procedure P26: the object recognition and tracking module 14 of the mobile device 10 compares, through the device motion tracking method 144, the difference between the viewpoint-tracking result for object A and the scene in which object A is captured; when the difference exceeds a threshold, the object recognition and tracking module 14 of the mobile device 10 automatically switches back to the iterative closest point algorithm 141 (iterative approximation) to track the viewpoint of object A, or re-runs the recognition of object A and its viewpoint.

The object recognition and tracking module 14 may include a hidden surface removal method 142: when executing the iterative closest point algorithm 141, the object recognition and tracking module 14 uses the hidden surface removal method 142 to remove or ignore the template features D that cannot be observed from the coarse viewpoint of object A.

The object recognition and tracking module 14 may include a bidirectional correspondence check method 143: when the iterative closest point algorithm 141 searches for the closest data of a template feature D, the object recognition and tracking module 14 uses the bidirectional correspondence check method 143 to check or search in both directions whether two data of the template feature D are each other's closest data. For example, the bidirectional correspondence check method 143 can search for the data B closest to data A and also check whether the data closest to data B is data A, thereby improving the credibility and accuracy of the correspondence between data A and data B.

In summary, the object recognition and tracking system and method of the present invention have the following features, advantages and technical effects:

1. The mobile device of the present invention can track the position and viewpoint of an object (target object), expanding the range of augmented reality applications.

2. The present invention moves the more time-consuming template construction and template feature analysis to the server, reducing the computation and data volume required for real-time recognition.

3. The object recognition and tracking module of the present invention combines the iterative closest point algorithm (ICP) with the hidden surface removal method and the bidirectional correspondence check method to obtain a more accurate tracking viewpoint of the object.

4. The hidden surface removal method of the present invention removes the points invisible from the viewpoint and runs the comparison only on the visible (remaining) points, reducing ambiguity during matching and jitter of the tracked trajectory across consecutive frames.

5. The bidirectional correspondence check method of the present invention checks or searches in both directions whether two template feature data are each other's closest data, improving the credibility and accuracy of the correspondence between the two data.

6. When the mobile device undergoes only small movements, the object recognition and tracking module of the present invention automatically switches to estimating the three-dimensional relative motion of the object (target object) from dynamic sensing information, greatly reducing the more complex comparison computation for the object's relative viewpoint motion, improving the system's responsiveness and lowering computational energy consumption.

7. The present invention dynamically adjusts the viewpoint computation for the object according to the state of the mobile device, maintaining a low angular error while tracking, reducing computational energy consumption, and preserving real-time interactivity.

8. The present invention can be applied, for example, to the following industries: (1) manufacturing: product assembly guidance and smart manufacturing/maintenance applications in next-generation Industry 4.0; (2) education: anatomy teaching of organ structures; (3) the food industry: explanation of and suggestions on nutrients and ways of consumption; (4) advertising commerce: display of and interaction with product advertising content; (5) the service industry: remote video assistance for customers' troubleshooting or renovation work; (6) the game industry: interactive figurine games. In addition, the present invention can also be applied to products such as smart glasses.

The above embodiments merely illustrate the principles, features and effects of the present invention and are not intended to limit its implementable scope; anyone skilled in the art may modify and alter the above embodiments without departing from the spirit and scope of the present invention. Any equivalent changes and modifications accomplished using the disclosure of the present invention shall remain covered by the scope of the claims. The scope of protection of the present invention shall therefore be as listed in the claims.

1‧‧‧Object recognition and tracking system

10‧‧‧Mobile device

11‧‧‧Color camera

12‧‧‧Depth sensor

13‧‧‧Foreground segmentation module

14‧‧‧Object recognition and tracking module

141‧‧‧Iterative closest point algorithm

142‧‧‧Hidden surface removal method

143‧‧‧Bidirectional correspondence check method

144‧‧‧Device motion tracking method

145‧‧‧Pose measurement method

15‧‧‧Display module

20‧‧‧Server

21‧‧‧3D model reconstruction module

22‧‧‧Template construction module

23‧‧‧Feature extraction module

A‧‧‧Object

B‧‧‧3D model

C‧‧‧Template

D‧‧‧Template feature

Claims (20)

一種物體辨識與追蹤系統,包括:一伺服器,係具有一樣板建構模組與一特徵擷取模組,該樣板建構模組對物體之三維模型以投影之方式建構多個不同視角之樣板,且該特徵擷取模組擷取、分析或精簡該多個不同視角之樣板的樣板特徵的資料;以及一行動裝置,係自該伺服器中取得或下載該多個樣板特徵的資料,該行動裝置具有一物體辨識與追蹤模組以比對該多個樣板特徵的資料來辨識該物體及其視角,且該物體辨識與追蹤模組利用疊代最近點演算法、隱藏面移除法與雙向對應檢查法三者進行該物體之視角追蹤,其中,在執行該疊代最近點演算法時,該物體辨識與追蹤模組利用該隱藏面移除法移除或忽略該物體之視角所無法觀察到的樣板特徵,而在該疊代最近點演算法搜尋該樣板特徵的最接近資料時,該物體辨識與追蹤模組利用該雙向對應檢查法雙向檢查或搜尋該樣板特徵的兩個資料是否為彼此的最接近資料。 An object recognition and tracking system includes: a server with a plate construction module and a feature extraction module, the model construction module constructs a plurality of templates with different perspectives by projecting a three-dimensional model of an object, And the feature extraction module captures, analyzes, or condenses the data of the model features of the templates of different perspectives; and a mobile device that obtains or downloads the data of the model features from the server, the action The device has an object recognition and tracking module to identify the object and its viewing angle by comparing the data of the plurality of model features, and the object recognition and tracking module uses iterative closest point algorithm, hidden surface removal method and bidirectional Corresponding to the three inspection methods, the object's perspective is tracked. During the execution of the iterative closest point algorithm, the object recognition and tracking module uses the hidden surface removal method to remove or ignore the object's perspective that cannot be observed The template feature, and when the iterative closest point algorithm searches for the closest data of the template feature, the object recognition and tracking module uses the bidirectional correspondence check method to bidirectionally check or search whether the two data of the template feature are The closest information to each other. 
如申請專利範圍第1項所述之物體辨識與追蹤系統,其中,該伺服器更具有三維模型重建模組,用以建立該物體之三維模型,以供該樣板建構模組對該物體之三維模型以該投影之方式建構該多個不同視角之樣板。 An object recognition and tracking system as described in item 1 of the patent application scope, wherein the server further has a three-dimensional model reconstruction module for creating a three-dimensional model of the object for the model construction module to perform three-dimensional The model constructs the templates of the multiple angles of view in the manner of the projection. 如申請專利範圍第1項所述之物體辨識與追蹤系統,其 中,該行動裝置更具有一彩色攝影機與一深度感測器以拍攝或掃描該物體,且該物體辨識與追蹤模組分析該物體之色彩特徵與深度資訊以辨識該物體之狀態及視角。 The object identification and tracking system as described in item 1 of the patent application scope, which In this, the mobile device further has a color camera and a depth sensor to photograph or scan the object, and the object recognition and tracking module analyzes the color characteristics and depth information of the object to recognize the state and perspective of the object. 如申請專利範圍第1項所述之物體辨識與追蹤系統,其中,該行動裝置更具有一前景切割模組,用以進行有關該物體之前景切割、視角辨識及追蹤。 An object recognition and tracking system as described in item 1 of the patent application scope, wherein the mobile device further has a foreground cutting module for cutting the foreground of the object, identifying and tracking the angle of view. 如申請專利範圍第1項所述之物體辨識與追蹤系統,其中,當該行動裝置在短時距內僅有小幅度運動時,該物體辨識與追蹤模組自動切換改以裝置運動追蹤法進行該物體之視角追蹤。 The object recognition and tracking system as described in item 1 of the patent application scope, wherein when the mobile device has only a small amplitude movement within a short time interval, the object recognition and tracking module automatically switches to the device motion tracking method The object's perspective tracking. 
如申請專利範圍第1項所述之物體辨識與追蹤系統,其中,當該行動裝置在短時距內僅有小幅度運動時,該物體辨識與追蹤模組自動切換改以該行動裝置之慣性測量單元(IMU)取得的動態感測資訊推估出該物體之相對視角運動。 The object recognition and tracking system as described in item 1 of the patent scope, wherein when the mobile device has only a small amplitude movement within a short time interval, the object recognition and tracking module automatically switches to the inertia of the mobile device The dynamic sensing information obtained by the measurement unit (IMU) estimates the relative viewing angle movement of the object. 如申請專利範圍第1項所述之物體辨識與追蹤系統,其中,該物體辨識與追蹤模組更比對關於該物體之裝置動態追蹤之效果與拍攝該物體之場景兩者的差異,以於該兩者的差異超過門檻值時,由該物體辨識與追蹤模組切換回完整的視角追蹤計算、或需重新進行物體視角辨識。 An object recognition and tracking system as described in item 1 of the patent application scope, wherein the object recognition and tracking module compares the difference between the effect of device dynamic tracking on the object and the scene in which the object is photographed in order to When the difference between the two exceeds the threshold, the object recognition and tracking module switches back to the complete view tracking calculation, or the object view recognition needs to be performed again. 如申請專利範圍第1項所述之物體辨識與追蹤系統,其中,該物體辨識與追蹤模組更利用姿勢測量法來比較 該物體之輪廓及深度影像之差值以計算該物體之視角的誤差。 The object recognition and tracking system as described in item 1 of the patent application scope, wherein the object recognition and tracking module uses posture measurement method for comparison The difference between the contour of the object and the depth image is used to calculate the error of the angle of view of the object. 如申請專利範圍第1項所述之物體辨識與追蹤系統,其中,該行動裝置更將經過高斯拉普拉斯算子(LoG)與正規化的每個樣板之資訊重組成單一向量,並將所有樣板之向量組成一樣板矩陣。 The object recognition and tracking system as described in item 1 of the patent application scope, wherein the mobile device further reconstructs the Gaussian Laplace operator (LoG) and the normalized information of each template into a single vector, and The vectors of all the templates form a template matrix. 
如申請專利範圍第9項所述之物體辨識與追蹤系統,其中,該行動裝置更透過奇異值分解(SVD)的方式,以減少在該行動裝置上所需要的資料量或該樣板矩陣之維度。 The object recognition and tracking system as described in item 9 of the patent application scope, wherein the mobile device further uses singular value decomposition (SVD) to reduce the amount of data required on the mobile device or the dimension of the template matrix . 一種物體辨識與追蹤方法,包括:由一伺服器之樣板建構模組對物體之三維模型以投影之方式建構多個不同視角之樣板,並由該伺服器之特徵擷取模組擷取、分析或精簡該多個不同視角之樣板的樣板特徵的資料;以及由一行動裝置自該伺服器中取得或下載該多個樣板特徵的資料,並由該行動裝置之一物體辨識與追蹤模組比對該多個樣板特徵的資料來辨識該物體及其視角,且該物體辨識與追蹤模組利用疊代最近點演算法、隱藏面移除法與雙向對應檢查法三者進行該物體之視角追蹤,其中,在執行該疊代最近點演算法時,該物體辨識與追蹤模組利用該隱藏面移除法移除或忽略該物體之視角所無法觀察到的樣板特徵,而在該疊代最近點 演算法搜尋該樣板特徵的最接近資料時,該物體辨識與追蹤模組利用該雙向對應檢查法雙向檢查或搜尋該樣板特徵的兩個資料是否為彼此的最接近資料。 An object identification and tracking method includes: constructing a plurality of templates with different perspectives by projecting a three-dimensional model of an object from a model building module of a server, and acquiring and analyzing the feature extraction module of the server Or streamline the data of the template features of the templates of different perspectives; and obtain or download the data of the template features from the server by a mobile device, and compare it with an object recognition and tracking module of the mobile device Recognize the object and its perspective on the data of the multiple template features, and the object recognition and tracking module uses the iterative closest point algorithm, hidden surface removal method and two-way correspondence inspection method to track the object's perspective , Where the object recognition and tracking module uses the hidden surface removal method to remove or ignore model features that cannot be observed from the perspective of the object when performing the iteration closest point algorithm point When the algorithm searches for the closest data of the template feature, the object recognition and tracking module uses the two-way correspondence check method to bidirectionally check or search whether the two 
data of the template feature are the closest data to each other. 如申請專利範圍第11項所述之物體辨識與追蹤方法,更包括由該伺服器之三維模型重建模組建立該物體之三維模型,以供該樣板建構模組對該物體之三維模型以該投影之方式建構該多個不同視角之樣板。 The object identification and tracking method described in item 11 of the patent application scope further includes the creation of a three-dimensional model of the object by the three-dimensional model reconstruction module of the server for the model construction module to use the three-dimensional model of the object The model of the multiple different perspectives is constructed by means of projection. 如申請專利範圍第11項所述之物體辨識與追蹤方法,更包括由該行動裝置之一彩色攝影機與一深度感測器拍攝或掃描該物體,以由該物體辨識與追蹤模組分析該物體之色彩特徵與深度資訊以辨識該物體之狀態及視角。 The object recognition and tracking method as described in item 11 of the patent application scope further includes photographing or scanning the object by a color camera and a depth sensor of the mobile device to analyze the object by the object recognition and tracking module Color features and depth information to identify the state and perspective of the object. 如申請專利範圍第11項所述之物體辨識與追蹤方法,更包括由該行動裝置之一前景切割模組進行有關該物體之前景切割、視角辨識及追蹤。 The object recognition and tracking method as described in item 11 of the patent application scope further includes foreground cutting, perspective recognition and tracking of the object by a foreground cutting module of the mobile device. 如申請專利範圍第11項所述之物體辨識與追蹤方法,更包括當該行動裝置在短時距內僅有小幅度運動時,由該物體辨識與追蹤模組自動切換改以裝置運動追蹤法進行該物體之視角追蹤。 The object recognition and tracking method as described in item 11 of the patent application scope further includes when the mobile device has only a small amplitude movement within a short time interval, the object recognition and tracking module automatically switches to the device motion tracking method Track the object's perspective. 
如申請專利範圍第11項所述之物體辨識與追蹤方法，更包括當該行動裝置在短時距內僅有小幅度運動時，由該物體辨識與追蹤模組自動切換改以該行動裝置之慣性測量單元（IMU）取得的動態感測資訊推估出該物體之相對視角運動。 (The object recognition and tracking method of claim 11, further comprising, when the mobile device undergoes only small movements within a short time interval, automatically switching, by the object recognition and tracking module, to estimating the relative viewing-angle motion of the object from dynamic sensing information obtained by an inertial measurement unit (IMU) of the mobile device.)

如申請專利範圍第11項所述之物體辨識與追蹤方法，更包括由該物體辨識與追蹤模組比對關於該物體之裝置動態追蹤之效果與拍攝該物體之場景兩者的差異，以於該兩者的差異超過門檻值時，由該物體辨識與追蹤模組切換回完整的視角追蹤計算、或需重新進行物體視角辨識。 (The object recognition and tracking method of claim 11, further comprising comparing, by the object recognition and tracking module, the difference between the device-motion tracking result for the object and the scene in which the object is photographed, so that when the difference exceeds a threshold, the object recognition and tracking module switches back to the full viewing-angle tracking computation or re-performs viewing-angle recognition of the object.)

如申請專利範圍第11項所述之物體辨識與追蹤方法，更包括由該物體辨識與追蹤模組利用姿勢測量法來比較該物體之輪廓及深度影像之差值以計算該物體之視角的誤差。 (The object recognition and tracking method of claim 11, further comprising using, by the object recognition and tracking module, a pose measurement method to compare the difference between the contour of the object and its depth image, so as to compute the error of the viewing angle of the object.)

如申請專利範圍第11項所述之物體辨識與追蹤方法，更包括由該行動裝置將經過高斯拉普拉斯算子（LoG）與正規化的每個樣板之資訊重組成單一向量，並將所有樣板之向量組成一樣板矩陣。 (The object recognition and tracking method of claim 11, further comprising recombining, by the mobile device, the information of each template processed by the Laplacian of Gaussian (LoG) operator and normalization into a single vector, and assembling the vectors of all templates into a template matrix.)
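The template representation in the LoG claim above (filter each template with a Laplacian of Gaussian, normalize, flatten to a single vector, and stack all vectors into a template matrix) can be illustrated as follows. A minimal NumPy sketch with a hand-built LoG kernel; kernel size, sigma, and function names are assumptions, not the patented implementation:

```python
import numpy as np

def log_filter(img, sigma=1.0):
    """Laplacian-of-Gaussian response via an explicitly built kernel
    and a plain 'valid' 2-D convolution (NumPy only)."""
    r = int(3 * sigma)
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    s2 = sigma ** 2
    k = ((x**2 + y**2 - 2 * s2) / s2**2) * np.exp(-(x**2 + y**2) / (2 * s2))
    k -= k.mean()                                  # zero-sum kernel
    H, W = img.shape
    out = np.zeros((H - 2 * r, W - 2 * r))
    for dy in range(2 * r + 1):                    # shift-and-add convolution
        for dx in range(2 * r + 1):
            out += k[dy, dx] * img[dy:dy + H - 2 * r, dx:dx + W - 2 * r]
    return out

def template_matrix(templates, sigma=1.0):
    """LoG-filter and L2-normalize each template image, flatten it to one
    vector, and stack all vectors into a template matrix (one row each)."""
    rows = []
    for img in templates:
        f = log_filter(img, sigma).ravel()
        n = np.linalg.norm(f)
        rows.append(f / n if n > 0 else f)
    return np.vstack(rows)
```

Row-wise normalization makes the subsequent template comparison a simple dot product between unit vectors.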
如申請專利範圍第19項所述之物體辨識與追蹤方法，更包括由該行動裝置透過奇異值分解（SVD）的方式，以減少在該行動裝置上所需要的資料量或該樣板矩陣之維度。 (The object recognition and tracking method of claim 19, further comprising applying, by the mobile device, singular value decomposition (SVD) to reduce the amount of data required on the mobile device or the dimensionality of the template matrix.)
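The SVD reduction in the claim above can be sketched as a truncated SVD of the template matrix: the mobile device keeps a small projection basis and low-dimensional template coordinates instead of the full matrix. A hedged NumPy illustration; `compress_templates`, `project`, and the rank `k` are illustrative assumptions:

```python
import numpy as np

def compress_templates(M, k):
    """Truncate the SVD of template matrix M (N templates x D features)
    to rank k, keeping an (N x k) coordinate block and a (k x D) basis."""
    U, S, Vt = np.linalg.svd(M, full_matrices=False)
    basis = Vt[:k]                 # k principal feature directions
    coeffs = U[:, :k] * S[:k]      # low-dimensional template coordinates
    return coeffs, basis

def project(feature_vec, basis):
    """Map a query feature vector into the same k-dimensional space,
    so matching happens against coeffs rather than the full matrix."""
    return basis @ feature_vec
```

With k well below D, storage and per-query comparison cost on the device drop from O(N·D) to O((N + D)·k), at the price of a rank-k approximation error.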
TW107143429A 2018-12-04 2018-12-04 Object recognition and tracking system and method thereof TWI684956B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
TW107143429A TWI684956B (en) 2018-12-04 2018-12-04 Object recognition and tracking system and method thereof
CN201811626054.3A CN111275734B (en) 2018-12-04 2018-12-28 Object identification and tracking system and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW107143429A TWI684956B (en) 2018-12-04 2018-12-04 Object recognition and tracking system and method thereof

Publications (2)

Publication Number Publication Date
TWI684956B true TWI684956B (en) 2020-02-11
TW202022803A TW202022803A (en) 2020-06-16

Family

ID=70413546

Family Applications (1)

Application Number Title Priority Date Filing Date
TW107143429A TWI684956B (en) 2018-12-04 2018-12-04 Object recognition and tracking system and method thereof

Country Status (2)

Country Link
CN (1) CN111275734B (en)
TW (1) TWI684956B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI779488B (en) * 2021-02-09 2022-10-01 趙尚威 Feature identification method and system
TWI772020B (en) * 2021-05-12 2022-07-21 廣達電腦股份有限公司 Image positioning device and method
TWI817847B (en) * 2022-11-28 2023-10-01 國立成功大學 Method, computer program and computer readable medium for fast tracking and positioning objects in augmented reality and mixed reality

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201248515A (en) * 2011-05-30 2012-12-01 Univ Nat Cheng Kung Three dimensional dual-mode scanning apparatus and three dimensional dual-mode scanning system
TW201530495A (en) * 2014-01-22 2015-08-01 Univ Nat Taiwan Science Tech Method for tracking moving object and electronic apparatus using the same
CN106462976A (en) * 2014-04-30 2017-02-22 国家科学研究中心 Method of tracking shape in a scene observed by an asynchronous light sensor
TW201816662A (en) * 2016-10-18 2018-05-01 瑞典商安訊士有限公司 Method and system for tracking an object in a defined area

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102800103B (en) * 2012-06-18 2015-02-18 清华大学 Unmarked motion capturing method and device based on multi-visual angle depth camera
CN102802000A (en) * 2012-08-09 2012-11-28 冠捷显示科技(厦门)有限公司 Tracking type multi-angle three-dimensional display image quality improving method
WO2015134795A2 (en) * 2014-03-05 2015-09-11 Smart Picture Technologies, Inc. Method and system for 3d capture based on structure from motion with pose detection tool
US9830703B2 (en) * 2015-08-12 2017-11-28 Nvidia Corporation Model-based three-dimensional head pose estimation
US20170323149A1 (en) * 2016-05-05 2017-11-09 International Business Machines Corporation Rotation invariant object detection
TWI612482B (en) * 2016-06-28 2018-01-21 圓展科技股份有限公司 Target tracking method and target tracking device
CN108509848B (en) * 2018-02-13 2019-03-05 视辰信息科技(上海)有限公司 The real-time detection method and system of three-dimension object

Also Published As

Publication number Publication date
CN111275734A (en) 2020-06-12
TW202022803A (en) 2020-06-16
CN111275734B (en) 2024-02-02

Similar Documents

Publication Publication Date Title
Memo et al. Head-mounted gesture controlled interface for human-computer interaction
CN108369643B (en) Method and system for 3D hand skeleton tracking
US11238606B2 (en) Method and system for performing simultaneous localization and mapping using convolutional image transformation
Panteleris et al. Back to rgb: 3d tracking of hands and hand-object interactions based on short-baseline stereo
CN108958473A (en) Eyeball tracking method, electronic device and non-transient computer-readable recording medium
Hackenberg et al. Lightweight palm and finger tracking for real-time 3D gesture control
Ye et al. Accurate 3d pose estimation from a single depth image
Vieira et al. On the improvement of human action recognition from depth map sequences using space–time occupancy patterns
JP6001562B2 (en) Use of 3D environmental models in game play
EP4053795A1 (en) A method and system for real-time 3d capture and live feedback with monocular cameras
US9299161B2 (en) Method and device for head tracking and computer-readable recording medium
US20220351535A1 (en) Light Weight Multi-Branch and Multi-Scale Person Re-Identification
Kemelmacher-Shlizerman et al. Being john malkovich
Wang et al. Point cloud and visual feature-based tracking method for an augmented reality-aided mechanical assembly system
TWI684956B (en) Object recognition and tracking system and method thereof
CN109359514B (en) DeskVR-oriented gesture tracking and recognition combined strategy method
CN113706699B (en) Data processing method and device, electronic equipment and computer readable storage medium
CN104050475A (en) Reality augmenting system and method based on image feature matching
CN113689503B (en) Target object posture detection method, device, equipment and storage medium
Núñez et al. Real-time human body tracking based on data fusion from multiple RGB-D sensors
Pires et al. Visible-spectrum gaze tracking for sports
Rocca et al. Head pose estimation by perspective-n-point solution based on 2d markerless face tracking
Rekik et al. 3d face pose tracking using low quality depth cameras
CN110009683B (en) Real-time on-plane object detection method based on MaskRCNN
Xue et al. Event-based non-rigid reconstruction from contours