TWI793579B - Method and system for simultaneously tracking 6 DoF poses of movable object and movable camera
- Publication number: TWI793579B
- Application number: TW110114401A
- Authority: TW (Taiwan)
Description
The present disclosure relates to a method and system for simultaneously tracking the six-degree-of-freedom (6 DoF) poses of a movable object and a movable camera.
Among existing tracking technologies, simultaneous localization and mapping (SLAM) can track the 6 DoF pose of a movable camera, but it cannot track a movable object at the same time. The reason is that a movable camera must rely on stable environmental feature points for localization, whereas the feature points on a movable object are unstable; they are usually discarded and cannot be used for tracking.
On the other hand, techniques for tracking movable objects ignore environmental feature points to avoid interference, so none of them can track a movable camera.
The features learned by most neural networks are used to distinguish object categories rather than to compute an object's 6 DoF pose. Some neural networks for recognizing body poses or gestures can only output the 2D coordinates (x, y) of skeletal joints on the image plane; even if the distance between a joint and the camera is estimated with depth-sensing technology, the result is not a true 3D coordinate in space, let alone a 6 DoF pose in space.
In motion capture systems, multiple fixed cameras are used to track joint positions, and markers are usually attached to the joints to reduce error; the 6 DoF pose of a movable camera is not tracked.
Therefore, as far as currently known technologies are concerned, none can simultaneously track a movable object and a movable camera.
The rapid development of mixed reality (MR) has prompted researchers to develop technologies that can simultaneously track the 6 DoF poses of a movable camera and a movable object. In MR applications, the camera mounted on the MR glasses moves with the head, so the camera's 6 DoF pose must be known in order to determine the user's position and viewing direction. Objects the user interacts with also move, so the object's 6 DoF pose must also be known in order to display virtual content at the proper position and orientation. A user wearing MR glasses may move freely indoors or outdoors, which makes it difficult to place markers in the environment. Moreover, for a better user experience, no special markers are attached to the object beyond its own features.
Although these conditions make 6 DoF pose tracking harder, we have developed a technology that can simultaneously track a movable object and a movable camera, solving the above problems and enabling more applications.
The technology proposed in this disclosure can be applied, for example, as follows: when a user wears MR glasses, one or more virtual screens can be displayed next to the real screen of a handheld device such as a mobile phone, and the default position, orientation, and size of the virtual screens are set according to the 6 DoF poses of the phone and of the camera on the MR glasses. Furthermore, through 6 DoF pose tracking, the rotation and movement of the virtual screens can be controlled automatically so that they stay consistent with the viewing direction. The disclosed technology provides users with the following benefits: (1) a small physical screen is extended to a large virtual screen; (2) a single physical screen is extended to multiple virtual screens so that more applications can be viewed at the same time; (3) the content of the virtual screens cannot be snooped on by others.
According to an embodiment of the present disclosure, a method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera is provided, including the following steps: capturing a series of images with the movable camera; extracting a number of environmental feature points from these images; matching the environmental feature points to compute a number of camera matrices of the movable camera; and computing the 6 DoF pose of the movable camera from these camera matrices. At the same time, a number of feature points of the movable object are inferred from the images captured by the movable camera, the coordinates of these feature points are corrected using the camera matrix corresponding to each image together with predefined geometric and temporal constraints, and the 6 DoF pose of the movable object is then computed from the corrected feature point coordinates and their corresponding camera matrices.
According to another embodiment of the present disclosure, a system for simultaneously tracking the 6 DoF poses of a movable object and a movable camera is provided, including a movable camera, a movable-camera 6 DoF pose computation unit, and a movable-object 6 DoF pose computation unit. The movable camera captures a series of images. The movable-camera 6 DoF pose computation unit extracts a number of environmental feature points from these images, matches the environmental feature points to compute a number of camera matrices of the movable camera, and then computes the 6 DoF pose of the movable camera from these camera matrices. The movable-object 6 DoF pose computation unit infers a number of feature points of the movable object from the images captured by the movable camera, corrects the coordinates of these feature points using the camera matrix corresponding to each image together with the predefined geometric and temporal constraints, and then computes the 6 DoF pose of the movable object from the corrected feature point coordinates and their corresponding camera matrices.
In order to provide a better understanding of the above and other aspects of the present disclosure, embodiments are described in detail below with reference to the accompanying drawings:
Please refer to FIGS. 1A and 1B, which illustrate how the disclosed technique of simultaneously tracking a movable object and a movable camera compares with the prior art in application. The technology proposed in this disclosure can be applied, for example, as shown in FIG. 1A: when a user wears MR glasses G1 (on which a movable camera 110 is mounted), one or more virtual screens can be displayed next to the real screen of a handheld device, e.g., a mobile phone P1 (i.e., the movable object 900), and the default positions, orientations, and sizes of the virtual screens D2 and D3 are set according to the 6 DoF poses of the phone P1 and of the movable camera 110 on the MR glasses G1. "Movable" for the movable camera 110 means movable relative to a stationary object in three-dimensional space. Furthermore, through 6 DoF pose tracking, the rotation and movement of the virtual screens D2 and D3 can be controlled automatically so that they stay consistent with the viewing direction (as shown in FIG. 1B), and the user can also adjust the positions and angles of the virtual screens D2 and D3 according to personal preference. A virtual screen displayed by the prior art moves with the MR glasses G1 and does not follow the 6 DoF pose of the object. The disclosed technology provides users with the following benefits: (1) a small physical screen D1 is extended to a large virtual screen D2; (2) a single physical screen D1 is extended to multiple virtual screens D2 and D3 so that more applications can be viewed at the same time; (3) the content of the virtual screens D2 and D3 cannot be snooped on by others. The above technique can also be applied to a tablet or a notebook computer, placing virtual screens next to its physical screen. Besides a device with a physical screen, the movable object 900 can be any object whose features can be defined, for example, a car, a bicycle, or a pedestrian. The movable camera 110 is not limited to the camera on the MR glasses G1; it can also be a camera on an autonomous mobile robot or a vehicle.
Please refer to FIG. 2A, which illustrates a system 100 and method for simultaneously tracking the 6 DoF poses of a movable object 900 (labeled in FIG. 1A) and a movable camera 110 according to an embodiment. The movable object 900 is, for example, the mobile phone P1 of FIG. 1A; the movable camera 110 is, for example, the camera on the MR glasses G1 of FIG. 1A. The system 100 for simultaneously tracking the 6 DoF poses of the movable object 900 and the movable camera 110 includes the movable camera 110, a movable-camera 6 DoF pose computation unit 120, and a movable-object 6 DoF pose computation unit 130. The movable camera 110 captures a series of images IM. The movable camera 110 can be mounted on a head-mounted stereoscopic display, a mobile device, a computer, or a robot. The movable-camera 6 DoF pose computation unit 120 and/or the movable-object 6 DoF pose computation unit 130 are implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code.
The movable-camera 6 DoF pose computation unit 120 includes an environmental feature extraction unit 121, a camera matrix computation unit 122, and a camera pose computation unit 123, each implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code. The environmental feature extraction unit 121 extracts a number of environmental feature points EF from the images IM. The camera matrix computation unit 122 matches these environmental feature points EF to compute a number of camera matrices CM of the movable camera 110. The camera pose computation unit 123 then computes the 6 DoF pose CD of the movable camera 110 from the camera matrices CM.
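As an illustration of what units 121 to 123 compute, the sketch below recovers a relative camera pose from matched environment feature points using OpenCV. It is a minimal two-view sketch under an assumed pinhole intrinsic matrix K; the disclosure does not prescribe a particular feature type or SLAM back end, so the ORB features and essential-matrix decomposition used here are assumptions.

```python
import cv2
import numpy as np

def estimate_camera_pose(img_prev, img_curr, K):
    """Recover the relative rotation R and translation direction t of a
    moving camera from two consecutive frames (minimal sketch)."""
    orb = cv2.ORB_create(2000)                       # environment feature points EF
    kp1, des1 = orb.detectAndCompute(img_prev, None)
    kp2, des2 = orb.detectAndCompute(img_curr, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)              # match EF between frames
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # Essential matrix from the matched points, then decompose into R and t
    E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

    # Camera matrix CM = K [R | t]; the 6 DoF pose CD follows from R and t
    CM = K @ np.hstack([R, t])
    return R, t, CM
```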
The movable-object 6 DoF pose computation unit 130 includes an object feature coordinate inference unit 131, an object feature coordinate correction unit 132, and an object pose computation unit 133, each implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code. The object feature coordinate inference unit 131 infers a number of feature points OF of the movable object 900 from the images IM captured by the movable camera 110; these feature points OF are predefined and are compared against the images IM captured by the movable camera 110 to infer their coordinates. The movable object 900 is a rigid object.
Please refer to another embodiment shown in FIG. 2B. The method for simultaneously tracking the 6 DoF poses of the movable object 900 and the movable camera 110 includes a training stage ST1 and a tracking stage ST2. The object feature coordinate inference unit 131 uses a neural network inference model MD to infer the coordinates of the feature points OF of the movable object 900 from the images IM captured by the movable camera 110. The neural network inference model MD is trained in advance, its training data are obtained by manual or automatic labeling, and geometric constraints GC and temporal constraints TC are incorporated during training.
The object feature coordinate correction unit 132 corrects the coordinates of the feature points OF of the movable object 900 using the camera matrix CM corresponding to each image IM together with the predefined geometric constraints GC and temporal constraints TC. Specifically, the object feature coordinate correction unit 132 uses the camera matrices CM to project the 2D coordinates of the feature points OF to the corresponding 3D coordinates and, according to the geometric constraints GC, removes feature points OF whose 3D coordinates deviate by more than a predetermined value, or uses the coordinates of neighboring feature points OF together with the geometric constraints GC to fill in the coordinates of feature points OF that were not detected. In addition, according to the temporal constraints TC, the object feature coordinate correction unit 132 compares the coordinate changes of these feature points OF across multiple consecutive images IM, and corrects the coordinates of any feature point OF whose change exceeds a predetermined value using the coordinates of the corresponding feature points OF in the consecutive images IM, obtaining the corrected feature point coordinates OF'.
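A minimal sketch of the correction performed by unit 132, assuming the feature points are given as Nx2 float arrays, that the rigid object's pairwise reference distances ref_dist are known (the geometric constraint GC), and that 3D coordinates are obtained by triangulating the detections of two frames with their camera matrices. Falling back to the previous frame's coordinate stands in for the neighbor-based completion described above, and the tolerance values are assumptions.

```python
import cv2
import numpy as np

def correct_object_points(pts_prev_2d, pts_curr_2d, CM_prev, CM_curr,
                          ref_dist, geo_tol=0.05, time_tol=30.0):
    """Correct object feature points OF using camera matrices CM plus the
    geometric (GC) and temporal (TC) constraints (sketch)."""
    # Geometric constraint: lift the 2D detections to 3D via the camera matrices
    pts4d = cv2.triangulatePoints(CM_prev, CM_curr, pts_prev_2d.T, pts_curr_2d.T)
    pts3d = (pts4d[:3] / pts4d[3]).T                 # Nx3 triangulated coordinates

    corrected = pts_curr_2d.copy()
    n = len(pts3d)
    for i in range(n):
        # Compare distances to the other points against the rigid reference distances
        dev = [abs(np.linalg.norm(pts3d[i] - pts3d[j]) - ref_dist[i][j])
               for j in range(n) if j != i]
        if max(dev) > geo_tol:
            corrected[i] = pts_prev_2d[i]            # deviation too large: discard detection

    # Temporal constraint: reject implausibly large frame-to-frame displacements
    disp = np.linalg.norm(corrected - pts_prev_2d, axis=1)
    corrected[disp > time_tol] = pts_prev_2d[disp > time_tol]
    return corrected
```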
Please refer to FIG. 3A, which illustrates the correspondences of the environmental feature points and the feature points of the movable object across a series of images captured by the movable camera. For a non-planar object, the orientation and position can be defined by the centroid of several selected feature points OF. Please refer to FIG. 3B, which illustrates the position and orientation of an object in space. A best-fit plane PL is fitted to the feature points OF; the center point C of the best-fit plane PL represents the position (x, y, z) of the object in three-dimensional space, and the normal vector N of the best-fit plane PL represents the object's orientation.
The geometric constraints GC are defined in three-dimensional space: for a rigid object, the distances between the feature points OF should be fixed. After projection onto the two-dimensional image plane through the camera matrix, the positions of all feature points OF must also be limited to a reasonable range.
Please refer to FIGS. 4A and 4B, which illustrate correcting the coordinates of feature points OF. The camera matrices CM can be used not only to compute the 6 DoF poses of the movable camera 110 and the movable object 900, but also to apply the three-dimensional geometric constraints GC, either correcting the coordinate of a feature point OF* projected onto the two-dimensional image plane (as shown in FIG. 4A) or adding the coordinate of a missing feature point OF** (as shown in FIG. 4B).
The object pose computation unit 133 then computes the 6 DoF pose OD of the movable object 900 from the corrected feature point coordinates OF' and their corresponding camera matrices CM. For a planar movable object, a best-fit plane is computed from these feature points, and the 6 DoF pose OD of the movable object 900 is defined by the plane's center point and normal vector. For a non-planar movable object, the 6 DoF pose OD of the movable object 900 is defined by the centroid of the 3D coordinates of the feature points OF'.
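A minimal sketch of this pose computation, assuming the corrected feature points OF' are available as an Nx3 array of 3D coordinates; the SVD-based plane fit is one standard choice and is not mandated by the disclosure.

```python
import numpy as np

def object_pose_from_points(pts3d, planar=True):
    """Derive the object pose OD from corrected 3D feature points OF'.
    Planar object: center point C and normal vector N of the best-fit plane PL.
    Non-planar object: centroid of the 3D coordinates."""
    centroid = pts3d.mean(axis=0)                    # position (x, y, z)
    if not planar:
        return centroid, None

    # Best-fit plane: the normal is the right singular vector with the
    # smallest singular value of the centered point cloud.
    _, _, vt = np.linalg.svd(pts3d - centroid)
    normal = vt[-1]                                  # plane normal N (orientation)
    return centroid, normal / np.linalg.norm(normal)
```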
As shown in FIG. 2B, the training stage ST1 of the system 100 for simultaneously tracking the 6 DoF poses of the movable object 900 and the movable camera 110 includes a training data generation unit 140 and a neural network training unit 150, each implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code.
The neural network training unit 150 trains the neural network inference model MD. The neural network inference model MD is used to infer the positions and order of the feature points OF of the movable object 900. In the training data generation unit 140, the training data can be images whose feature point positions and order are labeled manually, or automatically augmented versions of already labeled images. Please refer to FIGS. 5A to 5D, which show various kinds of training data using a mobile phone as an example. In these figures, the feature points OF are defined by the four inner corners of the physical screen D4. With the physical screen D4 placed in portrait orientation, the four feature points OF are numbered clockwise from the upper-left corner to the lower-left corner. As shown in FIG. 5A, each of the four feature points OF is labeled with its coordinate in this order. Even if the physical screen D4 is rotated to landscape orientation, the order of the feature points OF remains unchanged (as shown in FIG. 5B). In some cases, not all feature points OF can be captured, so the training data must include images with some missing feature points OF, such as those of FIG. 5C or FIG. 5D. As shown in FIGS. 5A and 5D, the labeling of feature points can distinguish the front of the phone (the screen) from its back, and only the front is labeled. To achieve high accuracy, each image is zoomed in while labeling the feature points OF until every pixel can be seen clearly. Because manual labeling is very time consuming, automatic augmentation is needed to expand the training data to the scale of millions of images. Methods for automatically augmenting the manually labeled images include: scaling and rotation, perspective-projection mapping, converting to different colors, adjusting brightness and contrast, adding motion blur and noise, covering some feature points with other objects (as shown in FIGS. 5C and 5D), changing the content displayed on the screen, or replacing the background, and so on. The positions of the manually labeled feature points OF are then recomputed in the automatically augmented images according to the corresponding transformations.
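As one example of the augmentation steps listed above, perspective-projection mapping can be sketched as follows: the labeled image is warped by a random homography and the manually labeled feature point coordinates are recomputed through the same transformation. The jitter range is an assumed parameter.

```python
import cv2
import numpy as np

def augment_perspective(image, keypoints, jitter=40):
    """Warp a labeled image with a random perspective transform and
    recompute the labeled feature point positions (sketch)."""
    h, w = image.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    dst = src + np.random.uniform(-jitter, jitter, src.shape).astype(np.float32)

    M = cv2.getPerspectiveTransform(src, dst)        # homography of the augmentation
    warped = cv2.warpPerspective(image, M, (w, h))

    # Re-project the manual labels through the same transformation
    pts = keypoints.reshape(-1, 1, 2).astype(np.float32)
    new_keypoints = cv2.perspectiveTransform(pts, M).reshape(-1, 2)
    return warped, new_keypoints
```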
Please refer to FIG. 6, which illustrates the main structure of the neural network in the training stage, consisting of feature extraction and feature point coordinate prediction. The feature extractor ET can be a deep residual network such as ResNet or another network with similar capability. The extracted feature vector FV is passed to the feature point coordinate prediction layer FL, which infers the coordinates of the feature points OF (for example, the coordinate of a feature point OF in the current image and in the previous image). In addition to the feature point prediction layer, this embodiment adds a geometric constraint layer GCL and a temporal constraint layer TCL to reduce erroneous predictions. In the training stage, each layer computes a loss value LV between the predicted value and the ground truth according to its loss function, and these loss values are accumulated with their respective weights to obtain the total loss value OLV.
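The weighted accumulation of the layer losses into the total loss OLV could look like the sketch below. The specific loss forms (L2 keypoint error for FL, pairwise-distance deviation for GCL, the displacement penalty for TCL) and the weights are assumptions, since FIG. 6 only specifies a weighted sum; `temporal_penalty` is the penalty sketch given after Equation (1) below.

```python
import numpy as np

def total_loss(pred, truth, pred_prev, ref_dist, m, s,
               w_kp=1.0, w_geo=0.5, w_time=0.5):
    """Total loss OLV as a weighted sum of the per-layer losses LV (sketch).
    pred, truth, pred_prev : Nx2 keypoint coordinates (current prediction,
    ground truth, and prediction for the previous image)."""
    # Feature point coordinate prediction layer FL: squared coordinate error
    loss_kp = np.mean(np.sum((pred - truth) ** 2, axis=1))

    # Geometric constraint layer GCL: pairwise distances should stay close
    # to the reference distances of the rigid object (assumed loss form)
    dist = np.linalg.norm(pred[:, None, :] - pred[None, :, :], axis=-1)
    loss_geo = np.mean((dist - ref_dist) ** 2)

    # Temporal constraint layer TCL: penalize implausible frame-to-frame motion
    d = np.linalg.norm(pred - pred_prev, axis=1)
    loss_time = np.mean(temporal_penalty(d, m, s))

    return w_kp * loss_kp + w_geo * loss_geo + w_time * loss_time
```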
Please refer to FIG. 7, which illustrates how the displacement of a feature point between two adjacent images is computed. Given the coordinate of a feature point OF in the current image and the coordinate of the same feature point OF in the previous image, the displacement between them is defined as d.
An unreasonable displacement is constrained by a penalty value. The penalty value is computed, for example, according to Equation (1).
Here m is the mean displacement computed for each feature point OF over all training data, s is the standard deviation of the displacement, and d is the displacement of the same feature point OF between the previous image and the current image. When d ≤ m, the displacement is within the acceptable range and no penalty is applied (the penalty value is zero). Please refer to FIG. 8, which illustrates the temporal constraint TC and how the penalty value is computed and judged. The center of the circle represents the coordinate of the feature point OF in the previous image, and the area of the circle represents the acceptable displacement of the feature point OF in the current image. If the predicted coordinate of the feature point OF in the current image lies inside the circle (displacement d' ≤ m), the penalty value is zero. If the predicted coordinate of the feature point OF in the current image lies outside the circle (displacement d'' > m), a penalty value is applied according to Equation (1). The more the displacement exceeds the radius of the circle (i.e., m), the larger the penalty value and the larger the loss value during training, which constrains the coordinates of the feature points OF to a reasonable range.
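Equation (1) itself is not reproduced in this text, so the sketch below only assumes a form consistent with the description: zero penalty for d ≤ m and a penalty that grows with the excess displacement, normalized by the standard deviation s. The exact functional form should be treated as an assumption.

```python
import numpy as np

def temporal_penalty(d, m, s):
    """Penalty for the temporal constraint TC (sketch): no penalty inside
    the acceptable radius m, a growing penalty beyond it.
    d : displacement(s) of a feature point between consecutive frames
    m : mean displacement over the training data
    s : standard deviation of the displacement"""
    d = np.asarray(d, dtype=float)
    excess = np.maximum(d - m, 0.0)        # zero when d <= m
    return excess / s                      # assumed normalization; Eq. (1) may differ
```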
Please refer to FIG. 9, which illustrates an incorrect displacement that can occur without the temporal constraint TC. The left image of FIG. 9 is the previous image and the right image is the current image. In the previous image, a feature point OF is detected at a certain coordinate. In the current image, however, the feature point OF is detected on a reflection at a different coordinate, and the displacement between the two coordinates exceeds the range set by the temporal constraint TC, so the new coordinate can be judged incorrect.
As shown in FIG. 2B, in the tracking stage ST2 the movable camera 110 captures a series of images IM. A number of environmental feature points EF are extracted from these images and used to compute the corresponding camera matrices CM and the 6 DoF pose CD of the movable camera 110. At the same time, the coordinates of the feature points OF of the movable object 900 are inferred by the neural network inference model MD and then transformed and corrected using the camera matrices CM to obtain the 6 DoF pose OD of the movable object 900.
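Putting the tracking stage together, a minimal per-frame loop might look like the sketch below. It reuses the `estimate_camera_pose`, `correct_object_points`, and `object_pose_from_points` sketches given earlier; `model.predict_keypoints` is a hypothetical interface standing in for the neural network inference model MD, and triangulating consecutive detections is one assumed way of lifting the 2D feature points to 3D.

```python
import cv2

def tracking_stage(frames, K, model, ref_dist):
    """Per-frame loop of tracking stage ST2 (sketch): camera pose CD from
    environment features, object pose OD from corrected feature points OF'."""
    poses = []
    prev_frame, prev_pts, CM_prev = None, None, None
    for frame in frames:
        pts_2d = model.predict_keypoints(frame)       # feature points OF from model MD (assumed API)
        if prev_frame is not None:
            R, t, CM = estimate_camera_pose(prev_frame, frame, K)      # camera pose CD, matrix CM
            if CM_prev is not None:
                pts_2d = correct_object_points(prev_pts, pts_2d,
                                               CM_prev, CM, ref_dist)  # corrected feature points OF'
                pts4d = cv2.triangulatePoints(CM_prev, CM, prev_pts.T, pts_2d.T)
                pts3d = (pts4d[:3] / pts4d[3]).T                       # lifted 3D coordinates
                center, normal = object_pose_from_points(pts3d)        # object pose OD
                poses.append((R, t, center, normal))
            CM_prev = CM
        prev_frame, prev_pts = frame, pts_2d
    return poses
```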
Please refer to FIG. 10, which illustrates a system 200 and method for simultaneously tracking the 6 DoF poses of the movable object 900 (labeled in FIG. 1A) and the movable camera 110 with an added incremental learning stage ST3, including an automatic augmentation unit 260 and a weight adjustment unit 270, each implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code.
In the embodiment of FIG. 10, the training data of the neural network inference model MD in the training stage consist of manual labeling and automatic augmentation, whereas in the incremental learning stage the training data consist of automatic labeling and automatic augmentation.
While the movable object 900 is being tracked, the neural network inference model MD performs incremental learning in the background. The training data for incremental learning include the images IM captured by the movable camera 110 and the images IM' automatically augmented from the images IM by the automatic augmentation unit 260. The automatic augmentation unit 260 replaces manual labels with the corrected feature point coordinates OF' corresponding to the images IM and IM', using them as the ground-truth feature point coordinates. The weight adjustment unit 270 adjusts the weights in the neural network inference model MD to update it to a neural network inference model MD', thereby adapting to the usage scenario and tracking the 6 DoF pose OD of the movable object 900 more accurately.
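A sketch of one background update of the incremental learning stage, assuming the inference model MD is a PyTorch keypoint-regression module that returns an (1, N, 2) tensor, reusing the `augment_perspective` sketch above, and using the corrected feature point coordinates OF' as ground truth in place of manual labels. The optimizer and loss choices are assumptions.

```python
import torch

def incremental_learning_step(model, optimizer, frames, corrected_labels):
    """One background update of stage ST3 (sketch): fine-tune the inference
    model MD on captured frames IM and augmented frames IM', with the
    corrected feature point coordinates OF' as ground truth."""
    model.train()
    for image, label in zip(frames, corrected_labels):
        aug_image, aug_label = augment_perspective(image, label)   # IM' from IM

        for img, lbl in ((image, label), (aug_image, aug_label)):
            x = torch.from_numpy(img).permute(2, 0, 1).float().unsqueeze(0)
            y = torch.from_numpy(lbl).float().unsqueeze(0)

            pred = model(x)                       # predicted keypoint coordinates
            loss = torch.nn.functional.mse_loss(pred, y)

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                      # adjust weights -> model MD'
```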
In addition, please refer to FIG. 11, which illustrates a system 300 and method for simultaneously tracking the 6 DoF poses of the movable object 900 and the movable camera 110 applied to MR glasses, including a pose correction unit 310, a pose stabilization unit 320, a viewing axis computation unit 330, a screen pose computation unit 340, and a stereoscopic image generation unit 350, each implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code. The pose correction unit 310 includes a cross-comparison unit 311 and a correction unit 312, each implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code. The stereoscopic image generation unit 350 includes an image generation unit 351 and an imaging unit 352, each implemented, for example, as a circuit, a chip, a circuit board, program code, or a storage device storing program code.
As the movable camera 110 and the movable object 900 move, their 6 DoF poses CD and OD need to be cross-compared and corrected (as shown in FIG. 8). The cross-comparison unit 311 of the pose correction unit 310 cross-compares the 6 DoF pose OD of the movable object 900 with the 6 DoF pose CD of the movable camera 110. The correction unit 312 corrects the 6 DoF pose OD of the movable object 900 and the 6 DoF pose CD of the movable camera 110.
To prevent slight, unintentional head shaking from triggering recomputation of the 6 DoF poses of the movable camera and the movable object, which would make the virtual screen D2 (shown in FIG. 1A) shake and cause dizziness, the pose stabilization unit 320 determines that when the change in the 6 DoF pose OD of the movable object 900 or in the 6 DoF pose CD of the movable camera 110 is smaller than a preset value, the 6 DoF pose OD of the movable object 900 and the 6 DoF pose CD of the movable camera 110 are left unchanged.
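The stabilization rule can be sketched as a simple dead-band on the pose change. The pose is represented here as a (position, unit viewing-direction) pair for brevity, and the translation and rotation thresholds are assumed values.

```python
import numpy as np

def stabilize_pose(prev_pose, new_pose, trans_thresh=0.005, rot_thresh=0.5):
    """Keep the previous 6 DoF pose when the change is below the preset
    thresholds, to avoid jitter from slight unintentional head movement."""
    prev_pos, prev_dir = prev_pose
    new_pos, new_dir = new_pose

    moved = np.linalg.norm(np.subtract(new_pos, prev_pos))          # meters (assumed unit)
    turned = np.degrees(np.arccos(np.clip(np.dot(prev_dir, new_dir), -1.0, 1.0)))

    if moved < trans_thresh and turned < rot_thresh:
        return prev_pose          # change too small: leave the pose unchanged
    return new_pose
```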
The viewing axis computation unit 330 computes the viewing axes of the user's two eyes according to the 6 DoF pose CD of the movable camera 110.
The screen pose computation unit 340 computes the 6 DoF pose DD of the virtual screen D2 according to the 6 DoF pose OD of the movable object 900 and the 6 DoF pose CD of the movable camera 110, so that the virtual screen D2 moves together with the movable object 900 (as shown in FIG. 1B), or the displayed viewing angle of the virtual screen D2 changes with the 6 DoF pose of the movable camera 110.
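One way the screen pose computation unit 340 could derive the virtual screen pose DD is to keep the screen at a fixed offset in the phone's local coordinate frame, so that it follows the object pose OD. The 4x4-transform composition below is a sketch, and the offset value is an assumption that a user setting could adjust.

```python
import numpy as np

def virtual_screen_pose(object_pose, offset=np.array([0.12, 0.0, 0.0])):
    """Compute the virtual screen pose DD as a fixed offset from the
    movable object's 6 DoF pose OD (sketch).
    object_pose : 4x4 homogeneous transform of the phone in world coordinates
    offset      : screen position in the phone's local frame (assumed value)"""
    local = np.eye(4)
    local[:3, 3] = offset                    # e.g. 12 cm to the side of the phone
    return object_pose @ local               # screen pose DD in world coordinates
```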
The image generation unit 351 of the stereoscopic image generation unit 350 generates a left-eye image and a right-eye image of the virtual screen D2 according to the 6 DoF pose DD of the virtual screen D2 and the optical parameters of the stereoscopic display (e.g., the MR glasses G1 of FIG. 1A). The imaging unit 352 of the stereoscopic image generation unit 350 displays the stereoscopic image of the virtual screen D2 on the stereoscopic display (e.g., the MR glasses G1 of FIG. 1A).
The imaging unit 352 of the stereoscopic image generation unit 350 can also display the virtual screen D2 at a specific position around the movable object 900 according to the user's settings.
To sum up, although the present disclosure has been disclosed above by way of embodiments, they are not intended to limit the present disclosure. Those with ordinary skill in the art to which this disclosure pertains may make various changes and modifications without departing from the spirit and scope of this disclosure. Therefore, the scope of protection of this disclosure shall be defined by the appended claims.
100, 200, 300: system for simultaneously tracking the 6 DoF poses of a movable object and a movable camera
110: movable camera
120: movable-camera 6 DoF pose computation unit
121: environmental feature extraction unit
122: camera matrix computation unit
123: camera pose computation unit
130: movable-object 6 DoF pose computation unit
131: object feature coordinate inference unit
132: object feature coordinate correction unit
133: object pose computation unit
140: training data generation unit
150: neural network training unit
260: automatic augmentation unit
270: weight adjustment unit
310: pose correction unit
311: cross-comparison unit
312: correction unit
320: pose stabilization unit
330: viewing axis computation unit
340: screen pose computation unit
350: stereoscopic image generation unit
351: image generation unit
352: imaging unit
900: movable object
C: center point
CD: 6 DoF pose of the movable camera
CM: camera matrix
d, d', d'': displacement
D1, D4: physical screen
D2, D3: virtual screen
DD: 6 DoF pose of the virtual screen
EF: environmental feature point
ET: feature extractor
FL: feature point coordinate prediction layer
FV: feature vector
G1: MR glasses
GC: geometric constraint
GCL: geometric constraint layer
IM, IM': image
LV: loss value
m: mean displacement
MD: neural network inference model
N: normal vector
OD: 6 DoF pose of the movable object
OF, OF', OF*, OF**: feature point
OLV: total loss value
P1: mobile phone
PL: best-fit plane
s: standard deviation of the displacement
ST1: training stage
ST2: tracking stage
ST3: incremental learning stage
TC: temporal constraint
TCL: temporal constraint layer
FIGS. 1A and 1B illustrate how the disclosed technique of simultaneously tracking a movable object and a movable camera compares with the prior art in application.
FIG. 2A illustrates a system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera according to an embodiment.
FIG. 2B illustrates a system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera with an added training stage.
FIG. 3A illustrates the correspondences of environmental feature points and movable-object feature points across a series of images captured by the movable camera.
FIG. 3B illustrates the position and orientation of an object in space.
FIGS. 4A and 4B illustrate repairing the feature points of a movable object.
FIGS. 5A to 5D illustrate the definition of feature points and various training data using a mobile phone as an example.
FIG. 6 illustrates the structure of the neural network in the training stage.
FIG. 7 illustrates how the displacement of a feature point between two adjacent images is computed.
FIG. 8 illustrates how the temporal constraint is computed and judged.
FIG. 9 illustrates an incorrect displacement caused by the lack of a temporal constraint.
FIG. 10 illustrates a system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera with added incremental learning.
FIG. 11 illustrates a system and method for simultaneously tracking the 6 DoF poses of a movable object and a movable camera applied to MR glasses.
Claims (20)
Priority Applications (2)

| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202110554564.XA (published as CN113920189A) | 2020-07-08 | 2021-05-20 | Method and system for simultaneously tracking six-degree-of-freedom directions of movable object and movable camera |
| US17/369,669 (published as US11506901B2) | 2020-07-08 | 2021-07-07 | Method and system for simultaneously tracking 6 DoF poses of movable object and movable camera |

Applications Claiming Priority (2)

| Application Number | Priority Date | Filing Date |
|---|---|---|
| US202063049161P | 2020-07-08 | 2020-07-08 |
| US63/049,161 | 2020-07-08 | |

Publications (2)

| Publication Number | Publication Date |
|---|---|
| TW202203644A | 2022-01-16 |
| TWI793579B | 2023-02-21 |

Family ID: 80787701

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| TW110114401A (TWI793579B) | | 2021-04-21 | 2021-04-21 |

Country Status (1)

| Country | Link |
|---|---|
| TW | TWI793579B (en) |
Families Citing this family (1)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| TWI826189B * | 2022-12-16 | 2023-12-11 | 仁寶電腦工業股份有限公司 | Controller tracking system and method with six degrees of freedom |

Citations (2)

| Publication Number | Priority Date | Publication Date | Assignee | Title |
|---|---|---|---|---|
| TW201915943A * | 2017-09-29 | 2019-04-16 | 香港商阿里巴巴集團服務有限公司 | Method, apparatus and system for automatically labeling target object within image |
| CN111311632A * | 2018-12-11 | 2020-06-19 | 深圳市优必选科技有限公司 | Object pose tracking method, device and equipment |