TW561423B - Video-based image control system - Google Patents
- Publication number: TW561423B
- Application number: TW90118059A
- Authority: TW (Taiwan)
- Classification: Image Analysis (AREA)
Description
CROSS-REFERENCE TO RELATED APPLICATION

This application claims the benefit of U.S. Provisional Application No. 60/220,223, filed July 24, 2000, and titled VIDEO-BASED IMAGE CONTROL SYSTEM, which is incorporated herein by reference.

TECHNICAL FIELD

The present invention relates to an image processing system, and more particularly to a video-based image control system for processing stereoscopic image data.

BACKGROUND

Many operating systems are available for interacting with or controlling a computer system. Most of them use standardized interfaces built on well-established graphical user interface (GUI) functions and control techniques. Because these functions and control techniques are common across GUIs, users who are unfamiliar with a particular computer platform and/or application can still control different computer platforms and user applications with little difficulty.

A common control technique is to move a cursor over on-screen objects using a mouse or a trackball-style pointing device, and to perform a GUI function by clicking on an object (once or twice). Selecting GUI functions this way, however, is an obstacle for people who are unfamiliar with operating a computer mouse. There are also situations in which a mouse or trackball cannot be used at all, for example in front of a department store display window on a city street, or when the user is physically disabled.

SUMMARY

In one general aspect, a method of interfacing with a computer using stereo vision is disclosed. The method includes capturing a stereo image and processing the stereo image to determine position information of an object in the stereo image. The object may be controlled by a user.
The method also includes using the position information to allow the user to interact with a computer application.

Capturing the stereo image may include capturing the stereo image with a stereo camera. The method may also include recognizing a gesture associated with the object by analyzing changes in the object's position information, and controlling the computer application based on the recognized gesture. The method may further include determining an application state of the computer application and using the application state in recognizing the gesture. The object may be the user; in another example, the object is a part of the user. The method may also include providing feedback about the computer application to the user.

In the implementations above, processing the stereo image to determine the position information of the object includes mapping the position information from position coordinates associated with the object to screen coordinates associated with the computer application. Processing the stereo image may also include processing the stereo image to identify feature information and generating a scene description from the feature information.

Processing the stereo image may also include analyzing the scene description to identify changes in the object's position and mapping those position changes. Processing the stereo image to generate the scene description may include processing the stereo image to identify matching feature pairs in the stereo image, and computing the disparity and position of each matching feature pair to produce the scene description.

The method may also include analyzing the scene description in a scene analysis process to determine the position information of the object.

Capturing the stereo image may include capturing a reference image with a reference camera and a comparison image with a comparison camera, and processing the stereo image may include processing the reference image and the comparison image to generate feature pairs.
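The mapping from object position coordinates to application screen coordinates described above can be illustrated with a short sketch. The following Python fragment is a minimal illustration under assumed values for the detection-region bounds and screen resolution; it is not the patent's implementation.

```python
# Minimal sketch of mapping a tracked object's world coordinates to screen
# coordinates, as described above. The hand-detection-region bounds and the
# screen resolution are assumed example values, not taken from the patent.

REGION_MIN = (-0.3, 0.0, 0.5)  # x, y, z bounds of the detection region (meters)
REGION_MAX = (0.3, 0.4, 0.9)
SCREEN_W, SCREEN_H = 1280, 1024

def world_to_screen(x, y, z):
    """Linearly map a position inside the detection region to pixel coordinates."""
    u = (x - REGION_MIN[0]) / (REGION_MAX[0] - REGION_MIN[0])
    v = (y - REGION_MIN[1]) / (REGION_MAX[1] - REGION_MIN[1])
    u = min(max(u, 0.0), 1.0)  # clamp to the region boundary
    v = min(max(v, 0.0), 1.0)
    # Screen y grows downward, so invert the vertical axis.
    return int(u * (SCREEN_W - 1)), int((1.0 - v) * (SCREEN_H - 1))
```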
Processing the stereo image to identify matching pairs in the stereo image may also include identifying features in the reference image, generating, for each feature in the reference image, a set of candidate matching features in the comparison image, and selecting from the set of candidates a best matching feature for each reference image feature to produce a feature pair. Processing the stereo image may also include filtering the reference image and the comparison image.

Processing the feature pairs may also include computing a matching score for each candidate matching feature, ranking the candidates, and selecting the candidate matching feature with the highest matching score to produce the feature pair.

Generating the set of candidate matching features for each feature in the reference image may include selecting the candidate matching features from a predetermined search range of the comparison image.

Feature pairs may be eliminated according to the matching scores of the candidate matching features. A feature pair may be eliminated if the matching score of the highest-ranked candidate matching feature falls below a predetermined threshold, and may also be eliminated if the matching score of the highest-ranked candidate lies within a predetermined threshold of the matching score of a lower-ranked candidate matching feature.

Computing the matching score may include identifying all neighboring feature pairs, adjusting each candidate's matching score against the matching scores of neighboring candidate matching features, and selecting the candidate matching feature with the highest adjusted matching score to produce the feature pair.

Feature pairs may be verified by applying the comparison image as the reference image and the reference image as the comparison image to generate a second set of feature pairs, and eliminating from the original set of feature pairs those that do not map to a feature pair in the second set.
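A rough sketch of the two elimination rules described above (a minimum-score threshold, and rejection when the best candidate is not clearly better than the runner-up) follows. The scoring function is passed in as a callable, and both threshold values are illustrative assumptions, not the patent's.

```python
# Sketch of selecting the best matching feature from scored candidates,
# applying the two elimination rules described above: reject when the best
# score is too weak, and reject when the match is ambiguous. The threshold
# values and the match_score() callable are illustrative assumptions.

SCORE_MIN = 0.5          # minimum acceptable matching score
AMBIGUITY_MARGIN = 0.05  # best score must beat the runner-up by this much

def select_feature_pair(ref_feature, candidates, match_score):
    """Return (best_candidate, score), or None if the pair is eliminated."""
    if not candidates:
        return None
    scored = sorted(((match_score(ref_feature, c), c) for c in candidates),
                    key=lambda pair: pair[0], reverse=True)
    best_score, best = scored[0]
    if best_score < SCORE_MIN:
        return None  # rule 1: best match is too weak
    if len(scored) > 1 and best_score - scored[1][0] < AMBIGUITY_MARGIN:
        return None  # rule 2: match is ambiguous
    return best, best_score
```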
Generating the scene description may also include computing, for each feature pair in the scene description, real-world coordinates from the disparity and position of the selected feature pair using a coordinate transform. The features of each stereo image may be described by the brightness patterns of the pixels contained in blocks into which the reference image and the comparison image are divided. The dividing may partition the images into pixel blocks of a fixed size; in one implementation, each pixel block is an 8x8 block of pixels.

Analyzing the scene description to determine the position information of the object may include pruning feature information that lies outside a region of interest within the field of view. Pruning may include establishing the boundaries of the region of interest.

Analyzing the scene description to determine the position information of the object may include clustering the feature information of the region of interest into groups having sets of features, by comparison with neighboring feature information within a predetermined range, and computing the position of each group. Analyzing the scene description may also include eliminating every group having fewer features than a predetermined threshold.

Analyzing the scene description may also include selecting the position of a group that matches a predetermined criterion, recording the position of the group matching the predetermined criterion as the object position coordinates, and outputting the object position coordinates. The method may also include determining the presence of a user by checking for features within a presence detection region. Computing the position of each group may exclude the features of groups outside an object detection region.

The method may include defining a dynamic object detection region based on the object position coordinates. The dynamic object detection region may further be defined relative to a user's body.

The method may include defining a body position detection region based on the object position coordinates. Defining the body position detection region may include detecting the position of the user's head. The method may also include smoothing the motion of the object position coordinates to eliminate jitter between successive video frames.
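A minimal sketch of the jitter-smoothing step just mentioned, assuming a fixed exponential smoothing factor; FIG. 7 of this document suggests that the degree of damping may instead vary with distance, so the constant here is only an illustrative assumption.

```python
# Sketch of smoothing object position coordinates across successive video
# frames with a fixed exponential filter. The smoothing factor is an assumed
# value; a distance-dependent damping degree (compare FIG. 7) may be used
# instead.

ALPHA = 0.35  # assumed smoothing factor: 0 = frozen, 1 = no smoothing

def smooth(prev_pos, new_pos, alpha=ALPHA):
    """Blend a new (x, y, z) sample toward the previous filtered position."""
    return tuple(p + alpha * (n - p) for p, n in zip(prev_pos, new_pos))
```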
The method may include computing hand orientation information from the object position coordinates. Outputting the object position coordinates may include outputting the hand orientation information, and computing the hand orientation information may include smoothing changes within the hand orientation information.

Defining the dynamic object detection region may also include identifying the position of a torso division plane of the feature set, and determining the position of a hand detection region relative to the torso division plane along the coordinate axis perpendicular to that plane.

Defining the dynamic object detection region may include identifying a body center position and a body boundary position from the feature set, using the intersection of the feature-pair groups with the torso division plane to identify a position indicating part of the user's arm, and using the arm position relative to the body position to identify the arm as the left arm or the right arm.

The method may also include establishing a shoulder position from the body center position, the body boundary position, the torso division plane, and the left-arm or right-arm identification. Defining the dynamic detection region may include determining position data for the hand detection region relative to the shoulder position.

The technique may include smoothing the position data of the hand detection region. In addition, the technique may include determining the position of the dynamic object detection region relative to the torso division plane along the coordinate axis perpendicular to the torso division plane, determining its position along the horizontal coordinate axis relative to the shoulder position, and determining its position along the vertical coordinate axis relative to the user's full height using the body boundary position.
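As a very loose sketch of placing a hand detection region in front of the torso division plane and relative to the shoulder, consider the fragment below. The axis convention, offsets, and box dimensions are all invented example values; the actual placement rules are those described in the text.

```python
# Sketch of positioning a dynamic hand detection region relative to a
# detected shoulder position and torso division plane. All offsets and box
# dimensions are assumed example values, and the axis convention (z toward
# the cameras) is an assumption of this sketch.

def hand_detection_region(shoulder, torso_plane_z,
                          forward=0.25, width=0.6, height=0.5, depth=0.4):
    """Return (min_corner, max_corner) of an axis-aligned detection box.

    shoulder      -- (x, y, z) shoulder position in world coordinates
    torso_plane_z -- z coordinate of the torso division plane
    forward       -- offset of the box in front of the torso plane (meters)
    """
    sx, sy, _ = shoulder
    z_near = torso_plane_z + forward  # box begins in front of the torso
    return ((sx - width / 2, sy - height / 2, z_near),
            (sx + width / 2, sy + height / 2, z_near + depth))
```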
Defining the dynamic object detection region may include the following: unless the highest feature pair lies at a boundary, using the highest feature pair of the feature set to establish the position of the top of the user's head, and determining the position of the hand detection region relative to the top of the head.

In another general aspect, a method of interfacing with a computer using stereo vision is disclosed. The method includes capturing a stereo image with a stereo camera, and processing the stereo image to determine position information of an object in the stereo image, where the object is controlled by a user. The method further includes processing the stereo image to identify feature information, generating a scene description from the feature information, and identifying matching feature pairs in the stereo image. The method also includes computing the disparity and position of each matching feature pair to produce the scene description, and analyzing the scene description in a scene analysis process to determine the position information of the object.
The method includes clustering the feature information of a region of interest into groups having sets of features, by comparison with neighboring feature information within a predetermined range, computing the position of each group, and using the position information to allow the user to interact with a computer application.

The technique may also include mapping the object position of the feature information from camera coordinates to screen coordinates associated with the computer application, and using the mapped position to interact with the computer application.

The method may include recognizing a gesture associated with the object by analyzing changes in the object position information within the scene description, and combining the position information with the gesture in order to interact with the computer application. Capturing the stereo image may include capturing the stereo image with a stereo camera.

In another aspect, a stereo video system for interacting with an application program executed by a computer is disclosed. First and second video cameras are arranged on an adjacent structure and are operable to produce a series of stereo video images. A processor is operable to receive the series of stereo video images and to detect objects appearing in the overlapping field of view of the cameras. The processor executes a process that defines an object detection region in three-dimensional coordinates relative to the positions of the first and second video cameras, selects a control object appearing within the object detection region, and, as the control object moves within the object detection region, maps the position coordinates of the control object to a position indicator associated with the application program.

The process may select as the control object the detected object appearing closest to the video cameras and within the object detection region. The control object may be a human hand.

A horizontal position of the control object relative to the video cameras may be mapped to an x-axis screen coordinate of the position indicator, and a vertical position of the control object relative to the video cameras may be mapped to a y-axis screen coordinate of the position indicator. The processor may be configured to map the horizontal position of the control object relative to the video cameras to the x-axis screen coordinate of the position indicator, map the vertical position to the y-axis screen coordinate, and emulate, for the application program, a mouse using the combined x-axis and y-axis screen coordinates.

The processor may be configured to emulate mouse buttons using gestures derived from the movement of the object position. The processor may be configured to emulate a mouse click based on the control object holding its position, anywhere within the object detection region, for a predetermined period of time. In another example, the processor is configured to emulate a mouse button based on the position indicator remaining within the boundary of an interactive display region for a predetermined period of time. The processor may be configured to map the z-axis depth position of the control object relative to the video cameras to a z-axis screen coordinate of the position indicator.
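A minimal sketch of the hover-based click just described, in which a control object that holds its position for a predetermined period triggers a click; the dwell time and movement tolerance are assumed values.

```python
# Sketch of dwell-based selection: if the control object stays within a small
# tolerance of an anchor position for a predetermined period, report a click
# once. The dwell time and tolerance values are assumptions of this sketch.

import math
import time

DWELL_SECONDS = 1.0  # assumed hold time before a click fires
TOLERANCE = 0.02     # assumed allowed drift while dwelling (meters)

class DwellDetector:
    def __init__(self):
        self.anchor = None
        self.start = None

    def update(self, pos, now=None):
        """Feed the latest (x, y, z) position; True once when a dwell completes."""
        now = time.monotonic() if now is None else now
        if self.anchor is None or math.dist(pos, self.anchor) > TOLERANCE:
            self.anchor, self.start = pos, now  # object moved: restart timer
            return False
        if now - self.start >= DWELL_SECONDS:
            self.start = float("inf")           # fire once, hold off until movement
            return True
        return False
```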
The processor may be configured to map the x-axis position of the control object relative to the video cameras to the x-axis screen coordinate of the position indicator, the y-axis position to the y-axis screen coordinate, and the z-axis depth position to the z-axis screen coordinate.

The position of the position indicator within the boundary of an interactive display region may trigger an action within the application program. Movement of the control object along the z-axis depth position, covering a predetermined distance within a predetermined period of time, may trigger a selection action within the application program. The control object holding its position, anywhere within the object detection region, for a predetermined period of time may likewise trigger a selection action within the application program.

In another aspect, a stereo video system for interacting with an application program executed by a computer is disclosed. First and second video cameras are arranged on an adjacent structure and are operable to produce a series of stereo video images. A processor is operable to receive the series of stereo video images and to detect objects appearing in the overlapping field of view of the cameras. The processor executes a process that defines an object detection region in three-dimensional coordinates relative to the positions of the first and second video cameras, selects as a control object the detected object appearing closest to the video cameras within the object detection region, defines sub-regions within the object detection region, identifies the sub-region occupied by the control object, activates an action associated with that sub-region while the control object occupies it, and applies the action in order to interact with a computer application.

The action associated with a sub-region may further be defined as simulating the activation of a key or button. The action may be triggered by the control object holding its position within the sub-region for a predetermined period of time.

In another aspect, a stereo video system for interacting with an application program executed by a computer is disclosed. First and second video cameras are arranged on an adjacent structure and are operable to produce a series of stereo video images. A processor is operable to receive the series of stereo video images and to detect objects appearing in the overlapping field of view of the cameras. The processor executes a process that identifies the object understood to be the largest object appearing in the overlapping field of view of the cameras within a predetermined depth range, selects that object as the target object, determines position coordinates representing the position of interest, and uses those position coordinates as an object control point for controlling the application program.

The process may also cause the processor to determine and store a neutral control point position, map the coordinates of the object control point relative to the neutral control point position, and use the mapped control point coordinates to control the application program.
The process may also cause the processor to define a region positioned relative to the neutral control point position, to map the object control point relative to its position within that region, and to use the mapped object control point coordinates to control the application program. The process may also cause the processor to convert the mapped object control point through a velocity function, determine a viewpoint of a virtual environment associated with the application program, and use the velocity function to move that viewpoint within the virtual environment.
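A minimal sketch of such a velocity function, assuming a dead zone around the neutral position (compare FIG. 13B) and a linear gain beyond it; the radius and gain values are illustrative assumptions.

```python
# Sketch of converting a control point's displacement from the stored neutral
# position into a velocity for moving a virtual-environment viewpoint.
# Displacement inside the dead zone yields zero velocity (compare FIG. 13B).
# The dead-zone radius and gain are assumed example values.

DEAD_ZONE = 0.05  # meters around the neutral position producing zero velocity
GAIN = 2.0        # scales displacement (meters) to speed (virtual units/s)

def velocity_from_control_point(control, neutral):
    """Return a per-axis (vx, vy, vz) velocity from the displacement."""
    velocity = []
    for c, n in zip(control, neutral):
        d = c - n
        if abs(d) <= DEAD_ZONE:
            velocity.append(0.0)  # inside the dead zone
        else:
            sign = 1.0 if d > 0 else -1.0
            velocity.append(GAIN * (d - sign * DEAD_ZONE))  # ramp past the edge
    return tuple(velocity)
```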
The process may cause the processor to map the coordinates of the object control point within the application program to control an indicator position; in this implementation, the indicator is an avatar. The process may also cause the processor to map the coordinates of the object control point within the application program to control the appearance of an indicator; in this implementation, the indicator is likewise an avatar, and the target object is a human appearing within the overlapping field of view.

In another aspect, a stereo video system for interacting with an application program executed by a computer is disclosed. First and second video cameras are arranged on an adjacent structure and are operable to produce a series of stereo video images. A processor is operable to receive the series of stereo video images and to detect objects appearing in the overlapping field of view of the cameras. The processor executes a process that identifies the object understood to be the largest object appearing in the overlapping field of view within a predetermined depth range and selects it as the target object; defines a control region between the cameras and the object of interest, the control region lying at a predetermined position and having a predetermined size relative to the size and position of the target object; searches for the point of the object of interest closest to the cameras within the control region; selects that point as a control point if a point of the object of interest lies within the control region; and, as the control point moves within the control region, maps the position coordinates of the control point to a position indicator associated with the application program.
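Selecting the control point as the point of the object of interest nearest the cameras within the control region might be sketched as follows; representing the object as a list of (x, y, z) feature points, and measuring nearness along z, are assumptions of the sketch.

```python
# Sketch of control-point selection: among the 3-D points belonging to the
# object of interest, pick the point inside the control region closest to the
# cameras (smallest z, assuming z grows away from the cameras). Returns None
# when no point of the object lies inside the region.

def select_control_point(object_points, region_min, region_max):
    inside = [p for p in object_points
              if all(lo <= c <= hi
                     for c, lo, hi in zip(p, region_min, region_max))]
    return min(inside, key=lambda p: p[2], default=None)
```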
The processor may be operable to map the horizontal position of the control object relative to the video cameras to an x-axis screen coordinate of the position indicator, map the vertical position to a y-axis screen coordinate, and emulate a mouse using the combined x-axis and y-axis screen coordinates. Alternatively, the processor may be operable to map the x-axis position of the control object relative to the video cameras to the x-axis screen coordinate of the position indicator, the y-axis position to the y-axis screen coordinate, and the z-axis depth position to the z-axis screen coordinate.

In this stereo video system, the target object may be a human appearing within the overlapping field of view, and the control point may correspond to the hand of the human appearing within the control region.

In another aspect, a stereo video system for interacting with an application program executed by a computer is disclosed. First and second video cameras are arranged on an adjacent structure and are operable to produce a series of stereo video images. A processor is operable to receive the series of stereo video images and to detect objects appearing in the overlapping field of view of the cameras. The processor executes a process that defines an object detection region in three-dimensional coordinates relative to the positions of the first and second video cameras, selects two hand objects from among the objects appearing in the overlapping field of view within the object detection region, and, as the hand objects move within the object detection region, maps the position coordinates of the hand objects to the virtual hand positions of an avatar provided by the application program.

The process selects as the two hand objects the objects appearing closest to the video cameras within the object detection region. The avatar takes a form resembling a human body. The avatar is further placed within, and interacts with, a virtual environment forming part of the application program. The processor executes a process that compares the virtual hand positions of the avatar with the positions of virtual objects within the virtual environment, enabling the user to interact with the virtual objects within the virtual environment.
The processor also executes a process that detects the position coordinates of the user within the overlapping field of view and maps the user's position coordinates to the virtual torso of the avatar provided by the application program. If no mapped hand object is selected, the process moves at least one virtual hand associated with the avatar to a neutral position.

The processor also executes a process that detects the user's position coordinates within the overlapping field of view and maps them to a velocity function applied to the avatar, enabling the avatar to roam the virtual environment provided by the application program. The velocity function includes a neutral position representing zero velocity of the avatar. The processor also executes a process that maps the user's position coordinates relative to the neutral position to torso coordinates of the avatar, so that the avatar appears to lean.

The processor also executes a process that compares the virtual hand positions of the avatar with the positions of virtual objects within the virtual environment, enabling the user to interact with the virtual objects while roaming the virtual environment.

In some implementations of the stereo video system, the application program derives a virtual knee position associated with the avatar and uses it to refine the appearance of the avatar. Alternatively, the application program derives a virtual elbow position associated with the avatar and uses it to refine the appearance of the avatar.

The following drawings and description set forth the details of one or more implementations. Other features and advantages will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF THE DRAWINGS
FIG. 1 illustrates the hardware components and environment of one implementation of a video-based image control system.
FIG. 2 is a flow chart summarizing the processing techniques used by the system of FIG. 1.
FIG. 3 is a diagram illustrating the fields of view of the cameras associated with the video-based image control system of FIG. 1.
FIG. 4 illustrates a common point of interest and the epipolar lines as they appear in a set of stereo images produced by a stereo camera device.
FIG. 5 is a flow chart illustrating a stereo processing routine that generates scene description information from stereo images.
FIG. 6 is a flow chart of a process that converts scene description information into position and orientation data.
FIG. 7 is a graph illustrating the degree of reduction S applied to a position as a function of the distance D.
FIG. 8 illustrates an implementation of the image control system in which an object or hand detection region is established directly in front of a computer monitor screen.
FIG. 9 is a flow chart of an optional process for dynamically defining a hand detection region relative to a user's body.
FIGS. 10A-10C are examples illustrating the process of FIG. 9 for dynamically defining a hand detection region relative to the user's body.
FIG. 11A illustrates an example user interface and display region associated with the video-based image control system.
FIG. 11B illustrates a technique for mapping a hand or pointer position to the display region associated with the user interface of FIG. 11A.
FIG. 12A illustrates an example three-dimensional user interface in a virtual reality environment.
FIG. 12B illustrates the contents of a folder in the three-dimensional user interface of FIG. 12A being virtually removed from view.
FIG. 13A illustrates an example representation of a three-dimensional user interface for manipulating a virtual three-dimensional space.
FIG. 13B is a graph showing a coordinate region of the image control system that acts as a dead zone, within which no change of the virtual position is required.
FIG. 14 illustrates an example implementation of a video game interface in which movements and gestures are interpreted as joystick-style controls for flying through a virtual three-dimensional cityscape.
FIG. 15A is a diagram illustrating an example head detection region divided into detection planes.
FIG. 15B is a diagram illustrating an example head detection region divided into detection boxes.
FIGS. 15C and 15D illustrate an example head detection region divided into two sets of direction detection boxes, and further illustrate a gap defined between neighboring direction detection boxes.
Like reference symbols in the different drawings indicate like elements.

DETAILED DESCRIPTION

FIG. 1 illustrates an implementation of a video-based image control system 100. A person (or several people) 101 stands within, or reaches a hand into, a region of interest 102. The region of interest 102 is positioned relative to an image detector 103 so that it lies within the image detector's overall field of view 104. The region of interest 102 contains a hand detection region 105; when a part of a person's body appears and is detected there, the person's position and gestures are determined and measured. Regions, positions, and measurements are all described in a three-dimensional x, y, z coordinate system, or world coordinate system 106, which need not be aligned with the image detector 103. The series of video images produced by the image detector 103 is processed by a computing device 107, such as a personal computer, capable of presenting video images on a video display 108.

As described in more detail below, the computing device 107 processes the video image series in order to determine the position and gestures of an object, for example the user's hand. The resulting position and gesture information is then mapped to an application program, such as a graphical user interface (GUI) or a video game. The position and gestures of the user's hand are reflected on the video display 108 and allow functions within the GUI or video game to be executed and/or controlled. An example function moves a cursor onto a screen button and accepts a light press gesture to select the screen button; the computing device 107 then executes the function associated with the button. The image detector 103 is described in more detail below.
The system 100 can be implemented in many configurations, for example a desktop configuration in which the image detector 103 is mounted on the video display 108 and views the region of interest 102, or an overhead camera configuration in which the image detector 103 is mounted on a supporting structure above the video display 108 and positioned to view the region of interest.

FIG. 2 illustrates a video image analysis process 200, which may be implemented in computer software or computer hardware associated with a typical implementation of the system 100. The image detector, or stereo video camera, 103 captures stereo images 201 of the region of interest 102 and the surrounding scene. The stereo images 201 are passed to the computing device 107 (which may optionally be incorporated into the image detector 103), which performs a stereo analysis process 202 on the stereo images 201 to produce a scene description 203.
The computing device 107, or a different computing device, then applies a scene analysis process 204 to the scene description 203 to compute and output hand/object position information 205: the position or measurements of the person's hand, of another suitable pointing device, or of other features of the person. The hand/object position information 205 is a set of three-dimensional coordinates that is supplied to a position mapping process 207, which maps, or converts, the three-dimensional coordinates to a set of screen coordinates. The screen coordinates produced by the position mapping process 207 serve as screen-coordinate position information for an application program 208 that runs on the computing device 107 and provides user feedback 206.

Specific hand gestures may also be detected, that is, changes in the position of the hand and/or other features reflected in the hand/object position information 205, which are interpreted by a gesture analysis and detection process 209 as gesture information, or gestures, 211. The screen-coordinate position information of the position mapping process 207 is then passed along with the gesture information 211 and used to control the application program 208.

Where gesture detection is context sensitive, the gesture detection process 209 may make use of an application state 210, with the criteria and meaning of a gesture selected by the application program 208. One example of application state 210 is changing the representation of the cursor according to its position on the video display 108: if the user moves the cursor from one screen object to another, the image representing the cursor may change from a pointer image to a hand image. In general, the user receives feedback 206 as the imagery presented on the video display 108 changes. The feedback 206 is typically provided by the application program 208 and relates to the application's hand position and state on the video display 108.

For all of the objects composing the scene, or for a subset or portion of them, the image detector 103 and the computing device 107 produce the scene description information 203, which includes a three-dimensional position, or information from which a three-dimensional position can be derived.
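To make the data flow of FIG. 2 concrete, the following Python skeleton strings the numbered processes together. It sketches only the flow described above; the stage implementations are passed in as callables, and none of the names belong to an actual API.

```python
# Skeleton of the FIG. 2 processing loop, one callable per numbered process:
# stereo analysis 202, scene analysis 204, position mapping 207, and gesture
# detection 209. The return values drive the application 208 and its user
# feedback 206. This illustrates the data flow only; it is not the patent's code.

def process_frame(stereo_images, app_state, stereo_analysis, scene_analysis,
                  position_mapping, detect_gesture):
    scene_description = stereo_analysis(stereo_images)   # process 202 -> 203
    hand_position = scene_analysis(scene_description)    # process 204 -> 205
    if hand_position is None:                            # no hand or pointer found
        return None, None
    screen_coords = position_mapping(hand_position)      # process 207
    gesture = detect_gesture(hand_position, app_state)   # process 209 -> 211
    return screen_coords, gesture
```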
If an object has a shape or position inconsistent with what is expected of a person using the system in the intended manner, it can be excluded on the basis of what the stereo cameras of the image detector 103 observe, which removes some restrictions on the image detection environment. The environment may also contain people other than the person operating the system. This contrasts with systems that search the image for parts of the user's body, which can be confused by imagery that is static and/or mimics the user. In addition, because the three-dimensional form of the person and hand is what is recognized, no restrictive device needs to be attached to the user or the hand; the user 101 operates the system without even having to wear a glove. Compared with other systems that search on appearance, this is a particular strength of system 100: since the appearance of bodies and hands varies from person to person, a method that does not depend on the appearance of the user or the hand is more robust. Note, however, that some implementations of the stereo analysis process 202 that can be used with system 100 may also make use of such appearance representations.

The scene description information 203 is generally produced by stereo cameras. In this kind of system, the image detector 103 comprises two or more separate cameras operating together as a stereo camera head. The cameras may be black-and-white video cameras or color video cameras. Each individual camera views the scene from a unique viewpoint and produces its own series of video images. Using the relative positions of parts of the scene within each camera's images, the computing device 107 can calculate the distance between the objects of the scene description 203 and the image detector 103.

The following describes one implementation of the stereo-camera image detector 103 used by the system. Other stereo camera systems and algorithms can produce scene descriptions suitable for this system, and it should be noted that the invention is not limited to the particular stereo system described here.

In FIG. 3, the cameras 301 and 302 of the image detector, or stereo camera head, 103 observe the scene and produce images of it within the camera fields of view 304 and 305, respectively.
The overall field of view 104 is defined as the intersection of the individual fields of view 304 and 305. An object 307 within the overall field of view 104 can potentially be detected, wholly or partially, by both cameras 301 and 302. Because the scene description 203 is permitted to contain objects or object features outside the region of interest 102, the object 307 need not lie within the region of interest 102. With respect to FIG. 3, note that the hand detection region 105 is a subset of the region of interest 102.

Referring to FIG. 4, the images 401 and 402 of an image pair 201 are captured by the camera pair 103. Image 401 contains a set of lines, where each line 403 has a corresponding line 404 in the other image 402. Any common point 405 of the scene lying on line 403 will also lie on the corresponding line 404 of the second camera image 402, provided the point lies within the overall field of view 104 and is visible to both cameras 301 and 302 (that is, not occluded by other objects in the scene). The lines 403 and 404 are called epipolar lines. The difference in the positions of a point on a pair of epipolar lines is called the disparity. Disparity is inversely proportional to distance and provides the information needed to produce the scene description 203.

The epipolar line pairs depend on the image distortion of the cameras and on the geometric relationship between cameras 301 and 302. These properties are determined, and optionally analyzed, by a calibration procedure performed in advance. The system must account for the radial distortion introduced by the lenses used in most cameras. A technique for calibrating the camera characteristics that resolves radial distortion is described
in Z. Zhang, A Flexible New Technique for Camera Calibration, Microsoft Research, http://research.microsoft.com/~zhang, which is incorporated herein by reference, and serves as the first step of calibration. This technique does not find the epipolar lines, but it removes the distortion so that the epipolar lines become straight and therefore easier to find. Methods for the second step of calibration, solving for the epipolar lines, are described in Z. Zhang, Determining the Epipolar Geometry and Its Uncertainty: A Review, The International Journal of Computer Vision, 1997, and Z. Zhang, Determining the Epipolar Geometry and Its Uncertainty: A Review, Technical Report 2927, INRIA Sophia Antipolis, France, July 1996, both of which are incorporated herein by reference.

FIG. 5 illustrates an implementation of the stereo analysis process 202 that produces the scene description 203. The image pair 201 consists of a reference image 401 and a comparison image 402. An image filter 503 filters the individual images 401 and 402, which are broken into features at block 504. In this implementation, each feature is an 8x8 block of pixels, although features may be defined as pixel blocks larger or smaller than 8x8.

The matching process 505 finds a match for each feature of the reference image. To this end, a feature comparison process 506 compares each feature of the reference image with all features within a predefined range along the corresponding epipolar line of the second, or comparison, image 402. In this particular embodiment, a feature is defined as an 8x8 pixel block of image 401 or 402 that is expected to contain part of a scene object, represented by the pattern of pixel intensities within the block (the images having been filtered by the image filter 503, so that luminance is not represented directly). The likelihood that each feature pair is a match is recorded and indexed by disparity. A feature pair filter 507 eliminates a block of the reference image 401 if the likelihood of its best match is small (compared with a predetermined threshold), or if several feature pairs are equally likely to be the best match (features being regarded as similar when the difference in their likelihoods falls within a predefined threshold). For the remaining reference features, a neighborhood support process 508 scales the likelihoods of all feature pairs in favor of feature pairs whose neighboring reference features have the same disparity. For each reference feature, a feature pair selection process 509 then selects the feature pair with the best likelihood, providing a disparity (and therefore a distance) for each reference feature.
A reference feature (produced by process 504) that is not visible in the second, or comparison, image 402 because of occlusion will yield a best match that is erroneous. Therefore, in a two-camera system, the features selected in the comparison image 402 are checked by a similar procedure (a second, parallel matching process 510 containing processes 506, 507, 508, and 509) to find the best matching features in the reference image 401, with the previous roles of images 401 and 402 reversed. In a system with three cameras (that is, using a third camera in addition to cameras 301 and 302), the second camera image takes the place of the comparison image 402 while the original reference image 401 continues to serve as the reference image, and the same procedure (processes 506, 507, 508, and 509 within the second parallel matching process 510) determines the best matching features for the third image. If more than three cameras are available, the procedure can be repeated for the additional camera images. Any reference feature whose best matching feature does not in turn have that same reference feature as its own best match is eliminated in a comparison process 511. Many erroneous matches, and hence the erroneous distances caused by occlusion, are thereby eliminated.

The result of the above procedure is a depth description map 512, which describes the positions and disparities of the features relative to images 401 and 402. A coordinate system conversion process 513 converts the positions and disparities, using Equations 1, 2, and 3, to a three-dimensional world coordinate system (the x, y, z coordinate system 106 of FIG. 1).
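Equations 1, 2, and 3 themselves are not reproduced in this text. For a calibrated, rectified stereo pair, the conversion from a feature's image position (u, v) and disparity d to world coordinates conventionally takes the standard triangulation form below, where f is the focal length and b the baseline between cameras 301 and 302; this standard form is given for orientation and may differ from the patent's exact equations.

```latex
x = \frac{b\,u}{d}, \qquad y = \frac{b\,v}{d}, \qquad z = \frac{b\,f}{d}
```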
Because disparity is not linearly related to distance, it is difficult to work with disparity directly. The three equations are therefore generally applied at this point, so that the coordinates of the scene description 203 can be expressed as linear distances in the world coordinate system 106. Redistributing the feature coordinates changes the feature density within a region, which makes the feature clustering step (performed later) more difficult; for this reason, both the video-based coordinates and the converted coordinates are usually retained.

The converted depth description map produced by the conversion process 513 is the scene description 203 (of FIG. 2). The scene analysis process 204 makes this information meaningful and extracts useful data from it. In general, the scene analysis process 204 depends on the particular way the system is being used.

The flow chart of FIG. 6 outlines an implementation of the scene analysis process 204. In the scene analysis process 204, the features of the scene description 203 are filtered by a feature exclusion module 601 to exclude features whose positions indicate that they do not belong to the user or that they lie outside the region of interest 102. The module 601 also eliminates the background and other comparable distractions (for example, other people behind the user).
The region of interest 102 is generally defined as a bounding box aligned with the world coordinate system 106, in which case module 601 can simply check whether each feature coordinate lies within the box.

Part of the background may fall within the region of interest 102, or a box-shaped region may be unable to separate the user from the background (particularly in confined spaces). While no user is in the region of interest 102, the scene description 203 can optionally be sampled and refined by the background sampling module 602 to produce a background reference 603. The background reference 603 is a description of the shape of the scene, and it remains valid under changes that do not alter that shape (such as changes in brightness). It is therefore usually sufficient to sample the scene once, when the system 100 is installed; the reference remains valid for as long as the scene is unchanged. To ensure that the observed background stays within the shape defined by the background reference 603, the background sampling module 602 observes the scene description 203 for a short time and records, for every position, the distance closest to the camera 103. These recorded values are then extended toward the camera by a predetermined distance (generally a distance equivalent to one pixel of disparity variation at the feature's distance). Once sampling is complete, the background reference 603 is compared with each scene description 203, and the feature exclusion module 601 eliminates any feature of the scene description 203 that lies at or behind the background reference.
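A minimal sketch of this background-reference scheme follows, again as an illustrative outline rather than the disclosed implementation; the representation of positions as grid cells and the margin value are assumptions.

    def build_background_reference(samples, margin=2.0):
        """Record, for each (row, col) cell, the feature distance closest to
        the camera seen over several background-only scene descriptions, then
        extend it toward the camera by a predetermined margin (modules 602/603)."""
        reference = {}
        for scene in samples:                  # each scene: list of (row, col, distance)
            for row, col, distance in scene:
                key = (row, col)
                if key not in reference or distance < reference[key]:
                    reference[key] = distance
        return {key: d - margin for key, d in reference.items()}

    def exclude_background(scene, reference):
        """Keep only features in front of the background reference (module 601)."""
        return [(row, col, d) for row, col, d in scene
                if (row, col) not in reference or d < reference[(row, col)]]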
After the exclusion step, the next step is to group the remaining features into one or more feature sets using the feature clustering process 604. Each feature is compared with its neighboring features within a predefined range. Features tend to be more evenly distributed in their image coordinates than in the converted coordinates, so nearness is generally measured in image coordinates. The maximum acceptable range is predefined and depends on the particular stereo analysis implementation used, such as the stereo analysis process 202 described above; that process produces features of relatively uniform density and distribution, which makes this clustering easier than with other stereo processing techniques. Feature pairs that meet the criteria are treated as neighboring pairs; the neighbor check also applies a predefined, disparity-dependent range along the depth-sensitive axis (the x axis when the camera is in front of the region of interest, or the y axis when the camera is above it). If a pair of features is connected by a path of features in which every link satisfies the criteria, the group includes that pair even though the pair itself does not satisfy the criteria.

Continuing this implementation, the groups are filtered by the group filtering process 605 to ensure that each group represents an object within the region of interest 102, rather than features whose positions (or disparities) were determined erroneously by failures in the stereo processing. The group filtering process 605 of some implementations also eliminates groups that contain too few features, and provides confirming measurements of size, shape, and position: each group's area, bounding extent, and feature count are measured and compared with predefined thresholds describing the minimum acceptable quality. Groups that fail the criteria, and their features, are removed from further consideration.

A presence detection module 606 determines, in this implementation, whether a person is present. The presence detection module 606 is optional, since not all configurations require this information. In its simplest form, the presence detection module 606 needs only to check whether any feature (not previously eliminated) appears within the boundary of a predefined presence detection region 607. The presence detection region 607 is a region that some part of any user 101 can be expected to occupy, and that is unlikely to be occupied by any object when no user is present. Generally, the presence detection region 607 coincides with the region of interest 102; in particular installations of the system, however, the presence detection region 607 is defined so as to avoid stationary objects in the scene. In implementations that use this component, further processing can be skipped when no user 101 is found.

In the implementation of the system 100 described here, a hand detection region 105 is defined. The manner in which the region 105 is defined (through process 609) depends on the configuration in which the system is used, as described below. That process optionally analyzes the user's body and returns additional information, including body position and measurement information 610 such as the position of the person's head. The hand detection region 105 is expected to contain nothing, or only a person's hand or a suitable pointing object. Any group that has not been eliminated by filtering and that has features within the hand detection region 105 is taken to be, or to include, a hand or pointer. A position is calculated for each such group (through process 611), and if that position lies within the hand detection region 105 it is recorded (in memory) as a hand position coordinate 612, measured as a weighted average position.
The group feature farthest from the entry of the hand detection region (1002 in the example; identified as 1005 in the example of Fig. 10) is identified, and its position is given a weight of 1, on the assumption that it most likely represents the tip of the finger or pointer. The remaining group features are weighted according to their distance back from that entry, using the formula of Equation 4 below. If the application requires only one hand position and more than one group has features within the hand detection region 105, the position reaching farthest from the entry 1002 is provided as the hand position 612 and the other positions are discarded; the hand that can reach farthest into the hand detection region 105 is therefore the one used. Otherwise, if two or more groups have features within the hand detection region 105, the positions reaching farthest and next farthest from the entry 1002 are both provided as hand positions 612, and any remaining positions are discarded. When these rules cause one group to be included in place of a different group, the included group is flagged in the hand position data 612.

In configurations in which the camera can view the person's arm, the direction of the arm or pointer is represented as a hand direction coordinate 613, which can be calculated by the hand direction calculation module 614. Where the camera 103 is level with or above the hand detection region 105 (including configurations in which the camera 103 is mounted above the hand detection region 105), this direction can be represented by the principal axis of the group, calculated from the group's moments. The following method produces good results even when the features are not evenly distributed: the position at which the arm enters the hand detection region 105 is found, where the group is cut by the plane forming the boundary of the hand detection region 105; the vector between this entry position and the hand position coordinate 612 provides the hand direction coordinate 613.
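As one hedged illustration of the weighted-average position of process 611, the sketch below applies the weighting of Equation 4 as reconstructed at the end of this description; the list-of-tuples interface is an assumption made for the example.

    def hand_position(features, d_h):
        """Weighted average of a group's features inside the hand detection
        region, weighted per Equation 4.

        features -- list of ((x, y, z), d), where d is the feature's distance
                    from the region entry 1002
        d_h      -- predetermined distance representing the expected hand size
        """
        d0 = max(d for _, d in features)       # distance of the farthest feature
        total_w, acc = 0.0, [0.0, 0.0, 0.0]
        for (x, y, z), d in features:
            w = (d + d_h - d0) / d_h if d > (d0 - d_h) else 0.0   # Equation 4
            total_w += w
            acc[0] += w * x; acc[1] += w * y; acc[2] += w * z
        return tuple(c / total_w for c in acc)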
A dynamic smoothing process 615 may be applied to the hand position coordinates 612, the hand direction coordinates 613 (if solved), and any other body positions or measurements 610. Smoothing is the process of combining the current result with previously solved results so that positions remain steady from frame to frame. A particular smoothing is applied to position coordinate values, in which each coordinate component x, y, and z is individually and dynamically smoothed. Equation 5 below calculates the degree of damping S, which is adjusted dynamically and automatically according to the change in position. The distance thresholds D_A and D_B shown in Fig. 7 define three ranges of motion. For position changes smaller than D_A, the motion is damped heavily, by S_A, in region 701, reducing the tendency of a position to flip back and forth between two neighboring values (an artifact of image sampling). Position changes larger than D_B are damped only lightly, by S_B, in region 702, or not at all; this reduces or eliminates the lag and overshoot exhibited by other smoothing methods. For motion between D_A and D_B, region 703, the degree of damping is interpolated, so that the change between light and heavy damping is less noticeable. Equation 6 below is used to solve for the constant a, which is used in Equation 7 (below) to modify the coordinate. The result of the dynamic smoothing process 615 is the hand/object position information 205 of Fig. 2. Smoothing is not applied when process 611 flags the current position as belonging to a different group than the previous position, because the current position is then unrelated to the previous one.
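A sketch of this per-component dynamic smoothing follows. The interpolation of Equation 5 is as described above; the exponential form used here for Equation 6 is only one plausible reading of the garbled original, and is flagged as such in the code.

    class DynamicSmoother:
        """Per-component dynamic smoothing (process 615), following
        Equations 5, 6, and 7 of this description."""

        def __init__(self, d_a, d_b, s_a, s_b):
            self.d_a, self.d_b = d_a, d_b      # distance thresholds D_A, D_B
            self.s_a, self.s_b = s_a, s_b      # heavy/light damping S_A, S_B in (0, 1)
            self.value = None

        def update(self, raw, elapsed):
            """raw: new coordinate value; elapsed: time since previous sample."""
            if self.value is None:
                self.value = raw
                return raw
            dist = abs(raw - self.value)
            if dist < self.d_a:                # region 701: heavy damping
                s = self.s_a
            elif dist > self.d_b:              # region 702: light damping
                s = self.s_b
            else:                              # region 703: interpolated damping
                t = (dist - self.d_a) / (self.d_b - self.d_a)
                s = t * self.s_b + (1 - t) * self.s_a        # Equation 5
            a = min(1.0, max(0.0, 1.0 - s ** elapsed))       # Equation 6 (assumed form)
            self.value = a * raw + (1 - a) * self.value      # Equation 7
            return self.value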
The process 609 determines the hand detection region 105 according to the manner in which the image control system 100 is used; both manners are described here.

The simplest hand detection region 105 is a predetermined, fixed region that contains no objects, or only the person's hand or a pointing object. As shown in Fig. 8, this definition is used when the system 100 controls the user interface of a personal computer, with the hand detection region 105 being the region in front of the computer display monitor and above the computer keyboard 802. In traditional use of a computer, the user's hands and other objects do not normally enter this region. Any object moving within the hand detection region 105 can therefore be interpreted as the user 101 performing an action with a hand or pointer, where the pointer may be any object suitable for performing pointing gestures, for example a pencil or another suitable pointing device. The particular implementation of the stereo analysis process 202 may impose restrictions on the type or appearance of usable pointers. In addition, the optional presence detection region described above can be defined as region 801, so that it contains the user's head in this arrangement. The image detector 103 can be placed on top of the monitor 108.

In other configurations, the hand detection region 105 is defined dynamically, relative to the user's body, so that it contains no objects, or only the person's hand or pointer. The use of a dynamic region removes the restriction that the user occupy a predetermined position. Fig. 1 depicts a configuration that uses this implementation.

Fig. 9 details an implementation of the process 609 that selects the position of the dynamic hand detection region. In this process, the position of the hand detection region 105 is solved on each coordinate axis, while the size and orientation of the hand detection region 105 are fixed by a predefined specification. The example of Figs. 10A-10C is used to describe this process.

Using the group data 901 (the output of the group filtering process 605), the procedure of block 902 finds the position of a plane 1001 (a torso-split plane of this kind is shown in the side view of Fig. 10C) whose orientation is parallel to the boundary 1002 of the hand detection region 105 and through which the user 101 reaches. If the features are expected to be evenly distributed over the original image (as when the implementation of the stereo analysis process 202 described above is used), the majority of the remaining features will belong to the user's torso rather than the user's hand; in this case the plane 1001 can be positioned so that it splits the features into two groups of equal count. If the features are not expected to be evenly distributed (as with other implementations of the stereo analysis process 202), this assumption cannot be made.
Features on the outer boundary of the group, however, can still be expected to belong to the torso, and in that case the plane 1001 can be positioned so that it splits the outermost features into two groups of equal count. In either case, block 902 (the torso-split procedure) positions the plane 1001 so that it passes through the user's torso.

Block 903 determines the position of the hand detection region 105 on one coordinate axis, generally placing it relative to the plane 1001 found above: the hand detection region 105 is defined to lie a predetermined distance 1004 in front of the plane 1001, and is therefore positioned in front of the user's body. In the configuration of Fig. 1, the distance 1004 determines the position of the hand detection region 105 on the z axis.

If the user's head is entirely within the region of interest 102, the highest feature position of the group can be expected to represent the top of the user's head (and therefore the user's height), and it is found in block 904 of this implementation. In block 905, the position of the hand detection region 105 is set according to this head position, a predefined distance below the top of the user's head. In the configuration of Fig. 1, this predefined distance determines the position of the hand detection region on the y axis. If the user's height cannot be measured, or the group reaches the boundary of the region of interest 102 (meaning that the person extends beyond the region of interest 102), the hand detection region is positioned at a predefined height.

In many modes, it can be determined whether the user's left arm or right arm is associated with the hand detected by the position calculation block 611 of Fig. 6. Block 906 determines the position at which the arm crosses a predefined plane in front of the plane 1001; in general, this plane coincides with the hand detection region boundary indicated by 1002. If no features are close to this plane, but some features are found in front of it, those features occlude the crossing point, and the crossing position is assumed to lie behind the occluding features. Using the shortest neighbor distances between group features, each crossing point can be associated with a hand position.
The position of the middle of the user's body, and of the user's body boundaries, can also be found, in block 907. In general, when the features are evenly distributed, the mean position of all features serves as the body-center position; when evenly distributed features are not expected, the midpoint between the group's boundary positions is measured instead.

In block 908, the arm position found by block 906 is compared with the body-center position found by block 907. If the arm position is offset sufficiently to the left or right of the body-center position, the hand is taken to originate from the left or right shoulder of the user 101, respectively. If both hands are found but only one can be confidently labeled as left or right, the label of the other hand is implied. Hands are thus labeled left or right according to the structure of the group, which yields appropriate labels in the many configurations in which both hands are found, including when the left hand is positioned to the right of the right hand.
If a hand's arm is identified by block 908, the hand detection region 105 can be positioned (by block 909) so that it lies within the expected range of motion of that hand: the position of the hand detection region 105 on the remaining coordinate axis can be biased toward the identified arm, as defined by Equation 8 (below). If block 908 cannot identify the arm, or if otherwise preferred, the position of the hand detection region 105 on the remaining coordinate axis is located at the center of the user's body as found by block 907. In configurations in which both hands must be tracked, the hand detection region 105 is positioned at the center of the user's body.

Blocks 903, 906, and 909 each solve the position of the hand detection region 105 on one coordinate axis, and together they define the position of the hand detection region 105 in three-dimensional space. A dynamic smoothing process 910 smooths this position using the same method used by component 615 (Equations 5, 6, and 7); however, heavier damping is used in process 910. The smoothed position information output by the dynamic smoothing process 910, together with the predefined size and orientation information 911, completely defines the boundaries of the hand detection region 105. In the course of solving for the position of the hand detection region 105, blocks 905, 907, and 908 find various other body position measurements 913 of the user (process 610 of Fig. 6).

In summary, the implementation described in Fig. 6, including the optional components of Fig. 9, produces a description of the person in the scene (the hand/object position information 205 of Fig. 2) that includes the following information:

- user presence or absence, or user count
- for each user present:
  - the left and right boundaries of the body or torso
  - the center point of the body or torso
  - the head position (if the head is within the region of interest)
  - for each hand present:
    - the hand detection region
    - a left/right label (if detectable)
    - the fingertip position
    - the direction of the hand or forearm

Given known refinements of the analysis of the scene description 203, the implementation described here can describe the user in further detail (for example, identifying the elbow positions).
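Gathered into one record, the information enumerated above might be represented as follows; this sketch and its field names are illustrative assumptions, not structures named in the original description.

    from dataclasses import dataclass, field
    from typing import List, Optional, Tuple

    Vec3 = Tuple[float, float, float]

    @dataclass
    class HandInfo:
        fingertip: Vec3                    # hand position coordinate 612
        direction: Optional[Vec3] = None   # hand direction coordinate 613
        side: Optional[str] = None         # "left" / "right", if detectable

    @dataclass
    class UserInfo:
        torso_left: float                  # left boundary of the body or torso
        torso_right: float                 # right boundary of the body or torso
        torso_center: Vec3                 # center point of the body or torso
        head: Optional[Vec3] = None        # head position, if within the region of interest
        hands: List[HandInfo] = field(default_factory=list)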
The hand/object position information 205 is a subset of the information above, or further information derived from it, that allows the user to interact with and/or control application programs 208.
Control methods for three applications are described in detail below.

A variety of human gestures can be detected by processing the above information; this processing is not limited to the applications 208 or to the particular control analyses described below. Examples of such gestures are tracing a path in the air, or sweeping the hand to one side. In general, the gesture types detected by the gesture analysis and detection process 209 use the hand/object position information 205.
Gesture detection can use history-based techniques to determine the state of a gesture. The detection process 209 maintains a history of all changes in hand and body position. One method of detecting a gesture is to test whether the position history explicitly satisfies a set of rules. For example, a gesture of sweeping the hand to one side can be recognized when the following gesture detection rules are met:

1. The horizontal position changes by more than a predefined distance within less than a predefined time limit.
2. The horizontal position changes consistently over that period.
3. The change in vertical position over that period is smaller than a predefined distance.
4. The position at the end of the period is closer to (or exactly at) the boundary of the hand detection region than the position at the start.

Some gestures require several rule sets to be satisfied in an explicit order, whereby satisfying one set causes the system to change to a state in which a different rule set applies. For subtle gestures that such rules cannot capture, a Hidden Markov Model can be used, since such a model still allows a specific sequence of motions to be detected while also weighing the overall probability that the motion truly constitutes the gesture.
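The four rules above can be checked directly against the position history; the following sketch is an assumed illustration of such a test for a leftward sweep, with all thresholds supplied by the caller.

    def detect_sweep_left(history, min_dx, max_dy, max_time, region_left):
        """Test a candidate window of hand positions (oldest first) against
        the four sweep rules above. Each entry is (t, x, y)."""
        if len(history) < 2:
            return False
        (t0, x0, y0), (t1, x1, y1) = history[0], history[-1]
        moved_far_fast = (x0 - x1) > min_dx and (t1 - t0) < max_time    # rule 1
        xs = [x for _, x, _ in history]
        monotonic = all(a >= b for a, b in zip(xs, xs[1:]))             # rule 2
        ys = [y for _, _, y in history]
        level = (max(ys) - min(ys)) < max_dy                            # rule 3
        near_edge = abs(x1 - region_left) < abs(x0 - region_left)       # rule 4
        return moved_far_fast and monotonic and level and near_edge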
One implementation of this system provides a method of user interaction in which the user causes the representation of a pointer to move within an image (user feedback 206) presented on the video display 108. The pointer moves in a manner that reflects the motion of the user's hand. In one variation of the user interface, the pointer is displayed in front of the other graphics, and its motion is mapped to the two-dimensional space defined by the interface on the video display screen 108. This form of control is similar to the way a mouse is commonly used with a desktop computer. Fig. 11A depicts an example of the feedback image 206 of an application 208 that uses this type of control.

In the position mapping process 207, the hand position 205 detected by the scene analysis process 204 described earlier is mapped, by the method described below, to the position of a pointer or cursor 1101 overlaid on the screen image 206 presented on the video display 108. When a hand is detected and found to be within the hand detection region 105, the hand position 205 relative to the hand detection region 105 is mapped to the video display 108 before being passed to the application 208. One method of mapping the coordinates obtains the x coordinate through the application of Equation 9 (below), and the y coordinate through an analogous equation. As shown in Fig. 11B, the entire display area 1102 is represented by a sub-region 1103 contained entirely within the hand detection region 1104 (corresponding to the hand detection region 105). Positions within the sub-region 1103 (such as hand position 1105) are mapped linearly to positions within the display area 1102 (such as 1106). Positions outside the sub-region 1103 but still within the hand detection region 1104 (such as 1107) are mapped to the nearest position on the boundary of the display area 1102 (such as 1108). This reduces the chance that a user, while trying to move the cursor 1101 to a position near the display boundary, will accidentally move the hand out of the sub-region 1103. If both of the user's hands are detected within the hand detection region 105, one hand is selected in the position mapping process 207. In general, the hand that reaches farthest into the hand detection region 105 is selected; it can be identified because it has the greatest or least x, y, or z coordinate value (depending on the configuration of the system and the definition of the world coordinate system 106).
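A hedged sketch of this mapping, following the clamped linear form of Equation 9 as reconstructed at the end of this description:

    def map_to_screen(x_h, b_l, b_r):
        """Map a world-coordinate hand position to a 0..1 screen coordinate,
        clamping to the boundary as in Equation 9 (process 207)."""
        if x_h < b_l:
            return 0.0
        if x_h > b_r:
            return 1.0
        return (x_h - b_l) / (b_r - b_l)

    # Example: a sub-region spanning -0.1 m to 0.1 m maps onto the full screen width.
    # cursor_x = map_to_screen(x_hand, -0.1, 0.1)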
Applications that use this form of interaction typically present graphical representations of data or controls (for example, button 1109). The user is expected to cause the pointer 1101 to be positioned over one of these objects. This condition is detected by comparing the remapped pointer position 1106 with the boundary of the object's graphical representation (such as 1110); the condition holds when the pointer position lies within the object's boundary. The user optionally receives feedback indicating that the cursor is positioned over an object. Feedback can take many forms, including an audio signal and/or graphical changes of the cursor and/or object. The object under the cursor can then be activated, operated, or moved.

The user is expected to indicate an intent to activate, operate, or move the object by performing a gesture. In the implementation of the system described here, the gesture analysis process 209 recognizes gesture patterns in the hand positions, or in the other positions and measurements, provided by the scene analysis process 204 and/or the position mapping process 207. For example, the user can indicate an intent to activate the object under the cursor by holding the cursor over the object for longer than a predefined period. This form of gesture detection requires the application state 210, in particular the boundaries and/or states of the objects, to be fed back to the gesture analysis process 209. Because existing techniques can monitor the application state 210 unobtrusively, and the coordinates provided by the position mapping process 207 can emulate another interface device such as a computer mouse, applications need not be written specifically for this system.

In some configurations, the application state information 210 cannot be obtained and monitored. In this case, the gestures that indicate an intent to activate the object under the cursor include holding the hand still (hovering), or quickly poking the hand forward and back.
One method by which hovering is detected keeps a history of hand position changes, recording all hand positions between the most recent sample and a predefined time before it; this time represents the shortest period for which the user must hold the hand still. The minimum and maximum positions in each of the three coordinates (x, y, z) within this history are also found. If the hand appears in every sample of the history, and the distance between the minimum and maximum positions in each of the three coordinates is within a predefined threshold, the hover gesture is reported. The distance thresholds represent the largest amount of motion (or jitter, which the various components of the system are expected to introduce) permitted while the hand is held still. In configurations that emulate a mouse, reporting this gesture typically emulates clicking a mouse button. Gestures representing the remaining mouse operations, such as double-clicking, are detected at the same time and those operations emulated as well.
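A minimal sketch of the hover test just described, assuming a position history of timestamped samples (None where no hand was detected):

    def detect_hover(history, hold_time, thresholds):
        """Report a hover when the hand was present throughout the most recent
        hold_time interval and its extent on each axis stayed within the
        per-axis thresholds. history entries are (t, (x, y, z)) or (t, None)."""
        if not history:
            return False
        t_end = history[-1][0]
        window = [p for t, p in history if t >= t_end - hold_time]
        if not window or any(p is None for p in window):
            return False                  # the hand must appear in every sample
        for axis in range(3):
            values = [p[axis] for p in window]
            if max(values) - min(values) > thresholds[axis]:
                return False
        return True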
In addition, gestures that are not tied to the position of the pointer relative to an object can optionally be detected, and given meanings, related or unrelated to the application state, by the application. Applications that use this form of interaction generally do not explicitly use or display the user's hand or other positions; such applications are controlled, completely or in part, purely by the interpretations this system makes of the positions. Because the interpretations made by this system can emulate gestures performed with traditional user input devices, such as a keyboard or joystick, applications again need not be written specifically for the system.

Many useful interpretations depend directly on the absolute position of the hand within the hand detection region 105. One method of producing interpretations is to define boxes, planes, or other shapes. A state is activated when the hand position is found within a first box and was not within it immediately before (either because the hand position was elsewhere within the hand detection region 105, or because no hand had been detected). The state is maintained until the hand position is no longer found within a second box (or is beyond the boundary defined by a second plane), at which point the state is deactivated. The second box must contain the entire first box and is larger in size; using the larger second box reduces accidental activation or deactivation of the state when the hand position is near the box boundary. In general, the interpretation applied to this state depends on the intended use of the gesture. In one method, the state directly drives the on and off state of an activation: when emulating a keyboard key or a joystick fire button, the button is pressed when the state is activated and released when the state is deactivated. In another common method, only the transition of the state from off to on triggers the gesture; although the duration and the off state are not reported to the application, the state is still maintained so that the gesture is not repeated before the state turns off, and each instance of the gesture therefore requires a clearly deliberate action by the user. A third commonly used method triggers the gesture on the off-to-on transition and then re-triggers it periodically, at predefined time intervals, for as long as the state remains on; this emulates the way holding a keyboard key down causes the character to repeat in some applications.

One method of defining boxes or planes within the hand detection region 105 for the above technique is described as follows. By defining a first plane (1501 in Fig. 15A) and a second plane 1502 that split the hand detection region 105 into a fire region 1503 and a neutral region 1504 (the gesture reported while the hand is in the region 1505 between the planes depends on the hand's previous position, as described above), the technique detects a forward push of the hand, which can emulate the fire button on a joystick or cause an application to respond in the manner associated with pressing a joystick button (for example, firing a weapon in a video game).

Another method of defining boxes or planes within the hand detection region 105 for the above technique is described as follows. Fig. 15B depicts first-type planes 1506, 1507, 1508, 1509, which overlap at the corners and are defined to divide the hand detection region 105 into left, right, top, and bottom regions; the second-type planes are labeled 1510, 1511, 1512, 1513. The first-type and second-type plane pairs are processed separately. This combination of planes emulates four-direction cursor keys, where a hand in a corner activates two keys, which many applications interpret as the four secondary 45-degree (diagonal) directions. Fig. 15C depicts another method, for emulating abstract directions in applications that expect the four 45-degree direction states to be represented explicitly: boxes 1514, 1515, 1516, 1517 are defined for each of the four main (horizontal and vertical) directions, and boxes 1518, 1519, 1520, 1521 are defined for each of the secondary 45-degree (diagonal) directions. For clarity, only the first-type boxes are described; a gap is left between the boxes. Fig. 15D depicts the method of defining the neighboring boxes: the gap between first-type boxes 1522 and 1523 ensures that the user does not accidentally enter a box, while the gap 1524 is filled by second-type boxes 1525 and 1526, so that the system continues to report a gesture until the user clearly intends to enter the neighboring box. This combination of boxes can be used to emulate an eight-direction joystick pad.

Another class of gesture depends on motion instead of position, or on motion and position together. An example of this class is sweeping the hand to the left. This gesture can be used to signal an application to return to a previous page or state. By emulating the keyboard or mouse, this gesture can cause presentation software, PowerPoint in particular, to move to the previous slide of a presentation sequence. By emulating the keyboard or mouse, this gesture can likewise cause a Web-browsing user interface to perform the action associated with its Back button. Similarly, sweeping the hand to the right is a gesture that can be used to signal an application that the user wants to move to the next page or state; for example, this gesture causes presentation software to move to the next slide of the presentation sequence, and causes browser software to move to the next page.
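The box-and-state technique described above can be sketched as follows; the axis-aligned box representation is an assumption, and the one-shot interpretation shown is one of the three interpretations named above.

    class BoxState:
        """On/off state driven by an inner activation box and a larger outer
        deactivation box. Boxes are ((min x, min y, min z), (max x, max y,
        max z)); the outer box must contain the inner box."""

        def __init__(self, inner, outer):
            self.inner, self.outer = inner, outer
            self.on = False

        @staticmethod
        def _contains(box, p):
            lo, hi = box
            return all(lo[i] <= p[i] <= hi[i] for i in range(3))

        def update(self, hand):
            """Returns True on the off-to-on transition (one-shot interpretation)."""
            if hand is None:
                self.on = False
                return False
            if not self.on and self._contains(self.inner, hand):
                self.on = True
                return True               # e.g. emulate pressing a fire button
            if self.on and not self._contains(self.outer, hand):
                self.on = False
            return False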
"以下描述一種比先前所述方法更簡易,用於偵測將手擦 往左方姿態的方法,其係利用將手偵測區域105分割成透 過平面分割的區域之方法。將手偵測區域i 〇5最左部份上 的細條紋定義爲左邊區域。手位置的表現方式如以下三種 狀態: 1·出現手但手不在左邊區域内 2·出現手且手在左邊區域内 3 ·手偵測ΐ域内沒有出現手 上述從狀態1到狀態2的轉換導致偵測程序209進入一狀 怨中’並藉此啓動一計時器與等候下一次的轉換。若在一 事先決定的時間内觀察到狀態3的轉換,便已發生已回報 之將手擦往左方的姿態。此技術一般重覆對右邊,上邊與 較低邊,且因爲是在三維中發現手位置,所以也將重覆偵 測ulling the hand back。可利用手或軀幹的位置偵測上述 的所有姿態。 在本系統另一個變化中,使用者導致一個指示器或兩個 指示器的表示三維虛擬環境的表示(使用者反饋206)内移 動。立體裝置可提供反饋並藉此使各使用者可用肉眼觀看 建立深度幻覺的獨特影像,即使無法在許多形式中實踐此 系統類型,因此只可隨機使用。然而,也可能包含利用投 影轉換執行虛擬環境之物體的深度。圖12A,12B與13 A提 供執行此類型之使用方法的範例。 參考圖12A,以下描述一種方法,將由之前所述之景象 分析程序204所偵測之位置映射程序207内的手位置205映 -39- 本紙張尺度適用中國國家標準(CNS) A4规格(210X297公釐) 561423 A7 B7 五、發明説明(37 射到虛擬環境中定位指示器1 2〇1的位置。相關於手偵測區 域105的手位置205,在轉換到應用程式208前,由位置映射 程序207映射到相關於視頻顯示! 〇8的座標。一種映射座標 的方法透過方程式9的應用,得到X座標與y與z座標的等 式。除增加第三維之外,此方法與上述方法相似。 若使用者有運用所有三維内之指示器1201位置的能力, 則使用者101可導致指示器接觸如同眞實環境之虛擬環境 的物體(如1202)。此爲使用者與虛擬環境互動的方法。比 較表示爲立方體或球形之指示器與物體的邊界(如12〇3與 1204)。兩種邊界交叉的情形指示此指示器接觸到物體。 若有被安排好的物體,使用者可能可以導致指示器移動到 接觸物體的位置,其中指示器路徑可避免接觸其他任意物 體。因此,一個ouch信號表示活化,操作或移動物體的使 用者意圖。因此,不同於二維控制”指示器120 1的三維控 制消除開j台實施其中一項行動之明確姿態的需要。同樣 的,不同鉢二維控制,可在不同深度(如圖12a的檔案匣)安 排物體以提供一介面,其更類似使用者於眞實世界執行之 熟悉的行動。此外,不受限於相關於一物體之指示器丨2〇 j 位置的姿態可被隨機偵測以指示執行行動的意圖。 使用者利用本系統而可能航行於一虛擬環境中。相較於 一度出現於使用者反饋206中,透過讓使用者導致所示之 物體或資訊的可用選擇,航行允許使用者存取吏多的物體 或資訊。使用者101利用隨機形式的航行,漫遊於一虛擬 環境’且物體次集合與可用物體係依賴虛擬環境中的使用 -40- 本紙張尺度適用中國國家標準(CNS) A4規格(210X297公釐) 561423 A7" The following describes a method that is simpler than the previously described method for detecting the gesture of wiping the hand to the left, which uses the method of dividing the hand detection area 105 into areas divided by planes. The thin stripe on the leftmost part of the hand detection area i 〇5 is defined as the left area. The expression of the hand position is shown in the following three states: 1. The hand is present but the hand is not in the left area 2. The hand is present and the hand is in the left area 3. The hand does not appear in the hand detection area. The above transition from state 1 to state 2 results in The detection program 209 enters into a complaint 'and thereby starts a timer and waits for the next transition. If a state 3 transition is observed within a predetermined time, a gesture of wiping the hand to the left has already occurred. This technique generally repeats the right, upper, and lower edges, and because the hand position is found in 3D, it will also repeatedly detect the ulling the hand back. All the above gestures can be detected using the position of the hand or torso. In another variation of the system, the user causes one or two indicators to move within a representation (user feedback 206) of the three-dimensional virtual environment. Stereoscopic devices provide feedback and thereby allow each user to view the unique image of the illusion of depth created with the naked eye, even if this type of system cannot be practiced in many forms, so it can only be used randomly. However, it may also include the depth of objects that perform virtual environments using projection transformations. Figures 12A, 12B, and 13 A provide examples of how to perform this type of usage. Referring to FIG. 12A, the following describes a method that maps the hand position 205 in the position mapping program 207 detected by the scene analysis program 204 described above. -39- This paper size applies the Chinese National Standard (CNS) A4 specification (210X297). (Centi) 561423 A7 B7 V. Description of the invention (37 Shoot the position of the positioning indicator 1 2101 in the virtual environment. 
The hand position 205, relative to the hand detection region 105, is mapped by the position mapping process 207 to coordinates relative to the video display 108 before being passed to the application 208. One method of mapping the coordinates obtains the x coordinate through the application of Equation 9, with analogous equations for the y and z coordinates; apart from the addition of the third dimension, this method is similar to the method described above.

Given the ability to control the position of the pointer 1201 in all three dimensions, the user 101 can cause the pointer to touch objects (such as 1202) in the virtual environment as in the real world. This is one method by which the user interacts with the virtual environment. The boundaries of the pointer and of an object, represented for example as cubes or spheres (such as 1203 and 1204), are compared; an intersection of the two boundaries indicates that the pointer is touching the object. With suitably arranged objects, the user can cause the pointer to move to a position touching a chosen object along a pointer path that avoids touching any other object. A touch can therefore by itself signal the user's intent to activate, operate, or move the object, so that, unlike the two-dimensional control described earlier, three-dimensional control of the pointer 1201 removes the need for an explicit gesture to initiate one of these actions. Also unlike two-dimensional control, objects can be arranged at different depths (such as the drawers of the filing cabinet of Fig. 12A), providing an interface that more closely resembles the familiar actions users perform in the real world. In addition, gestures that are not restricted to the position of the pointer relative to an object can optionally be detected to indicate an intent to perform an action.
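Touch detection between the pointer and object boundaries reduces to a bounding-volume intersection test; the following sketch illustrates both the sphere and box cases mentioned above.

    def spheres_touch(center_a, radius_a, center_b, radius_b):
        """Touch test for a pointer and an object represented by bounding
        spheres: the boundaries intersect when the center distance does not
        exceed the sum of the radii."""
        d2 = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
        return d2 <= (radius_a + radius_b) ** 2

    def boxes_touch(box_a, box_b):
        """Touch test for axis-aligned bounding boxes ((min, ...), (max, ...))."""
        (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
        return all(lo_a[i] <= hi_b[i] and lo_b[i] <= hi_a[i] for i in range(3))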
Using this system, the user can also navigate within a virtual environment. Navigation allows the user to access more objects or information than can be presented at one time in the user feedback 206, by letting the user change which selection of the objects or information is shown. In one form of navigation, the user 101 roams through a virtual environment, and the subset of objects available depends on the user's position within the virtual environment. Fig. 13A shows an example in which the user can roam through a virtual space to reach any of a collection of objects, represented as storage spaces.

Referring to Fig. 14, one method by which a user roams a virtual environment renders the virtual environment from the viewpoint of a virtual camera, whereby the portion of the virtual environment that lies within the virtual camera's field of view, and is not hidden by any virtual object, is presented to the user. In one option, referred to as immersion, the camera position itself represents the user's position within the virtual environment. In another option, a separate pointer represents the user's position within the virtual environment; this pointer may take the form of an avatar representing the user 101 (shown on the video display 108). The virtual camera position is made to track the pointer, so that the pointer, and the objects relevant to the current user position, remain within the virtual camera's field of view.

The user's hand, body, or head position can each influence the user's virtual position while roaming. A position representing the center of the user's torso, or the top of the head, is found by some implementations of this system, in particular the implementation that fully performs the optional analysis process 609 described with Fig. 9. Using either of those positions leaves the user 101 free to perform roaming gestures independently of the hand positions, allowing the hands to touch virtual objects while roaming. Note that touchable objects may be fixed at positions relative to the virtual environment, or fixed relative to the virtual camera so that they are available to the user at all times. If no such position is available, the user's hand position is used to control roaming; in this case, the system can switch automatically to touch interaction when the user has roamed close to a touchable virtual object, or upon a predefined gesture.
To provide a region within which no change to the virtual position occurs, referred to as a dead zone, the position can be remapped by the application of Equation 10 (and similar equations for the y and z coordinates); this produces the relationship depicted in Fig. 13B. Note that the boundaries and neutral position may correspond to the hand detection region 105 and its center, or to another region that has been dynamically adjusted to fit the user.

When the torso or head is used, the boundaries and neutral position used by Equation 10 can be adapted to the user as follows. First, the neutral position x_c, y_c, z_c used by Equation 10 is made to correspond to the neutral position of the user's body; over repeated uses of the system, users will not stand in exactly the same place. After the user has been given time to enter the region of interest 102, the user's torso or head position is sampled and taken as the neutral position. A maximum range of motion, the distance over which the user is expected to move comfortably, is defined for each coordinate axis. To ensure that the user remains within the region of interest 102 even when moved to the end of a range, the boundaries are placed relative to the neutral position, each at a distance of half the maximum range of motion, where the maximum range is bounded by the extent of the region of interest 102 less a typical body size in each of the x, y, and z dimensions. The gestures above are based on the position and/or motion of the head or body torso; in this case, the region defined by these boundaries is used in place of the hand detection region 105.
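A sketch of this dead-zone remapping follows. It relies on the piecewise form of Equation 10 as reconstructed at the end of this description, which is only a plausible reading of the garbled original; the -1..1 output range is part of that assumption.

    def dead_zone_remap(x_h, x_c, half_width, b_l, b_r):
        """Remap a position to a -1..1 control value with a dead zone of
        half-width half_width around the neutral position x_c (the
        relationship of Fig. 13B)."""
        if x_h > x_c + half_width:
            return (x_h - (x_c + half_width)) / (b_r - (x_c + half_width))
        if x_h < x_c - half_width:
            return (x_h - (x_c - half_width)) / ((x_c - half_width) - b_l)
        return 0.0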
Horizontal motion of the user (along the coordinate axis labeled x in the example of Fig. 1) moves the view of the virtual environment to the left or right: the horizontal position transformed by Equation 10 is applied as the speed of rotation about the virtual vertical axis, causing the pointer or camera to yaw. The user's vertical motion (along the coordinate axis labeled y in the example of Fig. 1) can likewise tilt the virtual view up or down: the vertical position transformed by Equation 10 is interpreted directly as the angle of rotation about the horizontal axis used to orient the pointer and/or camera. Motion of the user toward or away from the display (along the coordinate axis labeled z in the example of Fig. 1) causes the virtual position to move forward or backward. One type of motion resembles walking: the pointer and/or camera is held at a predefined height above the virtual floor and follows the contour of the floor (rising up a staircase, for example); the transformed position is applied as the speed along a vector that is the projection of the pointer and/or camera direction onto the plane defined by the floor. Another type of motion approximates flying: if desired, the transformed position is applied as the speed along the vector defined by the pointer and/or camera direction itself. Fig. 14 depicts an example of navigating a virtual environment using the flying control described; the example uses the user torso position found by the methods described earlier, the mapping of Equation 10, and a suitable neutral position as described above.

Whichever method the user uses to control or roam the virtual environment, the pointer used within the virtual environment can take the form of an avatar. An avatar typically takes a human form, such as 1401 of Fig. 14. The positions found by this system provide sufficient information to animate a virtual human form.

This system finds both of the user's hands while they are within the hand detection region 105. These positions are remapped to positions in front of the avatar's torso, so that the avatar's hands touch the equivalents of the positions the user touches. While a hand is not within the hand detection region 105, it is not found or selected; in this case, the avatar's corresponding virtual hand can be moved to a neutral position at the avatar's body.

In implementations in which this roaming method is used, a control position is found relative to the neutral position.
In these implementations, the avatar's feet can be kept at fixed positions, and the control position can be used directly to determine the position (pose) of the avatar's torso over the fixed feet. Fig. 14 depicts an avatar controlled in this manner. In implementations that do not use roaming, the avatar torso position can be determined directly from the position representing the center of the user's torso, or the position relative to the top of the head, as found by the optional component 609.

The details of the other, secondary joint positions can be found through inverse kinematics techniques. In particular, the forearm direction data 613 can be used to constrain the inverse kinematics solution, locating the elbow in the neighborhood from which the forearm extends into the hand detection region 105. The direction data 613 constrains the elbow to a plane. The elbow position on that plane is found as the intersection of arcs whose radii represent the lengths of the avatar's upper and lower arm segments, one centered on the avatar's hand position (within the virtual environment) and the other centered on the point of the avatar's torso representing the shoulder. The application can likewise determine the avatar's knee positions: the avatar's feet are anchored at fixed positions, the plane into which each knee bends is fixed in a way that keeps the avatar's ankles untwisted, and the same intersection calculation as for the elbows determines the knee positions. In addition, with the fixed foot positions, an avatar pose can be computed in which the avatar leans in a desired direction. The above calculations yield avatar torso, hand, elbow, foot, and knee positions sufficient to animate the avatar.
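The arc-intersection placement of the elbow is the standard two-segment inverse kinematics construction; the sketch below is an illustrative assumption of how it might be computed, with bend_dir standing in for the plane constraint derived from the direction data 613.

    import math

    def elbow_position(shoulder, hand, upper_len, lower_len, bend_dir):
        """Place an elbow at the intersection of two arcs: radius upper_len
        about the shoulder and radius lower_len about the hand, within the
        plane spanned by the shoulder-hand line and bend_dir. All points are
        3-tuples; bend_dir is a unit vector selecting the bend plane."""
        diff = [h - s for s, h in zip(shoulder, hand)]
        norm = math.sqrt(sum(c * c for c in diff)) or 1e-9
        axis = [c / norm for c in diff]                  # unit vector shoulder -> hand
        d = min(norm, upper_len + lower_len - 1e-9)      # clamp an over-extended arm
        # distance along the axis from the shoulder to the elbow's foot point
        a = (upper_len ** 2 - lower_len ** 2 + d * d) / (2 * d)
        h = math.sqrt(max(0.0, upper_len ** 2 - a * a))  # elbow offset from the axis
        # keep only the component of bend_dir perpendicular to the axis
        dot = sum(b * c for b, c in zip(bend_dir, axis))
        perp = [b - dot * c for b, c in zip(bend_dir, axis)]
        pn = math.sqrt(sum(c * c for c in perp)) or 1.0
        perp = [c / pn for c in perp]
        return tuple(s + a * u + h * p for s, u, p in zip(shoulder, axis, perp))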
Equations

Equation 1

$$X = \frac{xI}{D}$$

where:
- $I$ is the distance between the cameras
- $D$ is the disparity
- $x$ is the image position
- $X$ is the world coordinate position
Equation 2

$$Y = \frac{(sFI \sin \alpha) + (Iy \cos \alpha)}{D}$$

where:
- $I$ is the distance between the cameras
- $D$ is the disparity
- $F$ is the average focal length
- $s$ is the unit conversion factor applied to the focal length
- $\alpha$ is the tilt angle between the camera and the world coordinate z-axis
- $y$ is the image position
- $Y$ is the world coordinate position
Equation 3

$$Z = \frac{(sFI \cos \alpha) - (Iy \sin \alpha)}{D}$$

where:
- $I$ is the distance between the cameras
- $D$ is the disparity
- $F$ is the average focal length
- $s$ is the unit conversion factor applied to the focal length
- $\alpha$ is the tilt angle between the camera and the world coordinate z-axis
- $y$ is the image position
- $Z$ is the world coordinate position
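As a numerical illustration of Equations 1–3 (not part of the patent; the function name, argument order, and sample values are invented for this sketch):

```python
import math

def feature_to_world(x, y, disparity, baseline, focal_px, tilt_rad):
    """Triangulate one matched feature pair into world coordinates using
    Equations 1-3. x, y: image position; disparity: offset between the
    matched features; baseline: camera separation I; focal_px: the scaled
    focal length s*F; tilt_rad: camera tilt angle alpha."""
    X = (x * baseline) / disparity
    Y = (focal_px * baseline * math.sin(tilt_rad)
         + baseline * y * math.cos(tilt_rad)) / disparity
    Z = (focal_px * baseline * math.cos(tilt_rad)
         - baseline * y * math.sin(tilt_rad)) / disparity
    return X, Y, Z

# Example: a feature at image position (40, -12) with a 25-pixel disparity,
# cameras 0.1 m apart, a 500-pixel scaled focal length, and a 30-degree tilt.
print(feature_to_world(40, -12, 25, 0.1, 500, math.radians(30)))
```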
Equation 4

$$w = \begin{cases} \dfrac{d + d_h - d_0}{d_h} & \text{if } d > (d_0 - d_h) \\ 0 & \text{otherwise} \end{cases}$$

where:
- $w$ is the weight, ranging from 0 to 1
- $d$ is the distance of this feature with respect to the hand detection region
- $d_0$ is the distance of the farthest feature with respect to the hand detection region
- $d_h$ is a predetermined distance representing the expected size of the hand
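A small sketch of how this weighting can be used to average candidate features into a single hand position (not from the patent; the data layout and averaging over all three coordinates are assumptions):

```python
def weighted_hand_position(features, d_hand):
    """features: list of ((px, py, pz), d) pairs, where d is the feature's
    distance with respect to the hand detection region. Features within
    d_hand of the farthest feature get weight per Equation 4; others get 0."""
    d0 = max(d for _, d in features)
    sums, total = [0.0, 0.0, 0.0], 0.0
    for (px, py, pz), d in features:
        w = (d + d_hand - d0) / d_hand if d > (d0 - d_hand) else 0.0
        sums[0] += w * px
        sums[1] += w * py
        sums[2] += w * pz
        total += w
    return tuple(c / total for c in sums) if total > 0 else None
```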
Equation 5

$$S = \begin{cases} S_A & \text{if } D < D_A \\ aS_B + (1-a)S_A, \text{ where } a = \dfrac{D - D_A}{D_B - D_A} & \text{if } D_A \le D \le D_B \\ S_B & \text{if } D > D_B \end{cases}$$

where:
- $D = |r(t) - s(t-1)|$
- $s(t)$ is the smoothed value at time $t$
- $r(t)$ is the raw (unprocessed) value at time $t$
- $D_A$ and $D_B$ are thresholds
- $S_A$ and $S_B$ define the degree of damping

Equation 6

$$\alpha = \frac{Se}{a}, \quad \text{bounded so that } 0 \le \alpha \le 1$$

where:
- $S$ is the damping found using Equation 5
- $e$ is the elapsed time since the previous sample
- $a$ is a constant

Equation 7

$$s(t) = \alpha\,r(t) + (1 - \alpha)\,s(t-1)$$

where:
- $s(t)$ is the smoothed value at time $t$
- $r(t)$ is the raw (unprocessed) value at time $t$
- $\alpha$ is a constant, where $0 \le \alpha \le 1$
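A sketch of the full adaptive smoothing loop combining Equations 5–7 (not part of the patent; in particular, the form `damping * e / k` for Equation 6 is an assumption, since that equation is only partly legible in this copy):

```python
def make_smoother(s_a, s_b, d_a, d_b, k):
    """Adaptive smoothing per Equations 5-7: strong damping (s_a) for small,
    jitter-sized changes, weak damping (s_b) for large, fast motions."""
    state = {"value": None, "time": None}

    def update(raw, now):
        if state["value"] is None:
            state["value"], state["time"] = raw, now
            return raw
        d = abs(raw - state["value"])             # D = |r(t) - s(t-1)|
        if d < d_a:                               # Equation 5, first case
            damping = s_a
        elif d > d_b:                             # Equation 5, third case
            damping = s_b
        else:                                     # Equation 5, middle case
            a = (d - d_a) / (d_b - d_a)
            damping = a * s_b + (1 - a) * s_a
        e = now - state["time"]                   # elapsed time since last sample
        alpha = min(max(damping * e / k, 0.0), 1.0)   # Equation 6 (assumed form)
        state["value"] = alpha * raw + (1 - alpha) * state["value"]  # Equation 7
        state["time"] = now
        return state["value"]

    return update
```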
方程式8 Λ: ^ βφι -bc) if ι6β _arm ^ β(br —bc) if right - arm ^ be if unknown 其中X爲此手偵測區域的位置 bc爲身體中央的位置 bl與br爲身體左與右邊界的位置· 々爲代表此手偵測區域位置偏向左或右邊之數量 的常數 方程式9 夂=Equation 8 Λ: ^ βφι -bc) if ι6β _arm ^ β (br —bc) if right-arm ^ be if unknown where X is the position of the hand detection area bc is the position of the center of the body bl and br are the left and right of the body The position of the right border · 々 is a constant equation representing the number of positions of the hand detection area to the left or right. 9 夂 =
if xh <bt if 其中Xh爲在此世界座標系統的手部位置if xh < bt if where Xh is the hand position in this world coordinate system
Equation 10

$$x_v = \begin{cases} -x_m & \text{if } x_h \le b_l \\ x_m \dfrac{x_h - (x_c - \frac{x_d}{2})}{(x_c - \frac{x_d}{2}) - b_l} & \text{if } b_l < x_h < (x_c - \frac{x_d}{2}) \\ 0 & \text{if } (x_c - \frac{x_d}{2}) \le x_h \le (x_c + \frac{x_d}{2}) \\ x_m \dfrac{x_h - (x_c + \frac{x_d}{2})}{b_r - (x_c + \frac{x_d}{2})} & \text{if } (x_c + \frac{x_d}{2}) < x_h < b_r \\ x_m & \text{if } x_h \ge b_r \end{cases}$$

where:
- $x_v$ is the velocity applied in the virtual coordinate system
- $x_m$ is the maximum velocity that may be applied in the virtual coordinate system
- $x_h$ is the position in the world coordinate system
- $x_c$ is the neutral position in the world coordinate system
- $x_d$ is the width of the "dead zone" in the world coordinate system
- $b_l$ and $b_r$ are the positions, in the world coordinate system, of the left and right boundaries of the sub-region within the hand detection region
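A matching sketch of this joystick-style velocity mapping (illustrative names; the dead zone of width $x_d$ is centered on the neutral position $x_c$):

```python
def velocity(x_h, x_c, x_d, b_l, b_r, x_m):
    """Equation 10: the hand's offset from the neutral position drives a
    velocity, with a central dead zone and saturation at +/- x_m."""
    lo, hi = x_c - x_d / 2, x_c + x_d / 2
    if x_h <= b_l:
        return -x_m
    if x_h >= b_r:
        return x_m
    if lo <= x_h <= hi:
        return 0.0                                # inside the dead zone
    if x_h < lo:
        return x_m * (x_h - lo) / (lo - b_l)      # negative, ramps to -x_m at b_l
    return x_m * (x_h - hi) / (b_r - hi)          # positive, ramps to +x_m at b_r
```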
A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made. Accordingly, other implementations are within the scope of the following claims.
Claims (1)
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US22022300P | 2000-07-24 | 2000-07-24 |
Publications (1)
Publication Number | Publication Date |
---|---|
TW561423B true TW561423B (en) | 2003-11-11 |
Family
ID=32392307
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW90118059A TW561423B (en) | 2000-07-24 | 2001-07-24 | Video-based image control system |
Country Status (1)
Country | Link |
---|---|
TW (1) | TW561423B (en) |
- 2001-07-24: TW TW90118059A patent/TW561423B/en not_active IP Right Cessation
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9373169B2 (en) | 2010-01-12 | 2016-06-21 | Koninklijke Philips N.V. | Determination of a position characteristic for an object |
CN103250124A (en) * | 2010-12-06 | 2013-08-14 | 三星电子株式会社 | 3 dimensional (3D) display system of responding to user motion and user interface for the 3D display system |
CN102681656A (en) * | 2011-01-17 | 2012-09-19 | 联发科技股份有限公司 | Apparatuses and methods for providing 3d man-machine interface (mmi) |
CN102681656B (en) * | 2011-01-17 | 2015-06-10 | 联发科技股份有限公司 | Apparatuses and methods for providing 3d man-machine interface (mmi) |
US9632626B2 (en) | 2011-01-17 | 2017-04-25 | Mediatek Inc | Apparatuses and methods for providing a 3D man-machine interface (MMI) |
US9983685B2 (en) | 2011-01-17 | 2018-05-29 | Mediatek Inc. | Electronic apparatuses and methods for providing a man-machine interface (MMI) |
TWI488068B (en) * | 2012-03-20 | 2015-06-11 | Acer Inc | Gesture control method and apparatus |
CN103365401A (en) * | 2012-03-29 | 2013-10-23 | 宏碁股份有限公司 | Gesture control method and gesture control device |
CN103365401B (en) * | 2012-03-29 | 2016-08-10 | 宏碁股份有限公司 | Gestural control method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8963963B2 (en) | Video-based image control system | |
Leibe et al. | The perceptive workbench: Toward spontaneous and natural interaction in semi-immersive virtual environments | |
EP2287708B1 (en) | Image recognizing apparatus, operation determination method, and program | |
US6775014B2 (en) | System and method for determining the location of a target in a room or small area | |
CN104102343B (en) | Interactive input system and method | |
US20170235377A1 (en) | Systems and methods of creating a realistic grab experience in virtual reality/augmented reality environments | |
O'Hagan et al. | Visual gesture interfaces for virtual environments | |
CN103677240B (en) | Virtual touch exchange method and virtual touch interactive device | |
Leibe et al. | Toward spontaneous interaction with the perceptive workbench | |
JP7026825B2 (en) | Image processing methods and devices, electronic devices and storage media | |
JP2011022984A (en) | Stereoscopic video interactive system | |
TW201214266A (en) | Three dimensional user interface effects on a display by using properties of motion | |
CN110448898B (en) | Method and device for controlling virtual characters in game and electronic equipment | |
EP1292877A1 (en) | Apparatus and method for indicating a target by image processing without three-dimensional modeling | |
WO2016166902A1 (en) | Gesture interface | |
WO2017000917A1 (en) | Positioning method and apparatus for motion-stimulation button | |
CN109564703A (en) | Information processing unit, method and computer program | |
TW561423B (en) | Video-based image control system | |
WO2019127325A1 (en) | Information processing method and apparatus, cloud processing device, and computer program product | |
Yoo et al. | 3D remote interface for smart displays | |
CN112292656B (en) | Image display system, image display method, and computer-readable recording medium storing computer program | |
Lacolina et al. | Natural exploration of 3D models | |
CN114740997A (en) | Interaction control device and interaction control method | |
JP4186742B2 (en) | Virtual space position pointing device | |
Gope et al. | Interaction with Large Screen Display using Fingertip & Virtual Touch Screen |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
GD4A | Issue of patent certificate for granted invention patent | ||
MK4A | Expiration of patent term of an invention patent |