TWI248021B - Method and system for correcting out-of-focus eyesight of attendant images in video conferencing - Google Patents

Method and system for correcting out-of-focus eyesight of attendant images in video conferencing

Info

Publication number
TWI248021B
TWI248021B TW90106574A
Authority
TW
Taiwan
Prior art keywords
conference
image
video
camera
participant
Prior art date
Application number
TW90106574A
Other languages
Chinese (zh)
Inventor
Reuven Ackner
Timothy Lo Fong
Original Assignee
Wistron Corp
Acer Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wistron Corp, Acer Inc
Priority to TW90106574A
Application granted
Publication of TWI248021B


Abstract

A method for correcting the out-of-focus gaze of participant images in video conferencing is introduced. Multiple cameras are disposed around a display device on one side of a video conference and are used to capture images of a conference participant. A host system analyzes and integrates the images captured by the cameras to generate a frontal image of each participant, which is transmitted and displayed on a display device on the other side of the video conference. The out-of-focus gaze problem of participant images is thereby eliminated.

Description

IX. Description of the Invention

[Technical Field]

The present invention belongs to the field of video conferencing and related digital network applications, and relates in particular to a video conferencing method and apparatus for determining a participant's exact position relative to the camera arrangement.

[Prior Art]

As communication technology has advanced, communication applications carried over Wide Area Networks (WANs) such as the well-known Internet, including remote teleconferencing, have become practical and widespread. Remote video conferencing, for example, has improved markedly over the past several years: cameras, equipment and conferencing software now deliver video that is far clearer and smoother than before, and audio and image quality have risen substantially. In many cases the main drivers of this improvement have been advances in video codec methods, data compression techniques and bandwidth reservation schemes, but hardware, including cameras and similar devices, has also had a considerable influence on video quality.

In current video conferencing, two or more users with Internet-capable equipment such as a personal computer, a camera and the necessary software establish a connection over the network, so that each user can communicate with the other participants through a video and audio representation. User images are shown in framed viewing windows on each participant's display, which may be a cathode ray tube (CRT) display, a flat panel display or another suitable display interface.

In use, a user receives image and sound from another user over an open channel. If a third party joins the conference, a second channel must be opened and the first placed on hold. Video conferencing may also allow several parties to participate at once. While a conference is in progress, suitable software lets any user see all the other participants in separate windows on his or her display. The limit on the number of participants is a problem most video conferencing systems face, and its severity depends on the network facilities involved and on the complexity of the mix of software and hardware.

A common problem with prior-art video conferencing systems lies in the camera arrangement: a single camera is aimed in one direction at each participant. Only when the user looks at the camera does the remote participant see a face-on image of that user. If the user's gaze wanders elsewhere, the remote participant may feel uncomfortable, much as when talking to someone whose eyes never meet one's own. In other words, the transmitted image exhibits a kind of gaze misdirection, the out-of-focus gaze (or "out-of-focus eyesight") effect. The same problem arises in television production, where the usual remedy is to have the subject concentrate on the camera that is currently recording. If the subject cannot do so, one camera can capture the user while the user looks toward another camera; this workaround is tolerable, but the effectiveness of the resulting image or message is inevitably reduced.

The discomfort that this visual instinct produces can be eased by how the conferencing system is operated, but a video conferencing system in which every participant's image keeps natural eye contact, with the gaze not drifting to the side, would greatly improve the effectiveness of video conferences.

The gaze problem arises because the user does not look at the camera while talking but at the image of the person being addressed. The gaze moves away from the camera to the window of whichever remote participant is of interest; in a multi-party conversation the gaze may shift among different windows and never return to the camera. This not only produces the out-of-focus gaze effect but also considerably affects how credible the exchanged messages appear.

Some companies have tried to integrate the camera into the display or display screen to lessen the effect. Placing the camera at the center of the display, for example, shortens the distance from the camera to the side of any viewing window; this reduces the gaze effect but cannot eliminate it. If the display screen is large and the corresponding window sits near the edge of the display, the effect is likely to remain quite noticeable.

Another approach provides a special screen, similar to a projection screen, that minimizes the gaze effect by having one camera mirror the image through the screen. Such a light-sensitive system requires components that are compact and sealed. Moreover, if the screen area is larger than that of a standard personal computer display and the corresponding window lies near the screen edge, the gaze effect still cannot be avoided.

Other methods use two cameras together with stereoscopic software to position a virtual camera within the display screen; one such example is described in an earlier U.S. patent. Even so, that solution is not perfect.

In summary, what is clearly needed is a method and apparatus that can produce a virtual camera effect: a virtual camera whose position can be set anywhere on the video conferencing screen. When the user is not looking at a real camera, the virtual camera locks onto and follows the user's gaze, so that a face-on image stream of the user can be produced and transmitted to the remote counterparts, solving the gaze problem caused by angled shots.

[Summary of the Invention]

The present invention provides a method for correcting the out-of-focus gaze of participant images in a video conference in which each of the two parties has a host connected through a network and a display for showing the other party's image. The method comprises: (a) disposing around the display at least two cameras electrically connected to the host, so that images of a conference participant are captured simultaneously; (b) having the host analyze and integrate the images captured by the cameras to produce a frontal image of the conference participant; and (c) having the host transmit the frontal image for display on the other party's display.

The invention further provides a video conferencing system for connecting with a remote video conferencing system to conduct a video conference. The system includes a display, at least two cameras and a host. The cameras are disposed around the display and capture images of a conference participant simultaneously. The host, electrically connected to the display and the cameras, analyzes and integrates the images captured by the cameras to produce a frontal image of the conference participant and transmits it to the other party of the video conference.
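The three claimed steps lend themselves to a simple host-side loop. The sketch below is only an illustrative reading of the summary above, not the patented implementation: the Frame type, the camera grab callables, the synthesize_frontal_view routine and the send_to_remote_display call are all hypothetical stand-ins for whatever capture, image-integration and transmission components an actual system would use.

    # Illustrative sketch of the claimed capture / integrate / transmit loop.
    # All names below are assumed stand-ins, not components named by the patent.
    from dataclasses import dataclass
    from typing import Callable, List, Sequence

    @dataclass
    class Frame:
        camera_id: int      # which of the two or more cameras produced the frame
        pixels: bytes       # raw image data from that camera
        timestamp: float    # capture time, so frames can be matched across cameras

    def conference_loop(
        cameras: Sequence[Callable[[], Frame]],                   # step (a): cameras around the display
        synthesize_frontal_view: Callable[[List[Frame]], bytes],  # step (b): host-side integration
        send_to_remote_display: Callable[[bytes], None],          # step (c): transmit to the far end
        stop: Callable[[], bool],
    ) -> None:
        """Capture from every camera at once, fuse a frontal image, transmit it."""
        while not stop():
            frames = [grab() for grab in cameras]       # (a) simultaneous capture
            frontal = synthesize_frontal_view(frames)   # (b) analyze and integrate
            send_to_remote_display(frontal)             # (c) show on the other party's display

In a real system the integration step would run on the host's processing unit and the transmission would go over the network link described below; here both are reduced to function parameters so that the control flow of claims 1 and 9 stays visible.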

[Embodiments]

FIG. 1 is a block diagram of a video conferencing system 100 according to the prior art. System 100 has a display 103, which may be a CRT display, a flat panel display or any other standard display suitable for video conferencing. In this prior-art example, display 103 is a standard display used with a personal computer. Display 103 may also be a videophone, a network television terminal or any other appliance capable of network video conferencing.

Most prior-art video conferencing systems provide an external (peripheral) camera 101. In the example shown, camera 101 is mounted on top of display 103. In some newer prior-art embodiments, camera 101 may be placed inside screen 103 as an integrated unit.

While a video conference is in progress, one or more windows may appear on screen 103. Content windows 105a-c on screen 103 correspond to the ongoing conference. At least one of windows 105a-c displays the video and audio of a remote counterpart (a conference participant). Other windows may display other remote participants (if there is more than one), text or a text dialogue, or other messages related to the particular conference.

For a participant's face-on image to be shown on screen 103, the user must face camera 101. If the user's gaze turns to the content of another window, the other users participating in the conference cannot see a face-on image of this user. This anomaly is the out-of-focus gaze effect described in the background of the invention. If camera 101 is integrated into screen 103 at a fixed position within the screen, such as the center, the gaze effect is reduced but cannot be eliminated.

The present invention solves the problem by providing several cameras and a software application that together produce a virtual camera, one that can shoot from any position within screen 103 and whose position can be adjusted as required. The method and apparatus of the invention are described further below.

FIG. 2 is a block diagram of a video conferencing system 200 according to an embodiment of the invention while a video conference is in progress. Display 202 in this embodiment is the same kind of personal computer CRT display as display 103 depicted in FIG. 1. Display 202 may also be the display of a network appliance used for video conferencing, such as the screen of a videophone or a network television terminal.

Unlike the single-camera arrangement 101 of the prior-art system 100 of FIG. 1, the invention provides several cameras 201a-d around screen 202. Cameras 201a-d may be placed at positions other than those shown in FIG. 2 without departing from the spirit and scope of the invention. The number of cameras may also be smaller or larger than the four described here, with a minimum of two. The point of providing several cameras in embodiments of the invention is that they make different virtual positions available. In one embodiment, only two cameras, 201d and 201b, may be arranged so that they face each other. In a preferred embodiment, three or more cameras 201 are installed; the advantage is that more cameras provide the system with more input data, as detailed further below.

The installation of cameras 201a-d is not restricted to particular positions; in general it is enough that they surround screen 202. At each installation position, once a camera 201a-d has been focused on the user, its shooting angle allows line-of-sight recording.

Screen 202 of FIG. 2 carries three video display windows 203a-c. Windows 203a-c are similar to the viewing windows of the prior-art system 100 of FIG. 1. For example, the three windows 203a-c may show the images of the user operating the system of the invention in a video conference and of the user's remote counterparts. In one embodiment, window 203a shows the image of a remote counterpart, window 203b shows text, and window 203c shows another form of text dialogue such as a chat or messaging system. In short, many combinations of displayed content are possible.

A particular object of the invention is to combine specific image data from the four cameras 201a-d so as to create a virtual camera at a chosen position within the screen area of the display. This virtual camera is not a real camera; it recreates the user's image from the sequence of data produced by combining the real-time camera image input signals from cameras 201a-d. Other data, such as known inputs and variables, for example the coordinate positions of windows 203a-c in the conference currently in progress and related information, are also integrated into the virtual camera's computation sequence.

In practicing the invention, if the user of system 200 is looking at window 203b, that window is taken to be the active window. Window 203b may, for instance, be displaying text. Ideally, a user looking at window 203b is not looking at any of the cameras 201a-d. Even so, cameras 201a-d, each aimed at the user, record individual and complete images of the user from their respective angles and positions. In addition, from the known shooting angle of each camera 201, the face-to-face distance between the user and each individual camera can be determined.

While the user looks at window 203b, the four cameras 201a-d feed their individual data streams to a processing device (not shown) to be combined and analyzed. Using the combined and computed data, a virtual image of the user is produced and sent to the other remote counterparts; the resulting image appears as though a camera had been placed at the point of the user's gaze. In this example the virtual camera is placed at the center point of window 203b, simulating the recording of a face-to-face sequence.

Note that in this embodiment the position of the virtual camera is adjusted to lie near the center of the active window. If another window becomes the active window, the virtual camera can move to the new active window as soon as that window's coordinate data have been obtained. The invention is most effective precisely in situations where the user may look at a different active window at any moment, or opens a new window and then looks at it.

As described above, once the virtual camera effect is in operation the video stream is transmitted to the other remote counterparts, and even if the user's gaze strays from one region of the screen to another, the video appears as though produced by a mobile camera that always faces the user. System 200 thereby eliminates the out-of-focus gaze effect to which most prior systems are prone. How the individual inputs are combined and computed to produce this virtual camera is detailed below.

FIG. 3 is a block diagram of the interaction between the video conferencing system 200, while a conference is in progress, and a video code processing unit 301 upgraded with software 302, according to an embodiment of the invention. System 200 uses processing unit 301 to encode and decode the video codes. Processing unit 301 may be any processing hardware capable of processing and buffering digital video data under software control.

Processing unit 301 may be a computer processing unit, a modified telephone unit, or any other connected processor with sufficient processing capability to handle video streams. In this embodiment processing unit 301 contains the elements necessary for video and audio processing as well as the elements expected of prior-art systems. These elements include, but are not limited to, video capture components, video device drivers, sufficient random access memory (RAM), a video card and the like.

As shown in the figure, the four cameras 201a-d are connected to the processing unit by direction lines 305, so that processing unit 301 receives an individual video data stream from each of the four cameras 201a-d. Element 303 of FIG. 3 provides a connection to a network and serves as the communication network required in prior systems. In this embodiment, link 303 is an Internet link, which may be a land-line communication circuit or a fixed wireless link. In one embodiment, link 303 may be a local area network link, the local area network being further connected to a wide area network or to the Internet over a two-way communication link. Generally speaking, link 303 represents any two-way communication link that can be established between the communication network and a node practicing the invention. As indicated by the arrow on link 307, video output channel 307 carries the input video data from processing unit 301 for display on screen 202. In most cases link 307 is determined by the computer hardware architecture. In other embodiments, such as a videophone, all of the mentioned elements, including system 200, may be contained in a single physical unit.

Besides providing the typical video processing elements of system 200 for transmitting and receiving video and audio over the network connection, the invention provides an improved field application software 302, which can be integrated with the typical video and audio processing software in processing unit 301.

Application software 302 contains the required routines; through direction lines 305 it receives and combines the data from cameras 201a-d (direction lines 305 also carry other known data inputs) and performs the computations that yield the virtual data and hence the virtual image stream described above. After entering processing unit 301, the resulting image stream is transmitted over link 303 and the appropriate network facilities.

Field computation is a well-known technique in holographic imaging; it uses calculation to synthesize, virtually, the image of an object in three-dimensional space as seen from a particular viewpoint. Such fields are commonly applied in holographic recording and stereoscopic photography, where they are used to compute the interference patterns associated with an image so as to represent a holographic image from a specific viewpoint.

Application software 302 in this embodiment is an improved field application software. Unlike conventional field application software, the field application software of the invention outputs a virtual image according to the required virtual position and the direction of the virtual camera. The direction of the virtual camera is obtained from the data of several cameras, such as cameras 201a-d. Central processing units in current use, such as the Intel Pentium II(TM) and comparable AMD processors, have built-in accelerator functions that improve the field computation capability.

In a typical implementation of the invention, a user operating system 200 holds a video conference with at least one other remote user. While the conference is in progress and content windows such as windows 203a-c appear on screen 202, the user's gaze easily wanders among them. If windows 203a-c all carry video streams representing remote counterparts, the user will search among the windows for whichever is the active window of the current exchange.

Suppose the user operating system 200 is interacting with the remote counterpart in window 203a; during this period the user is very likely looking at window 203a. The interaction may be a conversation, or simply listening to the remote counterpart shown in window 203a. Window 203a is known to the computer as the currently active window. During the period this sequence represents, the four cameras 201a-d record individual video streams of the interaction with the user. Each video stream is carried over a separate link 305 from its camera to processing unit 301 for processing.

In this example, the coordinate position of window 203a on screen 202 is known. In some embodiments, a click by the user determines that window 203a is the active window. Application software 302 then integrates the coordinates of window 203a into its computation. Application software 302 produces an image whose effect is as though a camera had shot it directly from the virtual position, even though the image is simulated from the data supplied by the four cameras 201a-d. The corresponding coordinate data of window 203a are combined with the coordinate data of the images obtained by all or some of cameras 201a-d.

In this example, to produce a face-on, closest-range image of the user, the data obtained from cameras 201d and 201c are already sufficient to produce a face-on virtual image. In terms of facial position, for instance, camera 201d may show the user's gaze deviating to the right and dropping gradually at an angle, while camera 201c may show the gaze deviating to the left and rising gradually at an angle. From the standpoint of distance, the remaining cameras are too far from window 203a, which is receiving the user's gaze. Only cameras 201d and 201c can therefore be turned into useful data. By the same reasoning, if window 203b were the active window receiving the user's gaze, data from all of cameras 201a-d would be usable.

Application software 302 turns the input data streams, variables and constants into a new image stream. The new image stream contains new pixel values derived from the real image data of the two individual image streams, the new pixel values being taken from cameras 201d and 201c, the cameras whose captured images are most suitable. The resulting, or virtual, data stream is transmitted over link 303 to all the other remote counterparts. Its video effect is as though the user were truly facing the camera; judged from the shooting angle of the image, one would infer that the camera sits directly behind window 203a.

In another embodiment, the position of the on-screen cursor is also related, for the application software, to the virtual position of the video camera. In yet another embodiment, practicing the invention does not necessarily require an active window to serve as the reference coordinates attracting the user's gaze. In this embodiment, by comparing the user's facial images from the real cameras with a previously entered face-on facial image, the direction of tilt and the angles of the user's face about the x, y and z axes can be recognized, which improves the effectiveness of application software 302. Application software 302 can therefore predict the direction of the user's gaze at any time. Combining the views from the four cameras 201a-d then yields the correct coordinates of any region of screen 202 that receives the user's gaze. The user is thus not limited to looking at the aforementioned active window or object on screen 202 and may change the direction of gaze over screen 202 at will. The user may look at regions of screen 202 that have little to do with the system, at working regions such as the taskbar or side regions, or even at regions unrelated to screen 202, such as the keyboard. Given sufficient processing power, application software 302 can compute, in real time, the virtual data stream that represents the user facing the camera.

It will be apparent to those skilled in the art that the cameras in the method and apparatus of the invention may be external cameras, as described above, or may be integrated into the display screen or display, without departing from the spirit and scope of the invention.
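The walk-through above selects the cameras nearest the active window and derives new pixel values from their two streams. A minimal sketch of that selection-and-blend idea follows, assuming fixed, known camera positions at the screen corners; the plain average stands in for the field-based synthesis the patent attributes to application software 302, whose internals the text does not spell out, and the frames are assumed to be already registered to a common geometry.

    # Hedged sketch: pick the two cameras nearest the active window and blend their frames.
    # CAMERA_POSITIONS and the averaging step are assumptions, not the patented algorithm.
    import numpy as np

    CAMERA_POSITIONS = {          # assumed screen coordinates (pixels) of the four cameras
        "201a": (0, 0),           # top-left
        "201b": (1024, 0),        # top-right
        "201c": (0, 768),         # bottom-left
        "201d": (1024, 768),      # bottom-right
    }

    def pick_cameras(active_window_center, k=2):
        """Return the ids of the k cameras closest to the active window's centre."""
        cx, cy = active_window_center
        by_distance = sorted(
            CAMERA_POSITIONS,
            key=lambda cid: (CAMERA_POSITIONS[cid][0] - cx) ** 2
                            + (CAMERA_POSITIONS[cid][1] - cy) ** 2,
        )
        return by_distance[:k]

    def blend_views(frames, active_window_center):
        """Fuse the two most relevant camera frames into one 'virtual camera' frame.

        frames maps camera id -> HxWx3 uint8 array; the arrays are assumed to be
        registered to a common geometry already, which is the hard part the patent
        assigns to its field-computation software.
        """
        chosen = pick_cameras(active_window_center)
        stack = np.stack([frames[cid].astype(np.float32) for cid in chosen])
        return stack.mean(axis=0).astype(np.uint8)

With the active window near the bottom of a 1024x768 screen, pick_cameras((512, 700)) would return the two lower cameras, 201c and 201d, which matches the example in the text where those two cameras supply the face-on data.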
In another embodiment of the invention, system 200 is operated in a situation where several users share a single display screen. The invention provides an improved system in which directional indicators derived from the input audio precisely indicate the positions of the users sharing the display, so that during the video conference the cameras can be aimed at the conference participant who is currently speaking. This preferred embodiment is detailed below.

FIG. 4 is a block diagram of a video conferencing system 401 according to an embodiment of the invention, in which several users share one video conferencing screen. System 401 is similar to the system described above and allows users to share the same display screen. System 401 comprises user equipment 403 and user equipment 405. User equipment 403 and 405 represent groups of remote equipment connected to each other through communication network 303, which gives the system its video conference connection. For the purposes of sharing, equipment 403 and 405 may be assumed to be identical equipment able to support several users. This is not a requirement, however; it is enough that one set of equipment supports several users.

In this embodiment, equipment 403 has a single user 407, who operates a conference station using display 409. Display 409 may be a modified (enlarged) CRT display screen connected to the personal computer 412 described here. In one embodiment, display 409 may be a front-projection or rear-projection screen, or the screen of another form of network appliance on which the invention can be practiced. As noted above, several users may share and operate equipment 403; for ease of discussion, however, only one user 407 is depicted at equipment 403.

During the video conference, several external cameras 411 are provided to record the image of user 407. Cameras 411 are similar to cameras 201a-d of FIG. 3; in one embodiment cameras 411 can be repositioned mechanically rather than being fixed or adjusted by hand. As FIG. 4 shows, three cameras 411 are installed around display 409, much like cameras 201a-d of FIG. 2. The number of cameras may be increased or decreased; there may be more than three cameras or as few as two.

In this example, as in the prior art, user 407 uses a standard microphone (not shown) during the video conference to perform the audio functions. Microphone line 413 is removably connected to computer 412 through microphone port or jack 415. Microphone port 415 may handle several microphones at once if several microphone jacks are installed.

Equipment 405 includes several users 425a-b sharing display screen 423. The parameters and options discussed above in connection with screen 409 apply equally to screen 423 of equipment 405. Several external cameras 421 are provided and placed around the periphery of screen 423. Cameras 421 are similar to cameras 411 described above. As is known in the art, users 425a-b may use several standard microphones to perform the audio functions during the video conference, as described below. The microphone capability of the users is depicted by microphone lines 429d-f connected to microphone port or jack 427, each line representing the microphone capability of one user (the microphones themselves are not shown). All of the microphone lines 429d-f plug into a single jack 427 that has several microphone slots. The jack may also accommodate more microphones, or other microphone arrangements, without departing from the essence and scope of the invention.

In a simpler embodiment, every connected microphone is assigned a particular jack address by identifying its input at port 427. During the video conference, the individually addressed microphone kept by each of users 425a-c identifies the seating arrangement of users 425a-c in front of the cameras.

In one embodiment, software associates the users with the addresses of the connected microphones so that cameras 421 and 411 can be adjusted mechanically. For example, when a user 425b speaks into his or her microphone, that particular user is treated as the active one, and cameras 421 adjust toward that user's position. Whenever one of the users 425a-c speaks into a microphone, cameras 421 automatically turn their lenses toward that particular user; when another user speaks, the lenses shift automatically. Under this camera scheme, therefore, the users must speak in turn while the conference is in progress.

In practicing the invention according to the described embodiment, note that user 407 and users 425a-c are holding a video conference that may be an interview or something similar, in which user 407 is the interviewer and users 425a-c are the interviewees. Interviewer 407 may have three content windows (not shown) displayed on display 409, each window used to display the virtual image stream of a designated remote counterpart 425.

Because there is only one interviewer 407, cameras 411 need not rotate; they need only keep their focus on user 407. If user 407 were to use the microphone of another seat, he would have to move to that seat, because cameras 411 will shift toward that position as soon as he begins to speak. When interviewer 407 pauses to let one of the interviewed parties respond, a content window on screen 409 of equipment 403 begins to display the virtual image of the response. An indicator is transmitted ahead of the expected response, so the cameras move to the proper position before the response begins.

In one embodiment, when a microphone triggers camera capture, a certain camera 421 may be designated as the moving camera while the other cameras are assigned fixed positions. Many combinations are possible. Apart from the audio and camera command functions described here, system 401, which produces the virtual images, is no different from system 200 of FIG. 2 and FIG. 3. A particular video conferencing system may be equipped with several mechanically steerable cameras.

Suppose screen 409 displays three content windows, each showing the image stream of one of users 425a-c. In this example user 407 has three particular gaze targets, drawn as dashed arrows in the figure. Whenever one of users 425a-c is designated to speak, the virtual camera effect comes into operation so that the cameras focus on that user. For user 407, the resulting image stream appears as a face-on image stream, and corresponding streams are produced for the other two users, because all of cameras 421 focus on the currently active user. In different embodiments, particular cameras may be assigned different tasks. In this case, the image of user 407 appears in a content window on the side of users 425a-c. When user 407 speaks, even if his gaze begins to browse among the content windows for users 425a-c on screen 409, the cameras of equipment 403 can still keep user 407 face-on in that content window, unaffected by the shifts of his gaze. If an additional window appears, such as a text box or another text display, the user may activate that window, and the application of the virtual camera effect changes so that the camera behaves as though it were mounted directly behind the active text window. In this way the face-on direction of the virtual image stream is preserved.

In another embodiment, individually addressed microphones are not required to practice the invention. For example, instead of microphone jack connections, a directional sound sensor supplied with mono or stereo input may be used: when a new user begins to speak, cameras 421 turn toward that user. A drawback is that the user may need to pause briefly before beginning to respond, so that cameras 421 have enough time to turn to the proper position.

It will be apparent to those skilled in the art, without departing from the spirit and scope of the invention, that the virtual-image video conferencing system of the invention can not only be implemented as a new system but can also be integrated into an existing video conferencing system; for example, the special functions of the system can be added to an existing system through a software and hardware upgrade kit. The upgrade design and the system's operating platform should be compatible with the upgrade kit specifications the system requires.
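In the shared-screen embodiment just described, each microphone jack address stands for a seat, and the cameras swing toward whoever is speaking. The dispatch logic might look like the sketch below; the seat table, the aim_cameras_at routine and the SpeakerTracker wrapper are all hypothetical, since the patent only states that such an address-to-position mapping exists.

    # Hedged sketch of address-based camera steering (the simpler shared-screen scheme).
    # SEAT_BY_MIC_ADDRESS and aim_cameras_at() are illustrative stand-ins only.

    SEAT_BY_MIC_ADDRESS = {   # assumed mapping from jack address to seat position
        0: (-0.6, 0.0),       # user 425a, left seat (metres from screen centre)
        1: (0.0, 0.0),        # user 425b, centre seat
        2: (0.6, 0.0),        # user 425c, right seat
    }

    def aim_cameras_at(seat_position):
        # Placeholder: a real system would drive pan/tilt motors on cameras 421,
        # or re-weight the virtual-camera synthesis toward this seat.
        print(f"steering cameras toward seat at {seat_position}")

    class SpeakerTracker:
        """Re-aim the cameras whenever a different microphone becomes active."""

        def __init__(self):
            self.current_address = None

        def on_microphone_activity(self, active_mic_address):
            if active_mic_address != self.current_address:
                self.current_address = active_mic_address
                aim_cameras_at(SEAT_BY_MIC_ADDRESS[active_mic_address])

Because only one address can be active at a time, this dispatch also reproduces the constraint noted above that participants must speak in turn under this camera scheme.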

In another embodiment, by introducing suitable three-dimensional drawing software and hardware, the system can be given the ability to simulate three-dimensional space. Under such a system, a three-dimensional image of the remote user can be presented by means of the virtual image stream. For example, if the viewing user's position is offset at an angle from the remote counterpart's image window, the virtual image representing the remote counterpart can track the viewing user's movement so that the virtual image keeps eye contact with the viewing user. An embodiment of this complexity demands considerable processing power and is therefore not practical for ordinary users; nevertheless, it remains implementable.

The foregoing are, of course, only preferred embodiments of the invention and are not intended to limit its scope. Any modification made by those skilled in the art without departing from the spirit of the invention belongs to the invention, whose scope of protection is defined by the claims set out below.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a video conferencing station according to the prior art.

FIG. 2 is a block diagram of a video conferencing station according to an embodiment of the invention.

FIG. 3 is a block diagram of the interaction between the video conferencing station of FIG. 2 and the video code processor.

FIG. 4 is a block diagram of a video conferencing system according to an embodiment of the invention, in which several users share one video conferencing screen.

[Description of Reference Numerals]

100 video conferencing system; 101 external camera; 103 display; 105a-105c content windows; 200 video conferencing system; 201a-201d cameras; 202 screen; 205a-205c windows; 301 video code processing unit; 302 software; 303 Internet; 305 direction lines; 307 video output channel; 401 video conferencing system; 403 user equipment; 405 user equipment; 407 user; 409 display; 411 external cameras; 412 personal computer; 413 microphone line; 415 microphone port; 417 communication network; 419 personal computer; 421 cameras; 423 screen; 425a-425b users; 427 jack; 429d-429f microphone lines
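As a final illustration, the comparison against a previously entered face-on image that the detailed description uses to recover the tilt of the user's face about the x, y and z axes (and that claims 4 and 12 below recite) could be prototyped along the following lines. The landmark-based pose solve is a common substitute the patent does not prescribe, and the landmark detector is left abstract because the text does not name one.

    # Hedged sketch: estimate head orientation by matching detected facial landmarks
    # against the same landmarks in a stored face-on reference. The landmark model,
    # detect_landmarks() and the use of cv2.solvePnP are assumptions, not the patent's method.
    import numpy as np
    import cv2

    # Approximate 3D positions (mm) of six landmarks in the face-on reference pose:
    # nose tip, chin, left/right eye outer corners, left/right mouth corners.
    REFERENCE_FACE_3D = np.array([
        [0.0, 0.0, 0.0],
        [0.0, -63.6, -12.8],
        [-43.3, 32.7, -26.0],
        [43.3, 32.7, -26.0],
        [-28.9, -28.9, -24.1],
        [28.9, -28.9, -24.1],
    ], dtype=np.float64)

    def head_angles(frame, detect_landmarks, camera_matrix):
        """Return (pitch, yaw, roll) in degrees for the face seen in frame.

        detect_landmarks(frame) must return the six 2D image points that correspond
        to REFERENCE_FACE_3D, in the same order.
        """
        image_points = np.asarray(detect_landmarks(frame), dtype=np.float64)
        ok, rvec, _ = cv2.solvePnP(REFERENCE_FACE_3D, image_points,
                                   camera_matrix, np.zeros(4))
        if not ok:
            return None
        rotation, _ = cv2.Rodrigues(rvec)
        sy = np.hypot(rotation[0, 0], rotation[1, 0])
        pitch = np.degrees(np.arctan2(rotation[2, 1], rotation[2, 2]))
        yaw = np.degrees(np.arctan2(-rotation[2, 0], sy))
        roll = np.degrees(np.arctan2(rotation[1, 0], rotation[0, 0]))
        return pitch, yaw, roll

Running such an estimate per camera and per frame would give the gaze-direction prediction the description attributes to application software 302; how the per-camera angles are combined to locate the gazed-at point on the screen is not shown here.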


Claims (1)

X. Claims

1. A method for correcting the out-of-focus gaze of participant images in a video conference, in which each of the two parties to the video conference has a host connected through a network and a display for showing the other party's image, the method comprising: (a) disposing around the display at least two cameras electrically connected to the host, so that images of a conference participant are captured simultaneously; (b) having the host analyze and integrate the images captured by the cameras to produce a frontal image of the conference participant; and (c) having the host transmit the frontal image for display on the display of the other party to the video conference.

2. The method of claim 1, wherein at least one window is opened in each display, and when that window is an active window being watched by the conference participant, the host in step (b) further analyzes and integrates the images captured by the cameras together with the coordinate position of the active window, to produce the frontal image of the conference participant.

3. The method of claim 2, wherein the active window is the window that displays the image, text or dialogue of the conference participant with whom the other party to the video conference is interacting.

4. The method of claim 1, wherein in step (b) the host further analyzes and compares the conference participant's facial images captured by the cameras with a previously generated face-on facial image of the conference participant, so as to produce the participant's face-on facial image in real time.

5. The method of claim 1, wherein the display is shared by a plurality of conference participants, and step (a) further provides each conference participant with a microphone electrically connected to the host, each microphone having a particular address corresponding to that participant's position; when a conference participant speaks through the corresponding microphone, the host directs the cameras to adjust toward that participant's capture position so that images of that participant are captured simultaneously.

6. The method of claim 1, wherein the display is shared by a plurality of conference participants, and step (a) further provides each conference participant with a microphone, and a directional sound sensor electrically connected to the microphones and the cameras; when a conference participant speaks through the corresponding microphone, the directional sound sensor detects that participant's position, and the cameras are adjusted toward that participant's capture position so that images of that participant are captured simultaneously.

7. The method of claim 5 or claim 6, wherein the host, according to the number of the plurality of conference participants, opens on the display the same number of windows as there are conference participants, so as to display each participant's image individually.

8. The method of claim 1, wherein the host further applies a field application software to analyze and compute the images captured by the cameras, so as to produce the frontal image.

9. A video conferencing system for connecting with a remote video conferencing system to conduct a video conference, the system comprising: a display; at least two cameras disposed around the display for simultaneously capturing images of a conference participant; and a host electrically connected to the display and the cameras, which analyzes and integrates the images captured by the cameras to produce a frontal image of the conference participant and transmits it to the other party of the video conference.

10. The video conferencing system of claim 9, wherein at least one window is opened in the display, and when that window is an active window being watched by the conference participant, the host further analyzes and integrates the images captured by the cameras together with a coordinate position of the active window, to produce the frontal image of the conference participant.

11. The video conferencing system of claim 10, wherein the active window is the window that displays the image, text or dialogue of the conference participant with whom the other party to the video conference is interacting.

12. The video conferencing system of claim 9, wherein the host analyzes and compares the conference participant's facial images captured by the cameras with a previously generated face-on facial image of the conference participant, so as to produce the participant's face-on facial image in real time.

13. The video conferencing system of claim 9, wherein the display is shared by a plurality of conference participants, and the system further provides each conference participant with a microphone electrically connected to the host, each microphone having a particular address corresponding to that participant's position; when a conference participant speaks through the corresponding microphone, the host determines that participant's position from the microphone's particular address and directs the cameras to aim at that participant's capture position so that images of that participant are captured simultaneously.

14. The video conferencing system of claim 9, wherein the display is shared by a plurality of conference participants, and the system further provides each conference participant with a microphone, and a directional sound sensor electrically connected to the microphones and the cameras; when a conference participant speaks through the corresponding microphone, the directional sound sensor detects that participant's position, and the cameras are adjusted to aim at that participant's capture position so that images of that participant are captured simultaneously.

15. The video conferencing system of claim 13 or claim 14, wherein the host, corresponding to the plurality of conference participants of the other party, opens on the display at least the same number of windows as there are such participants, so as to display each participant's image individually.

16. The video conferencing system of claim 9, wherein the host further applies a field application software to analyze and compute the image data captured by the cameras, so as to produce the frontal image.

17. The video conferencing system of claim 16, wherein the host further includes a processing unit for processing the image data, and the field application software is executed by the processing unit.

18. The video conferencing system of claim 9, wherein the host is a personal computer.

19. The video conferencing system of claim 9, wherein the host is a network appliance applicable to video conferencing.

20. The video conferencing system of claim 19, wherein the network appliance is either a videophone or a network television.

XI. Drawings: drawing sheets for FIGS. 1-4 (not reproduced here).
TW90106574A 2001-03-21 2001-03-21 Method and system for correcting out-of-focus eyesight of attendant images in video conferencing TWI248021B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW90106574A TWI248021B (en) 2001-03-21 2001-03-21 Method and system for correcting out-of-focus eyesight of attendant images in video conferencing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW90106574A TWI248021B (en) 2001-03-21 2001-03-21 Method and system for correcting out-of-focus eyesight of attendant images in video conferencing

Publications (1)

Publication Number Publication Date
TWI248021B true TWI248021B (en) 2006-01-21

Family

ID=37400713

Family Applications (1)

Application Number Title Priority Date Filing Date
TW90106574A TWI248021B (en) 2001-03-21 2001-03-21 Method and system for correcting out-of-focus eyesight of attendant images in video conferencing

Country Status (1)

Country Link
TW (1) TWI248021B (en)


Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI384867B (en) * 2008-01-10 2013-02-01 Asustek Comp Inc Image rectification method and related device for a video device
US10623698B2 (en) 2018-03-28 2020-04-14 Beijing Funate Innovation Technology Co., Ltd. Video communication device and method for video communication
US10645340B2 (en) 2018-03-28 2020-05-05 Beijing Funate Innovation Technology Co., Ltd. Video communication device and method for video communication
US10701313B2 (en) 2018-03-28 2020-06-30 Beijing Funate Innovation Technology Co., Ltd. Video communication device and method for video communication
US10819948B2 (en) 2018-03-28 2020-10-27 Beijing Funate Innovation Technology Co., Ltd. Window system based on video communication
US10972699B2 (en) 2018-03-28 2021-04-06 Beijing Funate Innovation Technology Co., Ltd. Video communication device and method for video communication
US11006072B2 (en) 2018-03-28 2021-05-11 Beijing Funate Innovation Technology Co., Ltd. Window system based on video communication
TWI744558B (en) * 2018-03-28 2021-11-01 大陸商北京富納特創新科技有限公司 Window system based on video communication

Similar Documents

Publication Publication Date Title
EP3358835B1 (en) Improved method and system for video conferences with hmds
JP4057241B2 (en) Improved imaging system with virtual camera
US8289363B2 (en) Video conferencing
US6208373B1 (en) Method and apparatus for enabling a videoconferencing participant to appear focused on camera to corresponding users
US8237771B2 (en) Automated videography based communications
US8274544B2 (en) Automated videography systems
US8253770B2 (en) Residential video communication system
US8154578B2 (en) Multi-camera residential communication system
US8154583B2 (en) Eye gazing imaging for video communications
US8063929B2 (en) Managing scene transitions for video communication
US8159519B2 (en) Personal controls for personal video communications
US7856180B2 (en) Camera device
US20110216153A1 (en) Digital conferencing for mobile devices
EP1076446A2 (en) System for electronically-mediated collaboration including eye-contact collaboratory
JP2006333301A (en) Video communication apparatus
CN111163280B (en) Asymmetric video conference system and method thereof
TWI248021B (en) Method and system for correcting out-of-focus eyesight of attendant images in video conferencing
WO2022262134A1 (en) Image display method, apparatus and device, and storage medium
EP4106326A1 (en) Multi-camera automatic framing
EP4075794A1 (en) Region of interest based adjustment of camera parameters in a teleconferencing environment
US20220264156A1 (en) Context dependent focus in a video feed
EP2355499A1 (en) Video conferencing method, a related system, a related client device and a related network element
JPH09107534A (en) Video conference equipment and video conference system
De Silva et al. A Multiple person eye contact (MPEC) teleconferencing system
JP2005110045A (en) Information processor and method thereof

Legal Events

Date Code Title Description
MM4A Annulment or lapse of patent due to non-payment of fees