TW202316239A - Frame extrapolation with application generated motion vector and depth - Google Patents


Info

Publication number
TW202316239A
Authority
TW
Taiwan
Prior art keywords
frame
future
dimensional position
depth map
past
Application number
TW111132755A
Other languages
Chinese (zh)
Inventor
大衛 詹姆斯 伯雷
希安 魏
馬修 羅伯特 富爾格姆
尼爾 貝德克
張簡
Original Assignee
Meta Platforms Technologies, LLC
Priority claimed from US17/590,682 (US11783533B2)
Application filed by Meta Platforms Technologies, LLC
Publication of TW202316239A


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/597Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/012Head tracking input arrangements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/10Processing, recording or transmission of stereoscopic or multi-view image signals
    • H04N13/106Processing image signals
    • H04N13/111Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation
    • H04N13/117Transformation of image signals corresponding to virtual viewpoints, e.g. spatial image interpolation the virtual viewpoint locations being selected by the viewers or determined by viewer tracking
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/50Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N19/503Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
    • H04N19/51Motion estimation or motion compensation
    • H04N19/537Motion estimation other than block-based
    • H04N19/54Motion estimation other than block-based using feature points or meshes
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30Image reproducers
    • H04N13/332Displays for viewing with the aid of special glasses or head-mounted displays [HMD]
    • H04N13/344Displays for viewing with the aid of special glasses or head-mounted displays [HMD] with head-mounted left-right displays

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Processing Or Creating Images (AREA)
  • Image Generation (AREA)

Abstract

In one embodiment, a method includes receiving a rendered image, motion vector data, and a depth map corresponding to a current frame of a video stream generated by an application, calculating a current three-dimensional position corresponding to the current frame of an object presented in the rendered image using the depth map, calculating a past three-dimensional position of the object corresponding to a past frame using the motion vector data and the depth map, estimating a future three-dimensional position of the object corresponding to a future frame based on the past three-dimensional position and the current three-dimensional position of the object, and generating an extrapolated image corresponding to the future frame by reprojecting the object presented in the rendered image to a future viewpoint associated with the future frame using the future three-dimensional position of the object.

Description

Frame extrapolation with application-generated motion vectors and depth

This disclosure generally relates to artificial reality systems and, in particular, to extrapolating frames.

This application claims the benefit under 35 U.S.C. § 119(e) of U.S. Provisional Patent Application No. 63/254,476, filed October 11, 2021, and U.S. Non-Provisional Patent Application No. 17/590,682, filed February 1, 2022, which are incorporated herein by reference.

Artificial reality is a form of reality that has been adjusted in some manner before presentation to a user, which may include, for example, a virtual reality (VR), an augmented reality (AR), a mixed reality (MR), a hybrid reality, or some combination and/or derivatives thereof. Artificial reality content may include completely generated content or generated content combined with captured (e.g., real-world) content. The artificial reality content may include video, audio, haptic feedback, or some combination thereof, and any of these may be presented in a single channel or in multiple channels (such as stereoscopic video that produces a three-dimensional effect to the viewer). Artificial reality may be associated with, for example, applications, products, accessories, services, or some combination thereof that are used to create content in an artificial reality and/or are used in an artificial reality (e.g., to perform activities in an artificial reality). The artificial reality system that provides the artificial reality content may be implemented on various platforms, including a head-mounted display (HMD) connected to a host computer system, a standalone HMD, a mobile device or computing system, or any other hardware platform capable of providing artificial reality content to one or more viewers.

Particular embodiments described herein relate to systems and methods for generating high-quality frame extrapolation and reprojection by using application-generated motion vectors and depth maps. To provide users with a comfortable artificial reality experience, high-resolution frames need to be presented at a high frame rate. However, due to the limited computing power of the hardware, this is a significant challenge for mobile HMDs. Traditional time-warp solutions for artificial reality systems have several limitations: they correct only for rotational but not translational movement, and they do not address animation judder. A traditional time-warp solution simply rotates the two-dimensional RGB image to accommodate the user's new viewpoint. Traditional frame-extrapolation solutions rely on lower-quality motion vectors because the motion vectors are estimated from two-dimensional images. The novel frame-extrapolation solution disclosed herein may account for translational movement as well as rotational movement, and may utilize motion vectors and depth information generated by the application based on the rendered objects.

In particular embodiments, a computing system associated with a wearable device may receive a rendered image, motion vector data, and a depth map corresponding to a current frame generated by an application. The motion vector data and the depth map may be generated based on the three-dimensional objects rendered by the application. The motion vectors in the motion vector data may be three-dimensional. The computing system may process the received motion vector data and depth map such that regions corresponding to the foreground of the rendered image are expanded. The computing system may use the depth map to calculate a current three-dimensional position, corresponding to the current frame, of an object presented in the rendered image. To calculate the current three-dimensional position of the object, the computing system may back-project the depth map onto three-dimensional space from a current viewpoint associated with the current frame. The current viewpoint may be associated with a position and orientation of the wearable device at a time instant at which the current frame is rendered. The computing system may use the motion vector data and the depth map to calculate a past three-dimensional position of the object corresponding to a past frame. To calculate the past three-dimensional position of the object, the computing system may generate an estimated depth map corresponding to the past frame by subtracting the motion vectors from the depth map. The computing system may back-project the estimated depth map onto three-dimensional space from a past viewpoint associated with the past frame. The computing system may estimate a future three-dimensional position of the object corresponding to a future frame based on the past three-dimensional position and the current three-dimensional position of the object. Estimating the future three-dimensional position of the object may be based on an assumption that the object moves at a constant velocity from the time instant corresponding to the past frame to the time instant corresponding to the future frame. The computing system may perform a linear interpolation to estimate the future three-dimensional position of the object. After estimating the future three-dimensional position of the object, the computing system may generate a distortion mesh by projecting the estimated future three-dimensional position of the object onto a future viewpoint. The computing system may generate an extrapolated image corresponding to the future frame by reprojecting the object presented in the rendered image to the future viewpoint associated with the future frame using the future three-dimensional position of the object. To generate the extrapolated image, the computing system may apply the distortion mesh to the rendered image.

The embodiments disclosed herein are only examples, and the scope of this disclosure is not limited to them. Particular embodiments may include all, some, or none of the components, elements, features, functions, operations, or steps of the embodiments disclosed above. Embodiments according to the invention are in particular disclosed in the attached claims directed to a method, a storage medium, a system, and a computer program product, wherein any feature mentioned in one claim category, e.g., method, can be claimed in another claim category, e.g., system, as well. The dependencies or references back in the attached claims are chosen for formal reasons only. However, any subject matter resulting from a deliberate reference back to any previous claims (in particular multiple dependencies) can be claimed as well, so that any combination of claims and the features thereof are disclosed and can be claimed regardless of the dependencies chosen in the attached claims. The subject matter which can be claimed comprises not only the combinations of features as set out in the attached claims but also any other combination of features in the claims, wherein each feature mentioned in the claims can be combined with any other feature or combination of other features in the claims. Furthermore, any of the embodiments and features described or depicted herein can be claimed in a separate claim and/or in any combination with any embodiment or feature described or depicted herein or with any of the features of the attached claims.

1A說明實例人工實境系統100A。在特定具體實例中,人工實境系統100A可包含頭戴組104、控制器106及運算系統108。使用者102可配戴頭戴組104,該頭戴組可將視覺人工實境內容顯示給使用者102。頭戴組104可包括音訊裝置,其可將音訊人工實境內容提供至使用者102。頭戴組104可包括一或多個攝影機,其可捕獲環境之影像及視訊。頭戴組104可包括眼動追蹤系統以判定使用者102之輻輳距離。頭戴組104可包括麥克風以自使用者102捕獲語音輸入。頭戴組104可稱為頭戴式顯示器(HMD)。控制器106可包含軌跡墊及一或多個按鈕。控制器106可自使用者102接收輸入,且將輸入中繼至運算系統108。控制器106亦可將觸覺反饋提供至使用者102。運算系統108可經由纜線或無線連接而連接至頭戴組104及控制器106。運算系統108可控制頭戴組104及控制器106,以將人工實境內容提供至使用者102及自該使用者102接收輸入。運算系統108可為獨立式主機運算系統,與頭戴組104整合之機載運算系統、移動裝置,或能夠將人工實境內容提供至使用者102及自該使用者接收輸入之任何其他硬體平台。 FIG. 1A illustrates an example artificial reality system 100A. In a specific example, the artificial reality system 100A may include a headset 104 , a controller 106 and a computing system 108 . The user 102 can wear a headset 104 that can display visual artificial reality content to the user 102 . The headset 104 can include an audio device that can provide audio artificial reality content to the user 102 . Headset 104 may include one or more cameras that capture images and video of the environment. The headset 104 may include an eye tracking system to determine the convergence distance of the user 102 . Headset 104 may include a microphone to capture voice input from user 102 . Headset 104 may be referred to as a head-mounted display (HMD). Controller 106 may include a trackpad and one or more buttons. Controller 106 can receive input from user 102 and relay the input to computing system 108 . The controller 106 can also provide tactile feedback to the user 102 . The computing system 108 can be connected to the headset 104 and the controller 106 via a cable or a wireless connection. The computing system 108 can control the headset 104 and the controller 106 to provide artificial reality content to and receive input from the user 102 . Computing system 108 may be a stand-alone mainframe computing system, an onboard computing system integrated with headset 104, a mobile device, or any other hardware capable of providing artificial reality content to and receiving input from user 102 platform.

1B說明實例擴增實境系統100B。擴增實境系統100B可包括頭戴式顯示器(HMD)110(例如眼鏡),其包含框架112、一或多個顯示器114及運算系統108。顯示器114可為透明或半透明的,允許穿戴HMD 110之使用者透過顯示器114看到真實世界,且同時將視覺人工實境內容顯示至使用者。HMD 110可包括可將音訊人工實境內容提供至使用者之音訊裝置。HMD 110可包括一或多個攝影機,其可捕獲環境之影像及視訊。HMD 110可包括眼動追蹤系統以追蹤穿戴HMD 110之使用者的輻輳移動。HMD 110可包括麥克風以自使用者捕獲語音輸入。擴增實境系統100B可進一步包括控制器,其包含軌跡墊及一或多個按鈕。控制器可自使用者接收輸入,且將輸入中繼至運算系統108。控制器亦可將觸覺反饋提供至使用者。運算系統108可經由電纜或無線連接而連接至HMD 110及控制器。運算系統108可控制HMD 110及控制器,以將擴增實境內容提供至使用者且自使用者接收輸入。運算系統108可為獨立式主機電腦裝置,與HMD 110整合之機載電腦裝置、移動裝置,或能夠將人工實境內容提供至使用者及自該使用者接收輸入之任何其他硬體平台。 FIG. 1B illustrates an example augmented reality system 100B. The augmented reality system 100B may include a head-mounted display (HMD) 110 (eg, glasses) including a frame 112 , one or more displays 114 , and a computing system 108 . The display 114 can be transparent or translucent, allowing the user wearing the HMD 110 to see the real world through the display 114 while simultaneously displaying visual artificial reality content to the user. HMD 110 may include an audio device that may provide audio artificial reality content to a user. HMD 110 may include one or more cameras that can capture images and video of the environment. The HMD 110 may include an eye-tracking system to track the vergent movements of a user wearing the HMD 110 . HMD 110 may include a microphone to capture voice input from the user. The augmented reality system 100B may further include a controller including a track pad and one or more buttons. The controller can receive input from a user and relay the input to the computing system 108 . The controller can also provide tactile feedback to the user. Computing system 108 may be connected to HMD 110 and the controller via a cable or wireless connection. The computing system 108 can control the HMD 110 and the controller to provide augmented reality content to the user and to receive input from the user. Computing system 108 may be a stand-alone host computer device, an on-board computer device integrated with HMD 110, a mobile device, or any other hardware platform capable of providing artificial reality content to and receiving input from the user.

2說明使用應用程式所產生之移動向量及深度資訊進行之圖框外插及再投影之實例概述。在圖2中所說明之實例中,應用程式210以每秒36個圖框(frames per second;FPS)呈現視訊圖框的影像。因此,應用程式210針對圖框N及N+2呈現影像211。應用程式亦產生對應於各影像211之移動向量及深度資訊213。藉由使用本文中所提議之圖框外插解決方案,運算系統108之作業系統220以72 FPS將圖框展現至使用者。作業系統220將被應用程式210呈現、針對圖框N及圖框N+2 221之影像211,展現至與運算系統108相關聯的顯示器。基於所呈現影像211、及隨著所呈現影像211產生之移動向量及深度資訊213,作業系統220產生針對圖框N+1及N+3之影像223。作業系統220將針對圖框N+1及N+3之所產生影像223展現至顯示器。儘管使用移動向量及深度資訊213進行高品質圖框外插可為重要的,但由取樣一半引起之延遲常常可對使用者體驗施加顯著影響。當延遲保持較高時,使用者可在旋轉頭戴組104時經歷顯著拉黑(black pulling)或在移動其控制器時經歷顯著滯後。本文中提議之數個技術以減小潛在延遲:(1)作業系統220可延遲圖框之開始,以減小在應用程式210結束呈現影像時之時間瞬間與影像在顯示器上消耗之時間瞬間之間的間隔。(2)作業系統220可再提取頭戴組104及控制器106之姿勢,以填充呈現影像與將所呈現影像展現至顯示器之間的時間間隙。(3)作業系統220可在進行時間扭曲之前立刻再取樣攝影機姿態,且基於自攝影機旋轉及攝影機平移兩者考慮而再投影像素。此技術稱作位置時間扭曲(positional time warp;PTW)。應用程式所產生之深度圖可用於該PTW。在特定具體實例中,在不使用圖框外插之情況下,具有PTW之頭部姿態延遲可甚至低於等效全圖框速率應用中之頭部姿態延遲。在特定具體實例中,特別負責圖框外插之運行時間系統可替代作業系統220起作用。儘管本發明描述以特定速率進行圖框外插,但本發明涵蓋以任何適合速率進行圖框外插。 Figure 2 illustrates an example overview of frame extrapolation and reprojection using motion vector and depth information generated by an application. In the example illustrated in FIG. 2 , the application 210 renders images of video frames at 36 frames per second (FPS). Therefore, application 210 presents image 211 for frames N and N+2. The application also generates motion vectors and depth information 213 corresponding to each image 211 . By using the frame extrapolation solution proposed herein, the operating system 220 of the computing system 108 presents the frame to the user at 72 FPS. The operating system 220 presents the image 211 presented by the application 210 for frame N and frame N+2 221 to a display associated with the operating system 108 . Based on the rendered image 211 , and the motion vector and depth information 213 generated along with the rendered image 211 , the operating system 220 generates an image 223 for frames N+1 and N+3. The operating system 220 presents the generated images 223 for frames N+1 and N+3 to the display. Although high-quality frame extrapolation using motion vectors and depth information 213 can be important, the delay caused by sampling half can often exert a significant impact on user experience. When latency is kept high, the user may experience significant black pulling when rotating the headset 104 or significant lag when moving their controls. Several techniques are proposed herein to reduce potential delays: (1) The operating system 220 can delay the start of the frame to reduce the time instant when the application 210 finishes rendering the image and the time instant the image spends on the display interval between. (2) The operating system 220 can then extract the poses of the headset 104 and the controller 106 to fill the time gap between presenting the image and displaying the presented image to the display. (3) The operating system 220 can resample the camera pose immediately before time warping, and reproject pixels based on both self-camera rotation and camera translation considerations. This technique is called positional time warping (PTW). The depth map generated by the application can be used for this PTW. In certain embodiments, without using frame extrapolation, the head pose latency with PTW can be even lower than that in an equivalent full frame rate application. In certain embodiments, a runtime system specifically responsible for frame extrapolation may function in place of the operating system 220 . Although this disclosure describes frame extrapolation at a particular rate, this disclosure contemplates frame extrapolation at any suitable rate.

In particular embodiments, the operating system 220 of the computing system 108 associated with the HMD 110 may receive a rendered image, motion vector data, and a depth map corresponding to a current frame generated by the application 210. In particular embodiments, the computing system 108 may comprise a runtime system specifically responsible for frame extrapolation. In such cases, the runtime system may replace the operating system 220 for the procedures disclosed herein. Unlike previous approaches that estimate motion vectors based on comparisons between two-dimensional images, the motion vector data and the depth map may be generated based on the three-dimensional objects rendered by the application. The motion vectors in the motion vector data may be three-dimensional. The motion vector data may be generated using a motion blur technique, a temporal anti-aliasing technique, or any suitable technique. Because the depth buffer is used for the motion vector computations anyway, the overhead of generating the depth map may be small. In particular embodiments, the operating system 220 of the computing system 108 may process the received motion vector data and depth map such that regions corresponding to the foreground of the rendered image are expanded. Although this disclosure describes receiving a rendered image, motion vector data, and a depth map in a particular manner, this disclosure contemplates receiving a rendered image, motion vector data, and a depth map in any suitable manner.

In particular embodiments, the operating system 220 of the computing system 108 may use the depth map to calculate a current three-dimensional position, corresponding to the current frame, of an object presented in the rendered image. To calculate the current three-dimensional position of the object, the operating system 220 of the computing system 108 may back-project the depth map onto three-dimensional space from a current viewpoint associated with the current frame. The current viewpoint may be associated with a position and orientation of the wearable device at a time instant at which the current frame is rendered. FIG. 3 illustrates an example data flow for frame extrapolation. As an example and not by way of limitation, as illustrated in FIG. 3, the operating system 220 of the computing system 108 may access a UV depth map 311 associated with the rendered image corresponding to frame N. The UV depth map 311 may be a mapping of depth information over two-dimensional screen positions. In particular embodiments, the UV depth map 311 may be a two-dimensional map over UV coordinates. By back-projecting the UV depth map 311 onto three-dimensional space from the viewpoint associated with frame N, the operating system 220 of the computing system 108 may calculate three-dimensional positions 314 of objects in the rendered image corresponding to frame N. The back-projection may be performed by applying the inverse of a view-projection matrix 313 corresponding to frame N to the UV depth map 311. Although this disclosure describes calculating the current three-dimensional position of an object presented in the rendered image in a particular manner, this disclosure contemplates calculating the current three-dimensional position of an object presented in the rendered image in any suitable manner.
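
As a concrete illustration of the back-projection step above, the following is a minimal numpy sketch. It assumes depth values stored in normalized device coordinates and a particular UV orientation; the helper names (pixel_grid_uv, back_project) and the placeholder inputs (depth_n, view_proj_n) are illustrative assumptions, not the actual implementation of this disclosure.

    import numpy as np

    def pixel_grid_uv(h, w):
        # UV coordinates of each pixel center, in [0, 1].
        u, v = np.meshgrid((np.arange(w) + 0.5) / w, (np.arange(h) + 0.5) / h)
        return u, v

    def back_project(uvd, inv_view_proj):
        # Back-project (u, v, depth) samples into world space.
        # uvd: (..., 3) array; inv_view_proj: (4, 4) inverse view-projection matrix.
        u, v, d = uvd[..., 0], uvd[..., 1], uvd[..., 2]
        clip = np.stack([2 * u - 1, 1 - 2 * v, d, np.ones_like(d)], axis=-1)
        world = clip @ inv_view_proj.T
        return world[..., :3] / world[..., 3:4]  # perspective divide

    # Positions 314: back-project the UV depth map 311 from the frame-N viewpoint.
    # u, v = pixel_grid_uv(*depth_n.shape)
    # uvd_n = np.stack([u, v, depth_n], axis=-1)
    # pos_n = back_project(uvd_n, np.linalg.inv(view_proj_n))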

In particular embodiments, the operating system 220 of the computing system 108 may use the motion vector data and the depth map to calculate a past three-dimensional position of the object corresponding to a past frame. To calculate the past three-dimensional position of the object, the operating system 220 of the computing system 108 may generate an estimated depth map corresponding to the past frame by subtracting the motion vectors from the depth map. The operating system 220 of the computing system 108 may back-project the estimated depth map onto three-dimensional space from a past viewpoint associated with the past frame. As an example and not by way of limitation, continuing the prior example illustrated in FIG. 3, the operating system 220 of the computing system 108 may estimate a depth map 321 corresponding to frame N-1 by subtracting motion vectors 312 corresponding to frame N from the UV depth map 311 corresponding to frame N. By back-projecting the estimated UV depth map 321 corresponding to frame N-1 onto three-dimensional space from a viewpoint associated with frame N-1, the operating system 220 of the computing system 108 may calculate three-dimensional positions 324 of objects corresponding to frame N-1. The back-projection may be performed by applying the inverse of a view-projection matrix 323 corresponding to frame N-1 to the estimated UV depth map 321 corresponding to frame N-1. Although this disclosure describes calculating the past three-dimensional position of an object corresponding to a past frame in a particular manner, this disclosure contemplates calculating the past three-dimensional position of an object corresponding to a past frame in any suitable manner.
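
Continuing the sketch above, the past positions follow the same pattern, under the additional assumption that the motion vectors are stored per pixel as (u, v, depth) displacements from frame N-1 to frame N; the names are again illustrative.

    # Reuses back_project and numpy from the previous sketch.
    def past_positions(uvd_n, motion_n, view_proj_prev):
        # Estimated UV depth map 321 for frame N-1: subtract the motion vectors.
        uvd_prev = uvd_n - motion_n
        # Positions 324: back-project from the frame N-1 viewpoint.
        return back_project(uvd_prev, np.linalg.inv(view_proj_prev))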

In particular embodiments, the operating system 220 of the computing system 108 may estimate a future three-dimensional position of the object corresponding to a future frame based on the past three-dimensional position and the current three-dimensional position of the object. Estimating a future three-dimensional position of an object based on its past and current three-dimensional positions may be referred to as space warp. Estimating the future three-dimensional position of the object may be based on an assumption that the object moves at a constant velocity from the time instant corresponding to the past frame to the time instant corresponding to the future frame. The operating system 220 of the computing system 108 may perform a linear interpolation to estimate the future three-dimensional position of the object. As an example and not by way of limitation, continuing the prior example illustrated in FIG. 3, the operating system 220 of the computing system 108 may estimate three-dimensional positions 334 of objects based on the calculated three-dimensional positions 314 of objects corresponding to frame N and the estimated three-dimensional positions 324 of objects corresponding to frame N-1. Although this disclosure describes estimating the future three-dimensional position of an object corresponding to a future frame based on the past and current three-dimensional positions of the object in a particular manner, this disclosure contemplates estimating the future three-dimensional position of an object corresponding to a future frame based on the past and current three-dimensional positions of the object in any suitable manner.

4說明基於物件之過去位置及當前位置而對物件之未來位置的實例估計。在圖4中所說明之具體實例中,物件位於時間t1處之三維位置x1處,且位於時間t2處之三維位置x2處。運算系統108之作業系統220可藉由進行線性內插來估計時間t3處之三維位置x3,其中x3=Ler(x1, x2, (t3-t1)/(t2-t1))。儘管本發明描述以特定方式基於先前位置而進行線性內插以預測物件之三維位置,但本發明涵蓋以任何適合方式基於先前位置而進行線性內插以預測物件的三維位置。 4 illustrates an example estimate of a future location of an object based on its past location and current location . In the specific example illustrated in FIG. 4, the object is located at a three-dimensional position x1 at time t1, and is located at a three-dimensional position x2 at time t2. The operating system 220 of the computing system 108 can estimate the three-dimensional position x3 at the time t3 by performing linear interpolation, where x3=Ler(x1, x2, (t3-t1)/(t2-t1)). Although this disclosure describes linearly interpolating to predict a three-dimensional position of an object based on previous positions in a particular manner, this disclosure contemplates performing linear interpolation based on previous positions in any suitable manner to predict the three-dimensional position of an object.

In particular embodiments, the operating system 220 of the computing system 108 may generate a distortion mesh by reprojecting the estimated future three-dimensional position of the object onto a future viewpoint. As an example and not by way of limitation, continuing the prior example illustrated in FIG. 3, the operating system 220 of the computing system 108 may generate a distortion mesh 337 by reprojecting the estimated three-dimensional positions 334 of objects corresponding to frame N+1 onto a viewpoint corresponding to frame N+1. The reprojection may be performed by applying a view-projection matrix 335 corresponding to frame N+1 to the estimated three-dimensional positions 334 of objects. The view-projection matrix 335 may be acquired by re-fetching the pose of the headset 104. Although this disclosure describes generating a distortion mesh by reprojecting the estimated future three-dimensional position of an object onto a future viewpoint in a particular manner, this disclosure contemplates generating a distortion mesh by reprojecting the estimated future three-dimensional position of an object onto a future viewpoint in any suitable manner.
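
The reprojection is the forward direction of the earlier back-projection. The sketch below projects the estimated positions 334 through an assumed view-projection matrix for frame N+1 to obtain the target UVs that make up the distortion mesh 337; a practical implementation would likely evaluate this at the vertices of a coarse grid rather than per pixel, which is an assumption here.

    import numpy as np

    def project(world_pos, view_proj):
        # Project world-space points to UV coordinates, using the same UV
        # conventions as back_project in the earlier sketch.
        p = np.concatenate([world_pos, np.ones_like(world_pos[..., :1])], axis=-1)
        clip = p @ view_proj.T
        ndc = clip[..., :3] / clip[..., 3:4]  # perspective divide
        return np.stack([(ndc[..., 0] + 1) / 2, (1 - ndc[..., 1]) / 2], axis=-1)

    # Distortion mesh 337: where each source sample should land at frame N+1.
    # target_uv = project(pos_future, view_proj_future)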

In particular embodiments, the operating system 220 of the computing system 108 may generate an extrapolated image corresponding to the future frame by reprojecting the object presented in the rendered image to a future viewpoint associated with the future frame using the future three-dimensional position of the object. To generate the extrapolated image, the operating system 220 of the computing system 108 may apply the distortion mesh to the rendered image. As an example and not by way of limitation, continuing the prior example illustrated in FIG. 3, the operating system 220 of the computing system 108 may generate an image corresponding to frame N+1 (not shown) by applying the distortion mesh 337 to the rendered image corresponding to frame N (not shown). The operating system 220 of the computing system 108 may present the generated image corresponding to frame N+1 to a display associated with the headset 104. Although this disclosure describes generating an extrapolated image corresponding to a future frame in a particular manner, this disclosure contemplates generating an extrapolated image corresponding to a future frame in any suitable manner.
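
Applying the distortion mesh would typically happen on the GPU by rasterizing a coarse grid whose vertices are moved to the target UVs, with the frame-N image bound as a texture. The nearest-neighbor forward splat below is only a crude CPU stand-in for that rasterization, with no depth ordering or hole filling.

    import numpy as np

    def apply_distortion(image, target_uv):
        # Crude forward warp: splat each source pixel to its target location.
        h, w = image.shape[:2]
        out = np.zeros_like(image)
        tx = np.clip((target_uv[..., 0] * w).astype(int), 0, w - 1)
        ty = np.clip((target_uv[..., 1] * h).astype(int), 0, h - 1)
        out[ty, tx] = image  # later writes win; a GPU resolves this properly
        return out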

5說明全圖框呈現應用程式與半圖框呈現應用程式之間的每圖框時間預算之實例比較。在圖5中,(a)說明以72 FPS呈現影像之應用程式,而(b)說明以36 FPS呈現影像之應用程式。對於(b)中之應用程式,可使用本文中所揭示之發明每秒外插額外36個圖框。對於(a)中之應用程式,每圖框總預算可為13.9ms,其可需要在應用程式210與作業系統220之間拆分。對於每個圖框,作業系統220可進行組成工作以將所呈現影像推送至後端中之螢幕上。由於應用程式210及作業系統220可共用同一圖形處理單元(Graphics Processing Unit;GPU),因此若作業系統220每vsync佔用1.3ms,則應用程式210可具有12.6ms以使用。同時,在應用程式以36 FPS呈現時,(b)中之應用程式可具有27.8ms每圖框。由於圖框外插,作業系統220可消耗更多時間用於vsync,例如如圖5中所說明之1.8ms。此外,(b)中之應用程式可耗費額外時間以用於產生移動向量,例如如圖5中所說明之2.5ms。(b)中之應用程式可具有21.7ms之GPU時間每圖框,其比(a)中之應用程式的預算大71%。 5 illustrates an example comparison of time budget per frame between a full frame rendering application and a half frame rendering application. In FIG. 5 , (a) illustrates an application that renders images at 72 FPS, and (b) illustrates an application that renders images at 36 FPS. For the application in (b), an additional 36 frames per second can be extrapolated using the invention disclosed herein. For the application in (a), the total budget per frame may be 13.9 ms, which may need to be split between the application 210 and the operating system 220 . For each frame, the operating system 220 may perform composition work to push the rendered image to the screen in the backend. Since the application program 210 and the operating system 220 can share the same Graphics Processing Unit (GPU), if the operating system 220 occupies 1.3 ms per vsync, the application program 210 can have 12.6 ms to use. Meanwhile, the application in (b) may have 27.8ms per frame when the application is rendered at 36 FPS. Due to frame extrapolation, the operating system 220 may consume more time for vsync, eg 1.8ms as illustrated in FIG. 5 . Furthermore, the application in (b) may take additional time for generating motion vectors, eg 2.5 ms as illustrated in FIG. 5 . The application in (b) may have 21.7 ms of GPU time per frame, which is 71% larger than the budget of the application in (a).

An application may need to render transparent objects. For example, an application may render a transparent object moving to the left on top of an opaque object moving to the right. For a pixel covering both objects, the motion vector is ambiguous because the pixel moves in two directions at once. In practice, however, the problem may be less significant. When a transparent surface is far from the camera, its projected motion between frames can be tiny. Furthermore, for particle effects, a larger use case for transparency, a little motion jitter may not be noticeable, because such effects often appear together with fast animations such as explosions. A problematic case for frame extrapolation and reprojection is a near-field, fast-moving object that is transparent. An example of a near-field, fast-moving object would be the controller 106. Thus, the object associated with the controller 106 and any child objects of the controller 106 may need to be opaque.

Frame extrapolation and reprojection may cause a certain degree of image distortion, especially on the background. When the background has a richly textured pattern, the distortion may not be noticeable. However, when objects move over a clean background, the distortion caused by frame extrapolation and reprojection may be noticeable to the user. Specific considerations are needed to make the background more friendly to frame extrapolation.

When an object rotates fast, frame extrapolation and reprojection may cause distortion artifacts in the pixels around the object. Imagine a cube rotating at about 100 revolutions per second. The orientation of the cube from one frame to the next may appear more or less random, because the motion vectors may not be constructed accurately. To mitigate this problem, when the application detects a high-speed rotation during the motion vector generation stage, the application may disable the portion of the motion vectors associated with the object's rotation.

6說明用於基於應用程式所產生之移動向量及深度圖而外插圖框的實例方法600。方法可在步驟610處開始,其中運算系統108之作業系統可接收對應於由應用程式產生之當前圖框的所呈現影像、移動向量資料及深度圖。在步驟620處,運算系統108之作業系統可使用深度圖,計算對應於展現於所呈現影像中之物件的當前圖框之當前三維位置。在步驟630處,運算系統108之作業系統可使用移動向量資料及深度圖,計算對應於過去圖框之物件的過去三維位置。在步驟640,運算系統108之作業系統可基於物件之過去三維位置及當前三維位置,而估計對應於未來圖框之物件的未來三維位置。在步驟650處,運算系統108之作業系統可藉由使用物件之未來三維位置而將展現於所呈現影像中之物件再投影至與未來圖框相關聯之未來視點,來產生對應於未來圖框的外插影像。在適當的情況下,特定具體實例可重複圖6之方法之一或多個步驟。儘管本發明將圖6之方法的特定步驟描述及說明為按特定次序發生,但本發明涵蓋圖6之方法的任何適合步驟按任何適合次序發生。此外,儘管本發明描述及說明用於基於應用程式所產生之移動向量及深度圖而外插圖框之實例方法,該實例方法包括圖6之方法的特定步驟,但本發明涵蓋用於基於應用程式所產生之移動向量及深度圖而外插圖框之任何適合方法,該任何適合方法包括任何適合步驟,在適當的情況下,該些步驟可包括圖6之方法之步驟中之所有、一些或中無一者。此外,儘管本發明描述及說明實行圖6之方法的特定步驟的特定組件、裝置或系統,但本發明涵蓋實行圖6之方法的任何適合步驟之任何適合組件、裝置或系統之任何適合組合。 6 illustrates an example method 600 for outlining an inset based on motion vectors and depth maps generated by an application. The method may begin at step 610, where the operating system of the computing system 108 may receive a rendered image, motion vector data, and depth map corresponding to the current frame generated by the application. At step 620, the operating system of the computing system 108 may use the depth map to calculate the current three-dimensional position corresponding to the current frame of the object represented in the rendered image. At step 630, the operating system of the computing system 108 may calculate the past three-dimensional position of the object corresponding to the past frame using the motion vector data and the depth map. In step 640, the operating system of the computing system 108 may estimate the future three-dimensional position of the object corresponding to the future frame based on the past three-dimensional position and the current three-dimensional position of the object. At step 650, the operating system of the computing system 108 may generate a corresponding future frame by reprojecting the object represented in the rendered image to a future viewpoint associated with the future frame using the future three-dimensional position of the object. extrapolated image. Certain embodiments may repeat one or more steps of the method of FIG. 6 where appropriate. Although this disclosure describes and illustrates certain steps of the method of FIG. 6 as occurring in a particular order, this disclosure contemplates that any suitable steps of the method of FIG. 6 occur in any suitable order. Furthermore, although this disclosure describes and illustrates an example method for outlining an inset frame from motion vectors and depth maps generated by an application, which includes certain steps of the method of FIG. Any suitable method of producing motion vectors and depth maps out of inset frames, including any suitable steps, which may include, where appropriate, all, some, or all of the steps of the method of FIG. 6 None. Furthermore, although this disclosure describes and illustrates particular components, devices or systems for carrying out particular steps of the method of FIG. 6 , this disclosure contemplates any suitable combination of any suitable components, devices or systems for carrying out any suitable steps of the method of FIG. 6 .

Systems and Methods

7說明實例電腦系統700。在特定具體實例中,一或多個電腦系統700進行本文所描述或說明之一或多種方法之一或多個步驟。在特定具體實例中,一或多個電腦系統700提供本文中描述或說明之功能。在特定具體實例中,在一或多個電腦系統700上運行之軟體,進行本文中描述或說明之一或多種方法之一或多個步驟、或提供本文中描述或說明的功能。特定具體實例包括一或多個電腦系統700之一或多個部分。在本文中,在適當的情況下,對電腦系統之參考可涵蓋運算系統,且反之亦然。此外,在適當的情況下,對電腦系統之參考可涵蓋一或多個電腦系統。 FIG. 7 illustrates an example computer system 700 . In certain embodiments, one or more computer systems 700 perform one or more steps of one or more methods described or illustrated herein. In certain embodiments, one or more computer systems 700 provide the functionality described or illustrated herein. In certain embodiments, software running on one or more computer systems 700 performs one or more steps of one or more methods described or illustrated herein, or provides functions described or illustrated herein. Particular embodiments include one or more portions of one or more computer systems 700 . Herein, references to computer systems may encompass computing systems, and vice versa, where appropriate. In addition, a reference to a computer system may encompass one or more computer systems, where appropriate.

This disclosure contemplates any suitable number of computer systems 700. This disclosure contemplates computer system 700 taking any suitable physical form. As an example and not by way of limitation, computer system 700 may be an embedded computer system, a system-on-chip (SOC), a single-board computer system (SBC) (such as, for example, a computer-on-module (COM) or system-on-module (SOM)), a desktop computer system, a laptop or notebook computer system, an interactive kiosk, a mainframe, a mesh of computer systems, a mobile telephone, a personal digital assistant (PDA), a server, a tablet computer system, or a combination of two or more of these. Where appropriate, computer system 700 may include one or more computer systems 700; be unitary or distributed; span multiple locations; span multiple machines; span multiple data centers; or reside in a cloud, which may include one or more cloud components in one or more networks. Where appropriate, one or more computer systems 700 may perform without substantial spatial or temporal limitation one or more steps of one or more methods described or illustrated herein. As an example and not by way of limitation, one or more computer systems 700 may perform in real time or in batch mode one or more steps of one or more methods described or illustrated herein. One or more computer systems 700 may perform at different times or at different locations one or more steps of one or more methods described or illustrated herein, where appropriate.

In particular embodiments, computer system 700 includes a processor 702, memory 704, storage 706, an input/output (I/O) interface 708, a communication interface 710, and a bus 712. Although this disclosure describes and illustrates a particular computer system having a particular number of particular components in a particular arrangement, this disclosure contemplates any suitable computer system having any suitable number of any suitable components in any suitable arrangement.

In particular embodiments, processor 702 includes hardware for executing instructions, such as those making up a computer program. As an example and not by way of limitation, to execute instructions, processor 702 may retrieve (or fetch) the instructions from an internal register, an internal cache, memory 704, or storage 706; decode and execute them; and then write one or more results to an internal register, an internal cache, memory 704, or storage 706. In particular embodiments, processor 702 may include one or more internal caches for data, instructions, or addresses. This disclosure contemplates processor 702 including any suitable number of any suitable internal caches, where appropriate. As an example and not by way of limitation, processor 702 may include one or more instruction caches, one or more data caches, and one or more translation lookaside buffers (TLBs). Instructions in the instruction caches may be copies of instructions in memory 704 or storage 706, and the instruction caches may speed up retrieval of those instructions by processor 702. Data in the data caches may be copies of data in memory 704 or storage 706 for instructions executing at processor 702 to operate on; the results of previous instructions executed at processor 702 for access by subsequent instructions executing at processor 702 or for writing to memory 704 or storage 706; or other suitable data. The data caches may speed up read or write operations by processor 702. The TLBs may speed up virtual-address translation for processor 702. In particular embodiments, processor 702 may include one or more internal registers for data, instructions, or addresses. This disclosure contemplates processor 702 including any suitable number of any suitable internal registers, where appropriate. Where appropriate, processor 702 may include one or more arithmetic logic units (ALUs); be a multi-core processor; or include one or more processors 702. Although this disclosure describes and illustrates a particular processor, this disclosure contemplates any suitable processor.

In particular embodiments, memory 704 includes main memory for storing instructions for processor 702 to execute or data for processor 702 to operate on. As an example and not by way of limitation, computer system 700 may load instructions from storage 706 or another source (such as, for example, another computer system 700) to memory 704. Processor 702 may then load the instructions from memory 704 to an internal register or internal cache. To execute the instructions, processor 702 may retrieve the instructions from the internal register or internal cache and decode them. During or after execution of the instructions, processor 702 may write one or more results (which may be intermediate or final results) to the internal register or internal cache. Processor 702 may then write one or more of those results to memory 704. In particular embodiments, processor 702 executes only instructions in one or more internal registers or internal caches or in memory 704 (as opposed to storage 706 or elsewhere) and operates only on data in one or more internal registers or internal caches or in memory 704 (as opposed to storage 706 or elsewhere). One or more memory buses (which may each include an address bus and a data bus) may couple processor 702 to memory 704. Bus 712 may include one or more memory buses, as described below. In particular embodiments, one or more memory management units (MMUs) reside between processor 702 and memory 704 and facilitate accesses to memory 704 requested by processor 702. In particular embodiments, memory 704 includes random access memory (RAM). This RAM may be volatile memory, where appropriate. Where appropriate, this RAM may be dynamic RAM (DRAM) or static RAM (SRAM). Moreover, where appropriate, this RAM may be single-ported or multi-ported RAM. This disclosure contemplates any suitable RAM. Memory 704 may include one or more memories 704, where appropriate. Although this disclosure describes and illustrates particular memory, this disclosure contemplates any suitable memory.

In particular embodiments, storage 706 includes mass storage for data or instructions. As an example and not by way of limitation, storage 706 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. Storage 706 may include removable or non-removable (or fixed) media, where appropriate. Storage 706 may be internal or external to computer system 700, where appropriate. In particular embodiments, storage 706 is non-volatile, solid-state memory. In particular embodiments, storage 706 includes read-only memory (ROM). Where appropriate, this ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these. This disclosure contemplates mass storage 706 taking any suitable physical form. Storage 706 may include one or more storage control units facilitating communication between processor 702 and storage 706, where appropriate. Where appropriate, storage 706 may include one or more storages 706. Although this disclosure describes and illustrates particular storage, this disclosure contemplates any suitable storage.

In particular embodiments, I/O interface 708 includes hardware, software, or both, providing one or more interfaces for communication between computer system 700 and one or more I/O devices. Computer system 700 may include one or more of these I/O devices, where appropriate. One or more of these I/O devices may enable communication between a person and computer system 700. As an example and not by way of limitation, an I/O device may include a keyboard, keypad, microphone, monitor, mouse, printer, scanner, speaker, still camera, stylus, tablet, touch screen, trackball, video camera, another suitable I/O device, or a combination of two or more of these. An I/O device may include one or more sensors. This disclosure contemplates any suitable I/O devices and any suitable I/O interfaces 708 for them. Where appropriate, I/O interface 708 may include one or more device or software drivers enabling processor 702 to drive one or more of these I/O devices. I/O interface 708 may include one or more I/O interfaces 708, where appropriate. Although this disclosure describes and illustrates a particular I/O interface, this disclosure contemplates any suitable I/O interface.

In particular embodiments, communication interface 710 includes hardware, software, or both providing one or more interfaces for communication (such as, for example, packet-based communication) between computer system 700 and one or more other computer systems 700 or one or more networks. As an example and not by way of limitation, communication interface 710 may include a network interface controller (NIC) for communicating with an Ethernet or other wire-based network, or a wireless NIC (WNIC) or wireless adapter for communicating with a wireless network, such as a WI-FI network. This disclosure contemplates any suitable network and any suitable communication interface 710 for it. As an example and not by way of limitation, computer system 700 may communicate with an ad hoc network, a personal area network (PAN), a local area network (LAN), a wide area network (WAN), a metropolitan area network (MAN), or one or more portions of the Internet, or a combination of two or more of these. One or more portions of one or more of these networks may be wired or wireless. As an example, computer system 700 may communicate with a wireless PAN (WPAN) (such as, for example, a BLUETOOTH WPAN), a WI-FI network, a WI-MAX network, a cellular telephone network (such as, for example, a Global System for Mobile Communications (GSM) network), or other suitable wireless network, or a combination of two or more of these. Computer system 700 may include any suitable communication interface 710 for any of these networks, where appropriate. Communication interface 710 may include one or more communication interfaces 710, where appropriate. Although this disclosure describes and illustrates a particular communication interface, this disclosure contemplates any suitable communication interface.

In particular embodiments, bus 712 includes hardware, software, or both coupling components of computer system 700 to each other. As an example and not by way of limitation, bus 712 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HYPERTRANSPORT (HT) interconnect, an Industry Standard Architecture (ISA) bus, an INFINIBAND interconnect, a low-pin-count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a serial advanced technology attachment (SATA) bus, a Video Electronics Standards Association local (VLB) bus, or another suitable bus, or a combination of two or more of these. Bus 712 may include one or more buses 712, where appropriate. Although this disclosure describes and illustrates a particular bus, this disclosure contemplates any suitable bus or interconnect.

Herein, a computer-readable non-transitory storage medium or media may include one or more semiconductor-based or other integrated circuits (ICs) (such as, for example, field-programmable gate arrays (FPGAs) or application-specific ICs (ASICs)), hard disk drives (HDDs), hybrid hard drives (HHDs), optical discs, optical disc drives (ODDs), magneto-optical discs, magneto-optical drives, floppy diskettes, floppy disk drives (FDDs), magnetic tapes, solid-state drives (SSDs), RAM drives, SECURE DIGITAL cards or drives, any other suitable computer-readable non-transitory storage media, or any suitable combination of two or more of these, where appropriate. A computer-readable non-transitory storage medium may be volatile, non-volatile, or a combination of volatile and non-volatile, where appropriate.

Other

Herein, "or" is inclusive and not exclusive, unless expressly indicated otherwise or indicated otherwise by context. Therefore, herein, "A or B" means "A, B, or both," unless expressly indicated otherwise or indicated otherwise by context. Moreover, "and" is both joint and several, unless expressly indicated otherwise or indicated otherwise by context. Therefore, herein, "A and B" means "A and B, jointly or severally," unless expressly indicated otherwise or indicated otherwise by context.

The scope of this disclosure encompasses all changes, substitutions, variations, alterations, and modifications to the example embodiments described or illustrated herein that a person having ordinary skill in the art would comprehend. The scope of this disclosure is not limited to the example embodiments described or illustrated herein. Moreover, although this disclosure describes and illustrates respective embodiments herein as including particular components, elements, features, functions, operations, or steps, any of these embodiments may include any combination or permutation of any of the components, elements, features, functions, operations, or steps described or illustrated anywhere herein that a person having ordinary skill in the art would comprehend. Furthermore, reference in the appended claims to an apparatus or system, or a component of an apparatus or system, being adapted to, arranged to, capable of, configured to, enabled to, operable to, or operative to perform a particular function encompasses that apparatus, system, or component, whether or not it or that particular function is activated, turned on, or unlocked, as long as that apparatus, system, or component is so adapted, arranged, capable, configured, enabled, operable, or operative. Additionally, although this disclosure describes or illustrates particular embodiments as providing particular advantages, particular embodiments may provide none, some, or all of these advantages.

100A: artificial reality system
100B: example augmented reality system
102: user
104: headset
106: controller
108: computing system
110: head-mounted display
112: frame
114: display
210: application
211: image
213: motion vector and depth information
220: operating system
221: image
223: image
311: UV depth map
312: motion vector
313: view projection matrix
314: three-dimensional position
321: depth map
323: view projection matrix
324: three-dimensional position
334: estimated three-dimensional position
335: view projection matrix
337: distortion mesh
600: method
610: step
620: step
630: step
640: step
650: step
700: computer system
702: processor
704: memory
706: storage
708: input/output interface
710: communication interface
712: bus

[FIG. 1A] illustrates an example artificial reality system.

[FIG. 1B] illustrates an example augmented reality system.

[FIG. 2] illustrates an example overview of frame extrapolation and reprojection using application-generated motion vectors and depth information.

[FIG. 3] illustrates an example data flow for frame extrapolation.

[FIG. 4] illustrates an example estimation of a future position of an object based on past and current positions of the object.

[FIG. 5] illustrates an example comparison of the per-frame time budget between a full-frame rendering application and a half-frame rendering application.

[FIG. 6] illustrates an example method for extrapolating a frame based on application-generated motion vectors and a depth map.

[FIG. 7] illustrates an example computer system.

600: method

610: step

620: step

630: step

640: step

650: step

Claims (20)

1. A method comprising, by a computing system associated with a wearable device:
receiving a rendered image, motion vector data, and a depth map corresponding to a current frame of a video stream, the video stream being generated by an application;
computing, for an object depicted in the rendered image, a current three-dimensional position of the object corresponding to the current frame using the depth map;
computing a past three-dimensional position of the object corresponding to a past frame using the motion vector data and the depth map;
estimating a future three-dimensional position of the object corresponding to a future frame based on the past three-dimensional position and the current three-dimensional position of the object; and
generating an extrapolated image corresponding to the future frame by reprojecting the object depicted in the rendered image to a future viewpoint associated with the future frame using the future three-dimensional position of the object.

2. The method of claim 1, wherein the motion vector data and the depth map are generated based on three-dimensional objects rendered by the application.

3. The method of claim 2, wherein motion vectors in the motion vector data are three-dimensional.

4. The method of claim 1, further comprising processing the received motion vector data and the depth map such that regions corresponding to a foreground of the rendered image are expanded.

5. The method of claim 1, wherein computing the current three-dimensional position of the object comprises back-projecting the depth map onto a three-dimensional space from a current viewpoint associated with the current frame.

6. The method of claim 5, wherein the current viewpoint is associated with a position and an orientation of the wearable device at a time instant when the current frame is presented.

7. The method of claim 1, wherein computing the past three-dimensional position of the object comprises:
generating an estimated depth map corresponding to the past frame by subtracting the motion vector from the depth map; and
back-projecting the estimated depth map onto a three-dimensional space from a past viewpoint associated with the past frame.

8. The method of claim 1, wherein estimating the future three-dimensional position of the object is performed based on an assumption that the object moves at a constant velocity from a time instant corresponding to the past frame to a time instant corresponding to the future frame.

9. The method of claim 8, further comprising generating a distortion mesh by projecting the estimated future three-dimensional position of the object onto the future viewpoint.
10. The method of claim 9, wherein generating the extrapolated image corresponding to the future frame comprises applying the distortion mesh to the rendered image.

11. One or more computer-readable non-transitory storage media embodying software that is operable when executed to:
receive a rendered image, motion vector data, and a depth map corresponding to a current frame of a video stream, the video stream being generated by an application;
compute, for an object depicted in the rendered image, a current three-dimensional position of the object corresponding to the current frame using the depth map;
compute a past three-dimensional position of the object corresponding to a past frame using the motion vector data and the depth map;
estimate a future three-dimensional position of the object corresponding to a future frame based on the past three-dimensional position and the current three-dimensional position of the object; and
generate an extrapolated image corresponding to the future frame by reprojecting the object depicted in the rendered image to a future viewpoint associated with the future frame using the future three-dimensional position of the object.

12. The media of claim 11, wherein the motion vector data and the depth map are generated based on three-dimensional objects rendered by the application.

13. The media of claim 12, wherein motion vectors in the motion vector data are three-dimensional.

14. The media of claim 11, wherein the software is further operable when executed to process the received motion vector data and the depth map such that regions corresponding to a foreground of the rendered image are expanded.

15. The media of claim 11, wherein computing the current three-dimensional position of the object comprises back-projecting the depth map onto a three-dimensional space from a current viewpoint associated with the current frame.

16. The media of claim 15, wherein the current viewpoint is associated with a position and an orientation of the wearable device at a time instant when the current frame is presented.

17. The media of claim 11, wherein computing the past three-dimensional position of the object comprises:
generating an estimated depth map corresponding to the past frame by subtracting the motion vector from the depth map; and
back-projecting the estimated depth map onto a three-dimensional space from a past viewpoint associated with the past frame.

18. The media of claim 11, wherein estimating the future three-dimensional position of the object is performed based on an assumption that the object moves at a constant velocity from a time instant corresponding to the past frame to a time instant corresponding to the future frame.
19. The media of claim 18, wherein the software is further operable when executed to generate a distortion mesh by projecting the estimated future three-dimensional position of the object onto the future viewpoint.

20. A system comprising: one or more processors; and a non-transitory memory coupled to the processors comprising instructions executable by the processors, the processors operable when executing the instructions to:
receive a rendered image, motion vector data, and a depth map corresponding to a current frame of a video stream, the video stream being generated by an application;
compute, for an object depicted in the rendered image, a current three-dimensional position of the object corresponding to the current frame using the depth map;
compute a past three-dimensional position of the object corresponding to a past frame using the motion vector data and the depth map;
estimate a future three-dimensional position of the object corresponding to a future frame based on the past three-dimensional position and the current three-dimensional position of the object; and
generate an extrapolated image corresponding to the future frame by reprojecting the object depicted in the rendered image to a future viewpoint associated with the future frame using the future three-dimensional position of the object.
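To make the geometry of claims 1 and 5 through 10 concrete, the following is a minimal sketch of the extrapolation pipeline, not an implementation from the specification: every function and parameter name (ndc_grid, unproject, vp_current, and so on) is hypothetical, the depth map and motion vectors are assumed to be dense per-pixel NumPy arrays in normalized device coordinates, and each motion vector is assumed to map a current-frame pixel back to its past-frame position. Under the constant-velocity assumption of claim 8, the per-pixel future position reduces to p_future = 2·p_current − p_past.

```python
import numpy as np

def ndc_grid(depth: np.ndarray) -> np.ndarray:
    """Per-pixel homogeneous NDC coordinates for an (H, W) depth map."""
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    x = 2.0 * (xs + 0.5) / w - 1.0          # NDC x in [-1, 1]
    y = 1.0 - 2.0 * (ys + 0.5) / h          # NDC y in [-1, 1], y up
    return np.stack([x, y, depth, np.ones_like(depth)], axis=-1)

def unproject(ndc: np.ndarray, inv_view_proj: np.ndarray) -> np.ndarray:
    """Back-project homogeneous NDC points into 3D world space (claim 5)."""
    world = ndc @ inv_view_proj.T
    return world[..., :3] / world[..., 3:4]

def project(points: np.ndarray, view_proj: np.ndarray) -> np.ndarray:
    """Project 3D world points to NDC under a view-projection matrix."""
    p = np.concatenate([points, np.ones_like(points[..., :1])], axis=-1)
    clip = p @ view_proj.T
    return clip[..., :3] / clip[..., 3:4]

def extrapolate_distortion_mesh(depth, motion_vectors,
                                vp_current, vp_past, vp_future):
    # Current 3D position of every pixel: back-project the depth map from
    # the current viewpoint (claim 5).
    ndc_now = ndc_grid(depth)
    pos_now = unproject(ndc_now, np.linalg.inv(vp_current))

    # Past 3D position: subtract the motion vectors (assumed here to be
    # (H, W, 3) NDC displacements) to estimate a past depth map, then
    # back-project from the past viewpoint (claim 7).
    ndc_past = ndc_now.copy()
    ndc_past[..., :3] -= motion_vectors
    pos_past = unproject(ndc_past, np.linalg.inv(vp_past))

    # Constant-velocity extrapolation one frame interval forward (claim 8):
    # replay the past-to-current displacement from the current position.
    pos_future = pos_now + (pos_now - pos_past)

    # Distortion mesh: estimated future positions projected onto the future
    # viewpoint (claim 9); warping the rendered image with this mesh yields
    # the extrapolated frame (claim 10).
    return project(pos_future, vp_future)
```

In practice a compositor would evaluate these steps on the GPU over a coarse grid of mesh vertices rather than per pixel, and would scale the replayed displacement by the ratio of the future and past frame intervals; the sketch assumes uniform frame spacing for brevity.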
TW111132755A 2021-10-11 2022-08-30 Frame extrapolation with application generated motion vector and depth TW202316239A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163254476P 2021-10-11 2021-10-11
US63/254,476 2021-10-11
US17/590,682 US11783533B2 (en) 2021-10-11 2022-02-01 Frame extrapolation with application generated motion vector and depth
US17/590,682 2022-02-01

Publications (1)

Publication Number Publication Date
TW202316239A true TW202316239A (en) 2023-04-16

Family

ID=83996916

Family Applications (1)

Application Number Title Priority Date Filing Date
TW111132755A TW202316239A (en) 2021-10-11 2022-08-30 Frame extrapolation with application generated motion vector and depth

Country Status (2)

Country Link
TW (1) TW202316239A (en)
WO (1) WO2023064090A1 (en)

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AUPN732395A0 (en) * 1995-12-22 1996-01-25 Xenotech Research Pty Ltd Image conversion and encoding techniques
US7015926B2 (en) * 2004-06-28 2006-03-21 Microsoft Corporation System and process for generating a two-layer, 3D representation of a scene
US10136119B2 (en) * 2013-01-10 2018-11-20 Qualcomm Incoporated View synthesis in 3D video
US11270492B2 (en) * 2019-06-25 2022-03-08 Arm Limited Graphics processing systems
US11417065B2 (en) * 2019-10-29 2022-08-16 Magic Leap, Inc. Methods and systems for reprojection in augmented-reality displays

Also Published As

Publication number Publication date
WO2023064090A1 (en) 2023-04-20

Similar Documents

Publication Publication Date Title
US11719933B2 (en) Hand-locked rendering of virtual objects in artificial reality
US11367165B2 (en) Neural super-sampling for real-time rendering
US11170577B2 (en) Generating and modifying representations of objects in an augmented-reality or virtual-reality scene
US20230039100A1 (en) Multi-layer reprojection techniques for augmented reality
US11954805B2 (en) Occlusion of virtual objects in augmented reality by physical objects
US20150084949A1 (en) Stereoscopic rendering using vertix shader instancing
US11468629B2 (en) Methods and apparatus for handling occlusions in split rendering
JP2024502772A (en) Generating a composite image
US20230128288A1 (en) Compositor layer extrapolation
US11715272B2 (en) 3D reconstruction of a moving object
US11783533B2 (en) Frame extrapolation with application generated motion vector and depth
US11615594B2 (en) Systems and methods for reconstruction of dense depth maps
TW202316239A (en) Frame extrapolation with application generated motion vector and depth
US11423616B1 (en) Systems and methods for rendering avatar with high resolution geometry
US20220201271A1 (en) Temporal foveated rendering
US20230136662A1 (en) Parallax Asynchronous Spacewarp for Multiple Frame Extrapolation
US11640699B2 (en) Temporal approximation of trilinear filtering
US11423520B2 (en) Distortion-corrected rasterization
TW202314646A (en) Digital garment generation