JP2018523326A

JP2018523326A - Full spherical capture method

Info

Publication number: JP2018523326A
Application number: JP2017555540A
Authority: JP
Inventors: ラッセル，アンドリュー・イアン
Original assignee: Google LLC
Current assignee: Google LLC
Priority date: 2015-09-16
Filing date: 2016-09-16
Publication date: 2018-08-16
Anticipated expiration: 2036-09-16
Also published as: EP3350653A1; CN107636534A; CN107636534B; EP3350653B1; GB2555724A; DE112016004216T5; GB201716679D0; WO2017049055A9; GB2555724B; KR20170133444A; KR101986329B1; JP6643357B2; US20170076429A1; GB2555724A8; US10217189B2; WO2017049055A1

Abstract

Systems and methods for capturing spherical content are described. The systems and methods can include determining a region within a plurality of images captured with a plurality of cameras in which to transform two-dimensional data into three-dimensional data, calculating a depth value for a portion of pixels in the region, generating a spherical image that includes image data for the portion of pixels in the region, constructing, using the image data, a three-dimensional surface in three-dimensional space of a computer graphics object generated by an image processing system, generating, using the image data, a texture mapping to a surface of the computer graphics object, and transmitting the spherical image and the texture mapping for display in a head-mounted display device.

Description

Cross-Reference to Related Applications
This application claims priority to, and is a continuation of, U.S. Non-Provisional Patent Application No. 15/266,602, entitled "General Spherical Capture Methods", filed on September 15, 2016, which claims priority to U.S. Provisional Patent Application No. 62/219,534, entitled "General Spherical Capture Methods", filed on September 16, 2015. Both of these applications are incorporated herein by reference in their entirety.

Technical Field
This description generally relates to methods and devices for capturing and processing two-dimensional (2D) and three-dimensional (3D) images.

Background
A spherical image can provide a 360-degree view of a scene. Such images can be captured and defined using a particular projection format. For example, a spherical image may be defined in an equirectangular projection format to provide a single image with a 2:1 aspect ratio of width to height. In another example, a spherical image may be defined in a cubic projection format to provide an image remapped to the six faces of a cube.
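The two projection formats mentioned above can be illustrated with a short sketch. It assumes a coordinate convention that is not stated in this description (x to the right, y up, z forward, unit-length view directions), and the function names `direction_to_equirect` and `cube_face` are hypothetical.

```python
import math


def direction_to_equirect(x, y, z, width):
    """Map a unit view direction to (column, row) in a 2:1 equirectangular image.

    Assumed convention (not from the source): x right, y up, z forward.
    """
    height = width // 2                       # 2:1 width-to-height aspect ratio
    lon = math.atan2(x, z)                    # longitude in [-pi, pi], 0 = forward
    lat = math.asin(max(-1.0, min(1.0, y)))   # latitude in [-pi/2, pi/2]
    col = (lon / (2.0 * math.pi) + 0.5) * width
    row = (0.5 - lat / math.pi) * height
    return col, row


def cube_face(x, y, z):
    """Pick the cube face hit by a direction: the axis with the largest magnitude."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"
```

Under these assumptions, the forward direction lands at the center of the equirectangular image and on the "+z" cube face, while the straight-up direction lands on the top row and on the "+y" face.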

Summary
A system of one or more computers can be configured to perform particular operations or actions by virtue of having software, firmware, hardware, or a combination of them installed on the system that in operation causes the system to perform the actions. One or more computer programs can be configured to perform particular operations or actions by virtue of including instructions that, when executed by a data processing apparatus, cause the apparatus to perform the actions.

In one general aspect, the instructions may include a computer-implemented method that includes determining a region within a plurality of images captured with a plurality of cameras in which to transform two-dimensional data into three-dimensional data. Determining the region may be performed automatically, based at least in part on user input detected at a head-mounted display. The user input may include a head turn, and the three-dimensional data may be used to generate a three-dimensional portion in at least one of the plurality of images corresponding to the view. In another example, the user input may include a change in gaze direction, and the three-dimensional data may be used to generate a three-dimensional portion in at least one of the plurality of images in the user's line of sight.

The method may also include calculating a depth value for a portion of pixels in the region and generating a spherical image. The spherical image may include image data for the portion of pixels in the region. In some implementations, the portion of pixels is represented on the surface of the computer graphics object with a radius equal to a corresponding depth value associated with one or more of the pixels in the portion. The method may also include constructing, using the image data, a three-dimensional surface in three-dimensional space of a computer graphics object generated by an image processing system, and generating, using the image data, a texture mapping to a surface of the computer graphics object. The texture mapping may include a mapping of the image data to the surface of the computer graphics object. The method may also include transmitting the spherical image and the texture mapping for display in a head-mounted display device. Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.
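As a non-limiting illustration of representing a pixel at a radius equal to its depth value, the following sketch places one pixel of an equirectangular spherical image in 3D space and computes its texture coordinates. The equirectangular layout, the axis convention (x right, y up, z forward), and the names `equirect_pixel_to_point` and `texture_coords` are assumptions made for this example only.

```python
import math


def equirect_pixel_to_point(col, row, depth, width, height):
    """Return the 3D point for a pixel, at a radius equal to its depth value.

    Assumes an equirectangular image and an x-right, y-up, z-forward frame;
    neither is prescribed by the description.
    """
    lon = (col / width - 0.5) * 2.0 * math.pi   # longitude of the pixel
    lat = (0.5 - row / height) * math.pi        # latitude of the pixel
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)


def texture_coords(col, row, width, height):
    """Normalized (u, v) texture coordinates that map the pixel onto the surface."""
    return (col / width, row / height)
```

A renderer could build one vertex per sampled pixel from `equirect_pixel_to_point` and attach the matching `texture_coords`, so that the image data is drawn on a surface whose shape follows the depth values.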

In some implementations, the method may also include generating an additional spherical image and texture mapping associated with the region, generating a left eye view by combining a portion of the image data and the spherical image, generating additional image data and generating a right eye view by combining the additional image data and the additional spherical image, and displaying the left eye view and the right eye view in the head-mounted display device. The image data may include depth value data and RGB data for at least some of the portion of pixels in the region.

In some implementations, the plurality of images includes video content, and the image data includes RGB data and depth value data associated with the portion of pixels. In some implementations, the method further includes converting, using the image data, a two-dimensional version of the region into a three-dimensional version of the region, and providing the three-dimensional version of the region for display in the head-mounted display device. In some implementations, the plurality of images are captured with a plurality of cameras mounted on a spherically shaped camera rig.

In another general aspect, a computer-implemented method is described that includes obtaining a plurality of images with a plurality of cameras and generating at least two updated images for the plurality of images. The at least two updated images are generated by interpolating a viewpoint for at least one virtual camera configured to capture content at a leftward offset from a predefined centerline and to capture content at a rightward offset from the predefined centerline. In some implementations, interpolating the viewpoint includes sampling a plurality of pixels in the plurality of images, generating virtual content using optical flow, and placing the virtual content within at least one of the at least two updated images.
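The viewpoint interpolation described above can be sketched for a single scanline. This is a minimal illustration, assuming a purely horizontal per-pixel flow from one physical camera toward its neighbor and nearest-pixel forward warping. The function name `interpolate_scanline`, the fraction `t`, and the use of `None` to mark unfilled pixels are hypothetical and are not taken from the disclosure.

```python
def interpolate_scanline(row_a, flow_ab, t):
    """Forward-warp one scanline from camera A toward camera B by fraction t.

    row_a   : pixel values sampled from camera A
    flow_ab : assumed horizontal flow (in pixels) from A to B, one per pixel
    t       : 0.0 reproduces camera A, 1.0 approximates camera B
    Pixels that no source pixel lands on stay None (disoccluded holes).
    """
    width = len(row_a)
    out = [None] * width
    for x, (value, dx) in enumerate(zip(row_a, flow_ab)):
        target = int(round(x + t * dx))   # where this pixel appears in the virtual view
        if 0 <= target < width:
            out[target] = value           # place the virtual content
    return out
```

Under these assumptions, a virtual camera to the left of the centerline and one to the right of it would simply use two different values of `t` on the same flow data. A practical system would also fill the holes, for example from the neighboring camera.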

The method may further include mapping a first image of the at least two updated images to a first spherical surface to generate a first spherical image for provision to a left eyepiece of a head-mounted display, mapping a second image of the at least two updated images to a second spherical surface to generate a second spherical image for provision to a right eyepiece of the head-mounted display, and displaying the first spherical image in the left eyepiece and the second spherical image in the right eyepiece of the head-mounted display.

In some implementations, the at least one virtual camera is configured to use content captured with one or more physical cameras and to adapt the content to be provided from the viewpoint. In some implementations, mapping the first image includes applying a texture to the first image by assigning pixel coordinates from the first image to the first spherical surface, and mapping the second image includes applying a texture to the second image by assigning pixel coordinates from the second image to the second spherical surface. In some implementations, the at least two spherical images include an RGB image having at least a portion of the plurality of pixels included in the content captured at the leftward offset, and an RGB image having at least a portion of the plurality of pixels included in the content captured at the rightward offset. In some implementations, the leftward offset and the rightward offset are modifiable and function to adapt the display accuracy of the first image and the second image in the head-mounted display.

Other embodiments of this aspect include corresponding computer systems, apparatus, and computer programs recorded on one or more computer storage devices, each configured to perform the actions of the methods.

The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features will be apparent from the description and drawings, and from the claims.

FIG. 1 is a block diagram of an example system 100 for capturing, processing, and rendering 2D and 3D content for a virtual reality (VR) space.
FIG. 2 is a diagram depicting an example spherical camera rig configured to capture images of a scene for use in generating 3D portions of video content.
FIG. 3 is a diagram depicting an example icosahedral camera rig configured to capture images of a scene for use in generating 3D portions of video content.
FIG. 4 is a diagram depicting an example hexagonal sphere camera rig configured to capture images of a scene for use in generating 3D portions of video content.
FIG. 5 is a flowchart illustrating one embodiment of a process for generating video content.
FIG. 6 shows an example of a computer device and a mobile computer device that can be used to implement the techniques described herein.

Like reference symbols in the various drawings indicate like elements.

Detailed Description
Acquiring image content that can be used to accurately reproduce each portion of a scene in two dimensions and/or three dimensions generally involves capturing images or video of the scene using multiple cameras housed in a three-dimensional camera rig. The cameras may be configured to capture each portion of the scene surrounding the camera rig on the top, sides, and bottom, and any scene content in between. The systems and methods described in this disclosure can employ a camera rig that is spherically shaped, icosahedron shaped, or 3D polygon shaped, to name just a few examples. Such a camera rig can house several groups (e.g., triads) of cameras strategically placed on the rig to capture image content for every outwardly capturable area surrounding the rig. The image content can include overlapping image content captured between the multiple cameras, and this overlap can be used later to generate additional image content, stitch existing image content, or generate visual effects (e.g., 3D effects) in the image content.

One such visual effect may include producing 3D regions within the image content. Producing 3D regions within image content (e.g., video content) at runtime (or in near real time) may be achieved using content captured with the systems and methods described herein, because such camera rigs are configured to capture each and every area surrounding a sphere or other 3D shape configured to house the cameras. Having access to all possible viewing content within a scene enables depth calculations, and depth can be used to modify 2D content into 3D content and back. One example of producing 3D regions can include determining that a particular area should be shown in 3D based on the objects or action shown in the image content.

For example, if the image content depicts an acrobatic performance shown from a ceiling, the systems and methods described in this disclosure can determine that a 3D effect should be applied to content shown above the user in a VR space, for example, when the user looks toward the ceiling of the VR space. The 3D effect can be automatically applied to the image content (e.g., video content) and displayed to the user in a VR head-mounted display (HMD) device. In some implementations, the 3D effect can be manually configured to shift the area in which the 3D effect is applied. For example, the 3D effect in the acrobat example may be shifted from the main stage to the ceiling when the performance is scheduled to move from the stage to the ceiling. That is, the image content (e.g., video content) can be configured to shift the 3D effect to the ceiling as the acrobats begin to perform above the user (e.g., the audience). In some implementations, 3D effects can be applied to a portion of the image content while surrounding portions of the image content remain in a two-dimensional format. In some implementations, the systems described herein can be used to apply 3D effects to an entire image or video. In other implementations, the systems described herein can be used to apply 3D effects to a scene, a portion of a scene, an area in an image or scene, or a user-selected or user-gaze-selected portion of an image or scene.

Modifying image content from two dimensions (2D) to three dimensions (3D) is possible because a spherical camera rig is used to capture images from all angles around the rig, which makes every possible area 3D adjustable. Automatic adjustments can include calculating dense optical flow to calculate depth maps associated with the image content. Calculating a depth map can include calculating a number of depth values representing the distance of various points in the scene relative to the position of a camera. In some examples, two or more images can be used to compute depth values, and the depth values, together with the 2D image data, can be used to estimate 3D image data for portions of a particular scene.
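One common way to turn flow between two cameras into depth values is the rectified-stereo relation depth = focal length × baseline / disparity. The description does not state which relation is used, so the sketch below is an assumption for illustration; the function name `depth_from_flow` and its parameters are hypothetical.

```python
def depth_from_flow(flow_px, focal_px, baseline_m):
    """Estimate per-pixel depth from horizontal flow between a stereo pair.

    Uses the standard rectified-stereo relation (an assumption, not taken
    from the source): depth = focal_px * baseline_m / |flow|.
    Pixels with zero flow are treated as infinitely far away.
    """
    depths = []
    for disparity in flow_px:
        magnitude = abs(disparity)
        if magnitude > 0:
            depths.append(focal_px * baseline_m / magnitude)
        else:
            depths.append(float("inf"))
    return depths
```

For example, with an assumed focal length of 1000 pixels and a 0.1 m baseline, a flow of 10 pixels corresponds to a depth of about 10 m, and a flow of 5 pixels to about 20 m, reflecting that nearer points move more between the two views.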

In some implementations, texture mapping to an object may produce a two-dimensional effect of mapping two-dimensional data onto the object. In other implementations, in which depth map data (or depth data) and texture map data are transmitted to a depth surface, the effect may be a three-dimensional effect.

The systems and methods described in this disclosure may include using optical flow algorithms, depth map calculations, user input, and/or director input to produce 3D regions within image content. For example, the systems and methods described can apply a 3D effect to selected areas of image content. The 3D effect can be strategically calculated and placed in near real time within image content, such as video content. In some implementations, the 3D effects can be manually placed before the image content is provided in a virtual reality (VR) space. In some implementations, the 3D effects can be automatically placed while a user is viewing the image content in a VR space, for example, in response to the user turning toward an area of interest. For example, if image content captured with the devices described herein is provided to a user in a VR space, the user may turn toward an area in the VR space to view content, and the content can be automatically generated as 3D content in response to the user showing interest in it.
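As a hedged sketch of automatically placing a 3D effect when a user turns toward an area of interest, the following test checks whether a gaze direction falls within an angular window around a region. The angular-window criterion, the name `gaze_selects_region`, and the `half_angle_deg` parameter are assumptions for illustration; the description does not specify how interest is detected.

```python
import math


def gaze_selects_region(gaze, region_center, half_angle_deg):
    """Return True when the gaze direction points within half_angle_deg of the region.

    gaze and region_center are 3D direction vectors (any nonzero length).
    The angular threshold is a hypothetical selection rule.
    """
    dot = sum(g * c for g, c in zip(gaze, region_center))
    norm = math.sqrt(sum(g * g for g in gaze)) * math.sqrt(sum(c * c for c in region_center))
    cosine = max(-1.0, min(1.0, dot / norm))   # clamp against rounding error
    angle_deg = math.degrees(math.acos(cosine))
    return angle_deg <= half_angle_deg
```

In the acrobat example, a region centered on the ceiling direction would be selected once the user looks upward, and the content in that region could then be converted to 3D.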

In particular implementations, the systems and methods described herein can include calculating dense optical flow fields between a number of triads of cameras on a spherical camera rig to configure and display 3D image content. Computing and transmitting the flow fields (using optical flow interpolation techniques) can be performed to reconstruct (at runtime or prior to runtime) particular 3D views that a user may wish to view. The techniques can take into account user head tilting and translating, and may allow 3D content to be provided at any selectable area within a scene captured by the spherical camera rig. In some implementations, forward and backward head translation can also be accommodated.

In some implementations, the systems and methods described herein can employ optical flow and/or stereo matching techniques to obtain a depth value for each pixel of an image. A spherical image (or video) generated using the optical flow and/or stereo matching techniques can be transmitted to an HMD device, for example. The spherical image may include RGB (red, green, blue) pixel data, YUV (luma and chrominance) data, depth data, or additional calculated or obtainable image data. The HMD device can receive such data and render the image as a texture mapped onto a surface in 3D space defined by the depth component.

In some implementations, the systems and methods described herein can interpolate a number of different virtual cameras using optical flow techniques. At least two spherical images can be generated using the resulting optical flow data (e.g., a left RGB spherical image and a right RGB spherical image). The pixels in the left RGB spherical image can be obtained from virtual cameras that are offset to the left, and the pixels in the right RGB spherical image can be obtained from virtual cameras that are offset to the right. To generate an accurate 3D effect, the systems and methods described herein can modify the amount of left and right offset used for the virtual cameras. That is, selecting the largest offset can function to provide accurate 3D image content based on the content in the image or video, or based on input from a director. The left and right images can then be texture mapped onto a sphere (of constant radius) in the HMD device, for example.
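The left and right virtual camera offsets can be illustrated with a small geometric sketch. It assumes a horizontal-plane layout in which each virtual camera is displaced perpendicular to the viewing direction, with the requested offset capped at a maximum. The name `stereo_virtual_cameras`, the `max_offset` parameter, and the axis convention (x right, z forward) are hypothetical.

```python
import math


def stereo_virtual_cameras(yaw_deg, offset, max_offset):
    """Return ((x, z) left, (x, z) right) virtual camera positions.

    yaw_deg    : viewing direction in the horizontal plane, 0 = forward (+z)
    offset     : requested left/right displacement from the center line
    max_offset : hypothetical cap on the displacement
    """
    offset = min(offset, max_offset)                # modifiable, bounded offset
    yaw = math.radians(yaw_deg)
    right_axis = (math.cos(yaw), -math.sin(yaw))    # unit vector to the viewer's right
    left_cam = (-offset * right_axis[0], -offset * right_axis[1])
    right_cam = (offset * right_axis[0], offset * right_axis[1])
    return left_cam, right_cam
```

Under these assumptions, pixels for the left spherical image would be gathered from virtual cameras at `left_cam` for each viewing direction, and pixels for the right spherical image from `right_cam`. Reducing `offset` weakens the stereo effect, and increasing it strengthens the effect.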

FIG. 1 is a block diagram of an example system 100 for capturing, processing, and rendering 2D and 3D content for a virtual reality (VR) space. In the example system 100, a spherically shaped camera rig 102 can capture still and video images and provide them over a network 104, or alternatively directly, to an image processing system 106 for analysis and processing. Once the images are captured, the image processing system 106 can perform a number of calculations and processes on the images and provide the processed images to a head-mounted display (HMD) device 110 for rendering over the network 104, for example. In some implementations, the image processing system 106 can also provide the processed images to a mobile device 108 and/or to a computing device 112 for rendering, storage, or further processing.

The HMD device 110 may represent a virtual reality headset, glasses, an eyepiece, or another wearable device capable of displaying virtual reality content. In operation, the HMD device 110 can execute a VR application (not shown) that can play back received and/or processed images to a user. In some implementations, the VR application can be hosted by one or more of the devices 106, 108, or 112 shown in FIG. 1. In one example, the HMD device 110 can generate portions of a scene as 3D video content and can provide video playback of the scene captured by the camera rig 102 in a 3D format at strategically selected locations.

The camera rig 102 can be configured for use as a camera (which may also be referred to as a capture device) and/or a processing device to gather image data for rendering content in a VR space. Although the camera rig 102 is shown here as a block diagram described with particular functionality, the rig 102 can take the form of any of the implementations shown in FIGS. 2 to 4 and may additionally have the functionality described for camera rigs throughout this disclosure. For example, to simplify the description of the functionality of the system 100, FIG. 1 shows the camera rig 102 without cameras disposed around the rig to capture images. Other implementations of the camera rig 102 can include any number of cameras that can be disposed at any point on a 3D camera rig, such as the rig 102.

As shown in FIG. 1, the camera rig 102 includes a number of cameras 130 and a communication module 132. The cameras 130 can include still or video cameras. In some implementations, the cameras 130 can include multiple still cameras or multiple video cameras disposed (e.g., seated) along the surface of the spherical rig 102. The cameras 130 may include a video camera, an image sensor, a stereoscopic camera, an infrared camera, and/or a mobile device camera. The communication system 132 can be used to upload and download images, instructions, and/or other camera-related content. The communication system 132 may be wired or wireless and can interface over a private or public network.

The camera rig 102 can be configured to function as a stationary rig or a rotational rig. Each camera on the rig is disposed (e.g., placed) offset from the center of rotation of the rig. The camera rig 102 can be configured to rotate through 360 degrees to sweep and capture all or a portion of a 360-degree spherical view of a scene, for example. In some implementations, the rig 102 can be configured to operate in a stationary position, and in such a configuration additional cameras can be added to the rig to capture additional outward-angle views of the scene.

In some implementations, the cameras can be configured (e.g., set up) to function synchronously to capture video from the cameras on the camera rig at a specific point in time. In some implementations, the cameras can be configured to function synchronously to capture particular portions of video from one or more of the cameras over a time period. Another example of calibrating the camera rig can include configuring how incoming images are stored. For example, incoming images can be stored as individual frames or video (e.g., .avi files, .mpg files), and such stored images can be uploaded to the Internet, another server or device, or stored locally with each camera on the camera rig 102.

The image processing system 106 includes an interpolation module 114, an optical flow module 116, a stitching module 118, a depth map generator 120, and a 3D generator module 122. The interpolation module 114 represents algorithms that can be used, for example, to sample portions of digital images and video and to determine a number of interpolated images that are likely to occur between adjacent images captured from the camera rig 102. In some implementations, the interpolation module 114 can be configured to determine interpolated image fragments, image portions, and/or vertical or horizontal image strips between adjacent images. In some implementations, the interpolation module 114 can be configured to determine flow fields (and/or flow vectors) between related pixels in adjacent images. Flow fields can be used to compensate both for transformations that images have undergone and for the processing of images that have undergone transformations. For example, flow fields can be used to compensate for a transformation of a particular pixel grid of an obtained image. In some implementations, the interpolation module 114 can generate, by interpolating surrounding images, one or more images that are not part of the captured images, and can interleave the generated images into the captured images to generate additional virtual reality content for a scene. For example, the interpolation module 114 can provide stitching of 2D (flat) photo/video spheres by reconstructing views from virtual cameras located between the real (e.g., physical) cameras and selecting the center ray of each view to make up one virtual camera image from the center of the sphere.
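The role of a flow field in producing an interpolated view can be illustrated with a minimal sketch. It assumes that the flow field is available as one 2D vector per sampled pixel and that an intermediate view is approximated by moving each pixel a fraction `t` of the way along its flow vector. The function name `interpolate_positions` and the sample values are hypothetical and are not taken from this disclosure.

```python
def interpolate_positions(points, flow, t):
    """Move each pixel a fraction t (0..1) of the way along its flow vector.

    points: list of (x, y) pixel positions in the first image.
    flow:   list of (dx, dy) flow vectors pointing to the adjacent image.
    t:      0.0 reproduces the first image, 1.0 the adjacent image.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [(x + t * dx, y + t * dy) for (x, y), (dx, dy) in zip(points, flow)]


# Hypothetical sample: two pixels and their flow toward the adjacent camera.
points = [(10.0, 5.0), (20.0, 5.0)]
flow = [(4.0, 0.0), (2.0, -2.0)]
midway = interpolate_positions(points, flow, 0.5)
```

A complete interpolator would also resample colors and handle occlusions; this sketch covers only the geometric step.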

The optical flow module 116 can be configured to calculate dense optical flow between the cameras of each triad. For example, the module 116 can calculate three-way, pairwise optical flow between the pairs of cameras that form a triangle on the spherical camera rig. The optical flow module 116 can calculate optical flow between a first camera and a second camera, between the second camera and a third camera, and between the third camera and the first camera. Each pair of cameras used in the calculations can be considered a stereo pair. In some implementations, optical flow calculations can be performed between a pair of cameras in which the flow vectors may point in any direction, so that the vectors form a 2D quantity or arrangement. In some implementations, optical flow calculations can be performed in which the flow vectors are restricted to one dimension (e.g., a horizontal stereo pair in which flow is horizontal).
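The pairing described above can be sketched as follows. The sketch only enumerates the three stereo pairs of a triad and shows one way to restrict a flow field to a single dimension; it does not compute optical flow. The helper names `triad_pairs` and `restrict_to_horizontal` and the camera labels are hypothetical.

```python
def triad_pairs(triad):
    """Return the three stereo pairs (first-second, second-third, third-first)."""
    first, second, third = triad
    return [(first, second), (second, third), (third, first)]


def restrict_to_horizontal(flow):
    """Keep only the horizontal component of each (dx, dy) flow vector."""
    return [(dx, 0.0) for dx, _dy in flow]


pairs = triad_pairs(("cam_a", "cam_b", "cam_c"))
horizontal = restrict_to_horizontal([(1.5, 0.25), (-2.0, 0.5)])
```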

Using a spherically shaped camera rig (or another 3D-shaped rig described herein) with multiple triads of cameras around its surface, the optical flow module 116 can generate an accurate scene surrounding the rig. For example, the optical flow module 116 can calculate optical flow fields for particular captured image content and access the stitching module 118 to stitch together a monoscopic panorama for the scene. This may reduce artifacts in video content. Generating a monoscopic panorama may include presenting the same image to both eyes of a user, which may appear 2D to the user. In some implementations, the stitching module 118 can stitch together a stereoscopic panorama that provides a unique and different image to each eye of the user, and such images may appear 3D to the user. As used herein, 3D content may be considered stereoscopically presented content and can represent texture mapped onto a depth surface. Similarly, 2D content may be considered monoscopically presented content that represents texture mapped onto, for example, a flat or spherical surface.

In some implementations, the module 114 and the stitching module 118 can be used to generate a stereo spherical pair by taking non-center rays instead of center rays, in order to introduce a panoramic twist or, for example, to introduce 3D effects in selected directions. A panoramic twist includes capturing rays of light for a first eye (left eye) using rays deflected in a first direction, and capturing rays of light for a second eye (right eye) using rays deflected in the opposite direction.
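The deflection of rays in a panoramic twist can be illustrated with a top-down, two-dimensional sketch. It assumes a viewing azimuth measured in radians and a fixed deflection angle; the choice of which eye receives the positive deflection is an assumption of the sketch, as is the function name `twisted_rays`.

```python
import math


def twisted_rays(azimuth, deflection):
    """Return (left_ray, right_ray) as 2D unit vectors in the horizontal plane.

    The left-eye ray is rotated by +deflection and the right-eye ray by
    -deflection relative to the center ray at the given azimuth.
    """
    left = azimuth + deflection
    right = azimuth - deflection
    return ((math.cos(left), math.sin(left)),
            (math.cos(right), math.sin(right)))


# With zero deflection both eyes use the center ray (a monoscopic panorama).
center_left, center_right = twisted_rays(0.0, 0.0)
left_ray, right_ray = twisted_rays(0.0, math.radians(5.0))
```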

In general, the optical flow module 116 can use optical flow techniques to generate accurate mono-panoramas and stereo spherical panoramas (e.g., panoramic twist for omnidirectional stereo or mega stereo panoramas) by calculating optical flow between neighboring pairs of cameras in a spherical constellation of cameras. The constellation of cameras may be subject to a camera-arrangement constraint such that each point in space is visible to at least three cameras.
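The visibility constraint can be checked with a simple sketch. It assumes that each camera is described only by its optical axis, that all cameras share one field of view, and that the scene point is far enough away for camera positions on the rig to be ignored. These simplifications and the function name `visible_camera_count` are assumptions and are not part of this disclosure.

```python
import math


def visible_camera_count(direction, camera_axes, fov_degrees):
    """Count cameras whose field of view contains a distant point.

    direction:   3D vector from the rig center toward the point.
    camera_axes: list of 3D optical-axis vectors, one per camera.
    fov_degrees: full field of view shared by all cameras.
    """
    half_fov = math.radians(fov_degrees) / 2.0
    dir_norm = math.sqrt(sum(c * c for c in direction))
    count = 0
    for axis in camera_axes:
        axis_norm = math.sqrt(sum(c * c for c in axis))
        cos_angle = sum(a * b for a, b in zip(direction, axis)) / (dir_norm * axis_norm)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        if math.acos(cos_angle) <= half_fov:
            count += 1
    return count


# Hypothetical rig: six cameras looking along the coordinate axes, 150 degree FOV.
axes = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
diagonal_count = visible_camera_count((1, 1, 1), axes, 150.0)
on_axis_count = visible_camera_count((1, 0, 0), axes, 150.0)
```

In this hypothetical six-camera layout a diagonal direction is seen by three cameras while an on-axis direction is seen by only one, which shows why a denser constellation may be needed to satisfy the constraint everywhere.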

In some implementations, the camera rigs described herein can provide the advantage of reducing or removing artifacts (e.g., stitching errors/artifacts, discontinuities in objects on a camera boundary, missing data at a boundary, doubled image content near a boundary, torn objects, warped objects, removed content, etc.). Artifacts can be removed particularly well for video content that represents moving image content. Removal of such artifacts is possible based on the use of a spherical camera rig having triads of cameras that capture overlapping video/image content. The overlapping content can be used to correct stitching errors/artifacts by accessing the overlapped image areas captured by the cameras, performing optical flow techniques, and recalculating the image areas that are likely to have produced the artifacts/errors.

The systems and methods described herein can be used to generate stereo 3D content at any point capturable around a 3D spherically shaped camera rig (or other 3D-shaped camera rig). Such broadly captured content enables mathematical methods to strategically place stereo 3D effects/video content at any location within still or video content, while removing 3D or not providing 3D effects in other locations, in order to save streaming bandwidth, processing power, and/or storage space.

The depth map generator 120 can access optical flow data (e.g., flow fields) pertaining to images captured with the camera rig 102 and can use such flow data to calculate depth maps for the captured image content. For example, the depth map generator 120 can use image data from the many cameras on the rig 102 that point in a variety of directions. The depth map generator 120 can access and employ stereo matching algorithms to calculate a depth value for each pixel represented in the captured images. The views from the various cameras and the depth values can be combined into one spherical image that has R (red), G (green), B (blue), and depth values for each pixel. In a viewer, the depth map generator 120 can texture map the RGB image onto a surface in 3D space that is constructed by taking the depth value at every pixel, such that each point of the sphere has a radius equal to the depth value. This technique may differ from 3D spherical imagery techniques that typically use stereo pairs rather than depth values and/or depth maps.
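The construction of a surface whose radius equals the depth value at each pixel can be sketched as follows. The sketch assumes that the spherical image is stored in an equirectangular layout, which this disclosure does not specify, and the function name `pixel_to_point` is hypothetical.

```python
import math


def pixel_to_point(col, row, width, height, depth):
    """Place one pixel of a spherical RGB-D image in 3D space.

    Assumes an equirectangular layout: columns span longitude -pi..pi and
    rows span latitude +pi/2 (top) to -pi/2 (bottom). The returned point
    lies at a distance from the origin equal to the pixel's depth value.
    """
    lon = (col + 0.5) / width * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - (row + 0.5) / height * math.pi
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)


# Hypothetical 8 x 4 image: a pixel with a depth of 2.5 units.
point = pixel_to_point(col=5, row=1, width=8, height=4, depth=2.5)
radius = math.sqrt(sum(c * c for c in point))
```

Applying the function to every pixel yields the vertices of the depth surface; the RGB values of the same pixels supply the texture.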

In general, the depth map generator 120 generates a depth map to be transmitted with the spherical image. Transmitting a depth map with image content can provide the advantage of enabling a user to look in all directions and see 3D content, including at the poles (e.g., north above the user and south below the user). In addition, transmitting a depth map with image content can also enable the user to tilt her head and still see 3D effects. In one example, because depth information is transmitted with the image content, the user may be able to move a small distance from her nominal location (e.g., in the X, Y, and/or Z direction) and see objects move in the correct way with appropriate parallax. Movement of the user within the VR space can be referred to as actual movement, and the system 100 can track the user's position.

The calculated optical flow data (including light field transmission data) can be combined with spherical video data and transmitted to an HMD device (or other device) to generate left and right views for a user accessing the HMD device. In some implementations, the depth map generator 120 can provide separate and distinct spherical images and RGB data for each eye.

In some implementations, optical flow interpolation can be performed by a computer system in communication with the HMD device 106, and particular image content can be forwarded to the HMD device. In other implementations, the interpolation can be performed locally at the HMD device 106 in order to modify 3D image content for display. The flow data can be used to generate left and right views for the left and right eyes of a user accessing the HMD device 106. The interpolation can be performed at the HMD device 106 because the system 106 provides the combined data (e.g., spherical video data and optical flow data) at run time.

In some implementations, the 3D generator module 122 uses the optical flow data and depth map data to generate 3D regions within image content and to provide such 3D effects to a user in the VR space. The 3D effects can be triggered to be placed manually or automatically. For example, the 3D aspects of particular image content can be configured after capture, in post-processing, at a director's discretion. In particular, a director can determine that a scene in the VR space can be configured to provide a plane and helicopter sequence in which planes and helicopters are simulated to fly over the user's head in the VR space. The director may access a set of tools, including a 3D generator tool (not shown), to apply 3D effects to the video content. In this example, the director can determine that the user will likely look up at the sky upon hearing the noise of a plane or helicopter, and can adjust the video image content using the 3D generator tool to provide the planes and helicopters as 3D content. In such an example, the director can determine that other surrounding video content would be of little use to the user if provided as 3D content, because the user may be looking up at the sky until the helicopters and planes pass. Accordingly, the director can configure the video content so that the 3D effects move from the sky to another area in the video content when the sequence including the helicopters and planes is scheduled to end.

Manually selecting portions of image content in which to include 3D effects can be triggered, for example, by a VR movie director. The director may configure the image content based on the content itself or based on a desired user response. For example, the director may wish to focus the user's attention somewhere within the content, and can do so to provide useful access to data, artistic vision, or smooth transitions, to name just a few examples. The director can preconfigure 3D changes within the image content and adjust the time at which such changes are displayed to the user in the VR space.

Automatically selecting portions of image content in which to include 3D effects can include using user input to trigger the effects. For example, the system 100 can be used to trigger 3D effects to appear within image content based on a detected head tilt of a user accessing the content in the VR space. Other user movements, content changes, sensors, and location-based effects can be used as input to trigger a particular application or removal of 3D effects. In one example, a concert on a stage can be depicted in 3D in the VR space, while the crowd behind the user accessing the concert may be left in 2D because the user is unlikely to turn around during the concert. However, if the user chooses to turn around, the 3D effects can be shifted from the stage/concert image content to the audience image content.
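The concert example can be expressed as a small selection routine. The sketch assumes that head orientation is reported as a yaw angle in degrees and that each candidate region is described by a yaw interval; the region names, the intervals, and the function name `select_3d_region` are all hypothetical.

```python
def select_3d_region(yaw_degrees, regions):
    """Return the name of the region that should receive 3D effects.

    regions maps a name to a (start, end) yaw interval in degrees, measured
    in [0, 360). An interval whose start exceeds its end wraps through 0.
    """
    yaw = yaw_degrees % 360.0
    for name, (start, end) in regions.items():
        if start <= end:
            inside = start <= yaw < end
        else:
            inside = yaw >= start or yaw < end
        if inside:
            return name
    return None


# Hypothetical layout: the stage is in front of the user, the audience behind.
regions = {"stage": (270.0, 90.0), "audience": (90.0, 270.0)}
facing_front = select_3d_region(0.0, regions)
turned_around = select_3d_region(180.0, regions)
```

Regions that are not selected could remain 2D, in line with the bandwidth and processing savings described elsewhere in this disclosure.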

In the example system 100, the devices 106, 108, and 112 may be a laptop computer, a desktop computer, a mobile computing device, or a gaming console. In some implementations, the devices 106, 108, and 112 can be mobile computing devices that can be disposed (e.g., placed/located) within the HMD device 110. The mobile computing device can include a display device that can be used, for example, as the screen for the HMD device 110. The devices 106, 108, and 112 can include hardware and/or software for executing a VR application. In addition, the devices 106, 108, and 112 can include hardware and/or software that can recognize, monitor, and track 3D movement of the HMD device 110 when these devices are placed in front of, or held within a range of positions relative to, the HMD device 110. In some implementations, the devices 106, 108, and 112 can provide additional content to the HMD device 110 over the network 104. In some implementations, the devices 102, 106, 108, 110, and 112 can be connected to or interfaced with one or more of each other, either paired or connected through the network 104. The connection can be wired or wireless. The network 104 can be a public communications network or a private communications network.

The system 100 may include electronic storage. The electronic storage can include non-transitory storage media that electronically store information. The electronic storage may be configured to store captured images, obtained images, pre-processed images, post-processed images, and the like. Images captured with any of the disclosed camera rigs can be processed and stored as one or more streams of video, or stored as individual frames. In some implementations, storage can occur during capture and rendering can occur directly after portions of the capture, enabling access to panoramic stereo content earlier than if capture and processing were not concurrent.

FIG. 2 is a diagram depicting an example spherical camera rig 200 configured to capture images of a scene for use in generating 3D portions of video content. The camera rig 200 includes a number of cameras 202, 204, 206, 208, 210, 212, 214, 216, and 218. The cameras 202-218 are shown affixed to a spherically shaped rig. Additional cameras for other angles of the sphere are not depicted in FIG. 2 but are configured to collect image content from such other angles. The cameras 202-218 are arranged such that sets of three cameras can function together to capture image content for each point/area surrounding the sphere. Capturing each point/area includes capturing still images or video images of the scenes surrounding the rig 200. The cameras 202-218 can be placed against the sphere (or other shaped rig). In some implementations, the cameras 202-218 (and/or more or fewer cameras) can be placed at an angle to the sphere to capture additional image content.

In a non-limiting example, the cameras 202, 204, and 206 can be arranged to capture images of an area of the scenery surrounding the sphere. The captured images can be analyzed and combined (e.g., stitched) together to form a viewable scene for a user in the VR space. Similarly, images captured with the camera 204 can be combined with images captured with the cameras 202 and 208 to provide another area of the viewable scene. Images captured with the cameras 202, 208, and 210 can be combined in the same fashion, as can images captured with the cameras 206, 212, and 214. Wider spacing between cameras is also possible. For example, images captured with the cameras 210, 212, and 216 can be combined to provide image content for scenes (e.g., points) viewable from one half of a hemisphere of the rig 200. A similar combination can be made with the cameras 202, 212, and 218 to provide viewable images from another half of a hemisphere of the spherical rig 200. In some implementations, the diameter 220 of the camera rig 200 may be anywhere from about 0.15 meters to about 1.5 meters. In one non-limiting example, the diameter 220 is about 0.2 to about 0.9 meters. In another non-limiting example, the diameter 220 is about 0.5 to about 0.6 meters. In some implementations, the spacing between cameras can be from about 0.05 meters to about 0.6 meters. In one non-limiting example, the spacing between cameras is about 0.1 meters.

In some implementations, a constellation of cameras can be arranged on such a spherical camera rig (or other 3D-shaped rig) in a number of directions in order to capture each point in space. That is, each point in space may be captured by at least three cameras. In one example, a number of cameras can be arranged on the sphere as close together as possible (e.g., at each corner of an icosahedron, at each corner of a geodesic dome, etc.). A number of rig arrangements are described below. Each arrangement described in this disclosure can be configured with the aforementioned or other diameters and inter-camera distances.

Referring to FIG. 3, an icosahedral camera rig 300 is depicted. A number of cameras can be mounted on the camera rig 300. Cameras can be placed at the points of the triangles in the icosahedron, as illustrated by the cameras 302, 304, and 306. Alternatively or additionally, cameras can be placed in the centers of the triangles of the icosahedron, as illustrated by the cameras 308, 310, 312, and 314. The cameras 316, 318, 320, 322, and 324 are shown around the edges of the icosahedron. Additional cameras may be included around the icosahedron. The camera spacing and the diameter 326 may be configured similarly to the other camera rigs described throughout this disclosure. In some implementations, the cameras can be placed tangential to the camera rig. In other implementations, each camera can be placed at varying angles to the camera rig.
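Camera positions at the corners of an icosahedron can be computed with a short sketch. It uses the standard golden-ratio coordinates of a regular icosahedron scaled to a chosen rig diameter; the function names and the 0.5 meter diameter are illustrative assumptions, and no particular mounting hardware is implied.

```python
import math
from itertools import combinations


def icosahedron_vertices(diameter):
    """Return the 12 vertices of a regular icosahedron inscribed in a sphere."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0  # golden ratio
    raw = []
    for a in (-1.0, 1.0):
        for b in (-phi, phi):
            # Cyclic permutations of (0, +/-1, +/-phi).
            raw += [(0.0, a, b), (a, b, 0.0), (b, 0.0, a)]
    scale = (diameter / 2.0) / math.sqrt(1.0 + phi * phi)
    return [(x * scale, y * scale, z * scale) for x, y, z in raw]


def min_spacing(points):
    """Smallest straight-line distance between any two camera positions."""
    return min(math.dist(p, q) for p, q in combinations(points, 2))


vertices = icosahedron_vertices(0.5)  # rig diameter in meters (illustrative)
spacing = min_spacing(vertices)       # roughly 0.26 m for this diameter
```

For a 0.5 meter rig, corner-only placement gives a spacing of roughly 0.26 meters, so a spacing near 0.1 meters would call for additional cameras such as those at triangle centers or along edges.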

The camera rig 300 may be stationary and may be configured with cameras 302-324 that have a wide field of view. For example, the cameras 302-324 can capture a field of view of about 150 degrees to about 180 degrees. The cameras 302-324 may have fisheye lenses to capture wider views. In some implementations, adjacent cameras (e.g., 302 and 320) can function as a stereo pair, and a third camera 306 can be paired with each of the cameras 302 and 320 to produce a stereo triad of cameras in which optical flow can be calculated from images obtained from the cameras 302, 306, and 320. Similarly, among many other camera combinations not depicted in FIG. 3, the following sets of cameras can produce combinable images for generating 3D images: (cameras 302, 312, and 324), (cameras 302, 304, and 324), (cameras 304, 316, and 324), (cameras 302, 306, and 320), (cameras 304, 316, and 318), (cameras 304, 306, and 318), and (cameras 310, 312, and 314).

In some implementations, the camera rig 300 (and other cameras described herein) can be configured to capture images of a scene, such as the scene 330. The images may include portions of the scene 330, video of the scene 330, or panoramic video of the scene 330. In operation, the systems described herein can retrieve such captured images and process the content to display particular regions within the captured images in a three-dimensional format. For example, the systems described herein can determine a region, within a plurality of images captured with a plurality of cameras, in which to transform two-dimensional data into three-dimensional data. Example regions include the regions 332, 334, and 336. Such regions can be user-selected, director-selected, or automatically selected. In some implementations, the regions can be selected after the images have been captured and during display of the images in an HMD device. Other regions can be selected throughout the scene 330, and the regions 332, 334, and 336 represent example regions. The region 332 depicts a region captured by the rig 300 using the capture paths 338, 340, and 342.

Referring to FIG. 4, a hexagonal sphere camera rig 400 is depicted. A number of cameras can be mounted on the camera rig 400. Cameras can be placed at the points of the hexagons or along the sides of the hexagons, as illustrated by the cameras 402, 404, and 406. Alternatively or additionally, cameras can be placed in the centers of the hexagons. Additional cameras may be included around the hexagonal sphere camera rig 400. The camera spacing and the diameter 408 may be configured similarly to the other camera rigs described throughout this disclosure.

FIG. 5 is a flow chart diagramming one embodiment of a process 500 for providing areas of 3D image content to a user accessing a VR space. The process 500 can use captured images to retrieve and/or calculate image data, including but not limited to RGB data, and can use such data to calculate depth value data associated with a portion of the pixels in a region of an image. The system can use the image data to convert a two-dimensional version of the region into a three-dimensional version of the region, in order to provide the three-dimensional version for display in a head-mounted display device.

At block 502, the system 100 can determine a region, within a plurality of images captured with a plurality of cameras, in which to transform two-dimensional data into three-dimensional data. The plurality of images may be still images, video, portions of images, and/or portions of video. In some implementations, the plurality of images may include video image content captured with a number of cameras mounted on a spherically shaped camera rig. In some implementations, determining the region in which to transform two-dimensional data into three-dimensional data is performed automatically, based at least in part on user input detected at a head-mounted display such as the display device 106. The user input may include a head turn, a change in gaze direction, a hand gesture, a location change, and the like. In some implementations, determining the region can occur manually, based on a VR movie director's choice to provide a 3D region within particular video or images.

At block 504, the system 100 can calculate a depth value for a portion of the pixels in the region. In some implementations, the system 100 can calculate a depth value for each pixel in the region. Calculating a depth value may include comparing a number of views of the region captured by the plurality of cameras. For example, three images of the region 332 can be captured by three cameras (e.g., the cameras 304, 316, and 324) at different angles to the region 332. The system 100 can compare pixel intensity and location among the three images to determine the accuracy of pixel intensity. Using the comparisons, a depth value can be calculated for one or more pixels in the region 332. Other reference objects in the scene can be compared to verify the accuracy of pixel intensity.
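One common way to turn such a comparison into a depth value is block matching on a rectified stereo pair followed by triangulation. The sketch below substitutes that well-known technique for illustration; it is not asserted to be the method of this disclosure. It compares intensities along one scanline of two images and converts the best-matching disparity into depth with depth = focal length x baseline / disparity. The function names, the sample scanlines, and the 500-pixel focal length are assumptions; the 0.1 meter baseline mirrors the example camera spacing given above.

```python
def best_disparity(left_row, right_row, x, window, max_disparity):
    """Find the horizontal shift that best matches a window of intensities.

    Compares the window centered at x in left_row against windows centered
    at x - d in right_row, using the sum of absolute differences as the cost.
    """
    best, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - window < 0:
            break  # window would fall outside the right image
        cost = sum(abs(left_row[x + k] - right_row[x - d + k])
                   for k in range(-window, window + 1))
        if cost < best_cost:
            best, best_cost = d, cost
    return best


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Triangulate depth in meters; zero disparity is treated as infinitely far."""
    if disparity_px <= 0:
        return float("inf")
    return focal_px * baseline_m / disparity_px


# Hypothetical scanlines: the same feature appears 2 pixels further left
# in the right image than in the left image.
left_row = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
right_row = [0, 0, 0, 9, 5, 7, 0, 0, 0, 0]
disparity = best_disparity(left_row, right_row, x=6, window=1, max_disparity=4)
depth_m = depth_from_disparity(disparity, focal_px=500.0, baseline_m=0.1)
```

With three cameras, the same comparison could be repeated for each pair of the triad and the results cross-checked, which is one reading of comparing intensity and location among three images.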

At block 506, the system 100 can generate a spherical image. Generating the spherical image can include calculating a spherically formatted version of the image using the image data.

At block 508, the system 100 can construct, using the image data, a three-dimensional surface in the three-dimensional space of a computer graphics object generated by the image processing system. For example, the portion of pixels may be represented on the surface of the computer graphics object at a radius equal to the corresponding depth value associated with one or more of the portion of pixels in the region. The computer graphics object may be a sphere, an icosahedron, a triangle, or another polygon.

At block 510, the system 100 can generate, using the image data, a texture mapping to the surface of the computer graphics object. The texture mapping may include mapping the image data to the surface of the computer graphics object.
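A texture mapping of this kind can be sketched as a function that assigns texture coordinates to each vertex of the surface. The sketch assumes an equirectangular spherical image with normalized coordinates (u, v) in the range 0 to 1, which this disclosure does not specify, and the function name `direction_to_uv` is hypothetical.

```python
import math


def direction_to_uv(x, y, z):
    """Map a vertex position to texture coordinates in a spherical image.

    Only the direction of the vertex matters, so vertices that were pushed
    in or out according to their depth values keep the same (u, v).
    u runs along longitude and v runs from the top (0) to the bottom (1).
    """
    r = math.sqrt(x * x + y * y + z * z)
    lon = math.atan2(x, z)   # -pi..pi, zero along +z
    lat = math.asin(y / r)   # -pi/2..pi/2, positive upward
    u = (lon + math.pi) / (2.0 * math.pi)
    v = (math.pi / 2.0 - lat) / math.pi
    return (u, v)


forward_uv = direction_to_uv(0.0, 0.0, 1.0)   # center of the image
far_forward_uv = direction_to_uv(0.0, 0.0, 4.0)
up_uv = direction_to_uv(0.0, 1.0, 0.0)        # top edge of the image
```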

At block 512, the system 100 may transmit the spherical image and the texture mapping for display on a head-mounted display device. In some implementations, the process 500 may include generating an additional spherical image and texture mapping for the region, and generating a left-eye view by combining a portion of the image data with the spherical image. The process 500 may additionally include generating additional image data and generating a right-eye view by combining the additional image data with the additional spherical image. The process 500 may additionally include displaying the left-eye view and the right-eye view on the head-mounted display device. In some implementations, the image data includes depth value data and RGB data for at least some of the portion of pixels in the region.

The display may include 3D image content in the region. The method may also include generating an additional spherical image and texture mapping; generating a left-eye view by combining a portion of the depth values with the RGB data and the spherical image; generating additional depth values and generating a right-eye view by combining the additional depth values with updated RGB data and the additional spherical image; and displaying the left-eye view and the right-eye view on the head-mounted display device.

いくつかの実現化例では、ここに説明されるシステムは、任意の数のカメラを用いて画像を取得するように構成されてもよい。たとえば、カメラ４０２、４０４、および４０６（図４）を使用して、特定の画像を取込むことができる。ここに説明されるシステムは、取込画像のうちの１つ以上を使用して、頭部装着型ディスプレイデバイスに提供するための少なくとも２つの更新画像を生成し得る。更新画像は、２Ｄまたは３Ｄコンテンツを提供するように構成されてもよい。３Ｄコンテンツは、更新画像の部分または更新画像のすべてにおいて構成され得る。更新画像は、たとえばカメラ４０２、４０４、および４０６などの物理的カメラから取込まれた画像から生成された仮想カメラ視点を使用して生成されてもよい。視点は、画像の特定の領域において特定の３Ｄコンテンツを提供するために選択された１つ以上のオフセットに関していてもよい。 In some implementations, the systems described herein may be configured to acquire images using any number of cameras. For example, cameras 402, 404, and 406 (FIG. 4) can be used to capture specific images. The system described herein may use at least one of the captured images to generate at least two updated images for provision to a head mounted display device. The updated image may be configured to provide 2D or 3D content. The 3D content can be composed of parts of the updated image or all of the updated image. The updated image may be generated using a virtual camera viewpoint generated from images captured from physical cameras, such as cameras 402, 404, and 406, for example. The viewpoint may relate to one or more offsets selected to provide specific 3D content in a specific region of the image.

いくつかの実現化例では、更新画像は、特定のオフセットから生成された画像データを含む。たとえば、ある更新画像は、コンテンツにおける画素の一部が１つ以上のカメラ４０２、４０４、または４０６の左側に面するオフセットから取込まれた画像コンテンツを含んでいてもよい。別の更新画像は、コンテンツにおける画素の一部が１つ以上のカメラ４０２、４０４、または４０６の右側に面するオフセットから取込まれた画像コンテンツを含んでいてもよい。 In some implementations, the updated image includes image data generated from a specific offset. For example, an updated image may include image content in which some of the pixels in the content are captured from an offset that faces the left side of one or more cameras 402, 404, or 406. Another updated image may include image content in which some of the pixels in the content are captured from an offset that faces the right side of one or more cameras 402, 404, or 406.

In general, the updated images may include offset image content, virtual content, content from various camera angles, manipulated image content, and combinations thereof. In some implementations, the updated images may be generated by interpolating a viewpoint of at least one virtual camera. The interpolation may include sampling a plurality of pixels in the captured images, generating virtual content using optical flow, and adapting the virtual content to be placed within at least one of the updated images.

仮想カメラは、予め規定された中心線からの左側オフセットでコンテンツを取込み、予め規定された中心線からの右側オフセットでコンテンツを取込むように構成されてもよい。左側オフセットおよび右側オフセットは修正可能であり、また、頭部装着型ディスプレイにおける正確な表示のために画像を適合させるために機能的であってもよい。 The virtual camera may be configured to capture content with a left offset from a predefined centerline and capture content with a right offset from a predefined centerline. The left and right offsets can be modified and may be functional to adapt the image for accurate display on a head mounted display.

仮想カメラは、１つ以上の物理的カメラを用いて取込まれたコンテンツを利用し、コンテンツを補間された視点から提供されるよう適合させるように構成されてもよい。特に、仮想カメラは、１つ以上の物理的カメラによって生成された任意のオフセット（角度）を取込むように適合され得る。オフセットは視点を規定してもよい。オフセットは、物理的カメラの中心線から、または２つの物理的カメラ間に規定された中心線から規定されてもよい。コンテンツの補間は、いずれかの中心線からの任意のオフセットを有するコンテンツを生成するために調整可能であり、オフセットの量および方向は、ＨＭＤディスプレイデバイスにおいて提供される画像において３次元効果の正確な描写を保証するように選択され得る。 A virtual camera may be configured to utilize content captured using one or more physical cameras and to adapt the content to be provided from an interpolated viewpoint. In particular, the virtual camera may be adapted to capture any offset (angle) generated by one or more physical cameras. The offset may define the viewpoint. The offset may be defined from the center line of the physical camera or from the center line defined between the two physical cameras. The content interpolation can be adjusted to produce content with any offset from either centerline, and the amount and direction of the offset is accurate for the 3D effect in the images provided in the HMD display device. It can be selected to ensure depiction.
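The relationship between a centerline and the left and right offsets can be sketched as follows. This is a simplified illustration, not the interpolation of this specification: it treats each physical camera as a single heading angle in a horizontal plane, defines the centerline between two cameras as the mean of their headings, and assumes that headings increase to the right so that a left offset subtracts from the centerline. All names and conventions are hypothetical.

```python
import math


def center_heading(a_deg, b_deg):
    """Heading of the centerline defined between two camera headings (degrees).

    Averages unit vectors so that wrap-around at 360 degrees is handled.
    Undefined when the two headings are exactly opposite.
    """
    a, b = math.radians(a_deg), math.radians(b_deg)
    mean = math.atan2(math.sin(a) + math.sin(b), math.cos(a) + math.cos(b))
    return math.degrees(mean) % 360.0


def virtual_viewpoints(a_deg, b_deg, offset_deg):
    """Return (left, right) virtual camera headings offset from the centerline."""
    c = center_heading(a_deg, b_deg)
    return ((c - offset_deg) % 360.0, (c + offset_deg) % 360.0)
```

For example, cameras facing 30 and 90 degrees give a centerline at about 60 degrees, and an offset of 5 degrees gives virtual viewpoints at about 55 and 65 degrees. Changing `offset_deg` corresponds to the adjustable amount of offset described above.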

Upon generating the at least two updated images, the systems described herein may be configured to map a first image onto a first spherical surface to generate a first spherical image for provision to a left eyepiece of the head-mounted display. Similarly, the systems described herein may be configured to map a second image onto a second spherical surface to generate a second spherical image for provision to a right eyepiece of the head-mounted display. The mapping of the first image and the mapping of the second image may include applying a texture to the first image and the second image. Applying the texture may include assigning pixel coordinates from the first image to the first spherical surface and assigning pixel coordinates from the second image to the second spherical surface, as described in detail above.

図６は、ここに説明される手法を用いて使用され得る汎用コンピュータデバイス６００および汎用モバイルコンピュータデバイス６５０の例を示す。コンピューティングデバイス６００は、プロセッサ６０２と、メモリ６０４と、記憶装置６０６と、メモリ６０４および高速拡張ポート６１０に接続している高速インターフェイス６０８と、低速バス６１４および記憶装置６０６に接続している低速インターフェイス６１２とを含む。コンポーネント６０２、６０４、６０６、６０８、６１０、および６１２の各々は、さまざまなバスを使用して相互接続されており、共通のマザーボード上にまたは他の態様で適宜搭載されてもよい。プロセッサ６０２は、コンピューティングデバイス６００内で実行される命令を処理可能であり、これらの命令は、ＧＵＩのためのグラフィック情報を、高速インターフェイス６０８に結合されたディスプレイ６１６などの外部入出力デバイス上に表示するために、メモリ６０４内または記憶装置６０６上に格納された命令を含む。他の実現化例では、複数のプロセッサおよび／または複数のバスが、複数のメモリおよび複数のタイプのメモリとともに適宜使用されてもよい。加えて、複数のコンピューティングデバイス６００が接続されてもよく、各デバイスは（たとえば、サーババンク、ブレードサーバのグループ、またはマルチプロセッサシステムとして）必要な動作の部分を提供する。 FIG. 6 illustrates an example of a general purpose computing device 600 and a general purpose mobile computing device 650 that may be used with the techniques described herein. The computing device 600 includes a processor 602, a memory 604, a storage device 606, a high speed interface 608 connected to the memory 604 and a high speed expansion port 610, and a low speed bus 614 and a low speed interface connected to the storage device 606. 612. Each of the components 602, 604, 606, 608, 610, and 612 are interconnected using various buses and may be optionally mounted on a common motherboard or in other manners. The processor 602 is capable of processing instructions that are executed within the computing device 600, which instructions display graphics information for the GUI on an external input / output device such as a display 616 coupled to the high speed interface 608. Includes instructions stored in memory 604 or on storage device 606 for display. In other implementations, multiple processors and / or multiple buses may be used as appropriate with multiple memories and multiple types of memory. In addition, multiple computing devices 600 may be connected, each device providing a portion of the required operation (eg, as a server bank, a group of blade servers, or a multiprocessor system).

メモリ６０４は、情報をコンピューティングデバイス６００内に格納する。一実現化例では、メモリ６０４は１つまたは複数の揮発性メモリユニットである。別の実現化例では、メモリ６０４は１つまたは複数の不揮発性メモリユニットである。メモリ６０４はまた、磁気ディスクまたは光ディスクといった別の形態のコンピュータ読取可能媒体であってもよい。 Memory 604 stores information within computing device 600. In one implementation, the memory 604 is one or more volatile memory units. In another implementation, the memory 604 is one or more non-volatile memory units. The memory 604 may also be another form of computer readable media such as a magnetic disk or optical disk.

記憶装置６０６は、コンピューティングデバイス６００のための大容量記憶を提供可能である。一実現化例では、記憶装置６０６は、フロッピー（登録商標）ディスクデバイス、ハードディスクデバイス、光ディスクデバイス、またはテープデバイス、フラッシュメモリもしくは他の同様のソリッドステートメモリデバイス、または、ストレージエリアネットワークもしくは他の構成におけるデバイスを含むデバイスのアレイといった、コンピュータ読取可能媒体であってもよく、または当該コンピュータ読取可能媒体を含んでいてもよい。コンピュータプログラム製品が情報担体において有形に具体化され得る。コンピュータプログラム製品はまた、実行されると上述のような１つ以上の方法を行なう命令を含んでいてもよい。情報担体は、メモリ６０４、記憶装置６０６、またはプロセッサ６０２上のメモリといった、コンピュータ読取可能媒体または機械読取可能媒体である。 Storage device 606 can provide mass storage for computing device 600. In one implementation, the storage device 606 is a floppy disk device, hard disk device, optical disk device, or tape device, flash memory or other similar solid state memory device, or storage area network or other configuration. Or may be a computer readable medium, such as an array of devices including the device. A computer program product may be tangibly embodied in an information carrier. The computer program product may also include instructions that, when executed, perform one or more methods as described above. The information carrier is a computer-readable or machine-readable medium, such as memory 604, storage device 606, or memory on processor 602.

The high-speed controller 608 manages bandwidth-intensive operations for the computing device 600, while the low-speed controller 612 manages less bandwidth-intensive operations. Such an allocation of functions is exemplary only. In one implementation, the high-speed controller 608 is coupled to the memory 604, to the display 616 (e.g., through a graphics processor or accelerator), and to the high-speed expansion port 610, which may accept various expansion cards (not shown). In this implementation, the low-speed controller 612 is coupled to the storage device 606 and to the low-speed expansion port 614. The low-speed expansion port, which may include various communication ports (e.g., USB, Bluetooth®, Ethernet®, wireless Ethernet), may be coupled to one or more input/output devices, such as a keyboard, a pointing device, or a scanner, or to a networking device such as a switch or router, e.g., through a network adapter.

コンピューティングデバイス６００は、図に示すように多くの異なる形態で実現されてもよい。たとえばそれは、標準サーバ６２０として、またはそのようなサーバのグループで複数回実現されてもよい。それはまた、ラックサーバシステム６２４の一部として実現されてもよい。加えて、それは、ラップトップコンピュータ６２２などのパーソナルコンピュータにおいて実現されてもよい。これに代えて、コンピューティングデバイス６００からのコンポーネントは、デバイス６５０などのモバイルデバイス（図示せず）における他のコンポーネントと組合されてもよい。そのようなデバイスの各々は、コンピューティングデバイス６００、６５０のうちの１つ以上を含んでいてもよく、システム全体が、互いに通信する複数のコンピューティングデバイス６００、６５０で構成されてもよい。 The computing device 600 may be implemented in many different forms as shown. For example, it may be implemented multiple times as a standard server 620 or in a group of such servers. It may also be implemented as part of the rack server system 624. In addition, it may be implemented on a personal computer such as a laptop computer 622. Alternatively, components from computing device 600 may be combined with other components in a mobile device (not shown), such as device 650. Each such device may include one or more of the computing devices 600, 650, and the entire system may be comprised of multiple computing devices 600, 650 communicating with each other.

コンピューティングデバイス６５０は、数あるコンポーネントの中でも特に、プロセッサ６５２と、メモリ６６４と、ディスプレイ６５４などの入出力デバイスと、通信インターフェイス６６６と、トランシーバ６６８とを含む。デバイス６５０にはまた、追加の格納を提供するために、マイクロドライブまたは他のデバイスなどの記憶装置が設けられてもよい。コンポーネント６５０、６５２、６６４、６５４、６６６、および６６８の各々は、さまざまなバスを使用して相互接続されており、当該コンポーネントのうちのいくつかは、共通のマザーボード上にまたは他の態様で適宜搭載されてもよい。 Computing device 650 includes, among other components, processor 652, memory 664, input / output devices such as display 654, communication interface 666, and transceiver 668. Device 650 may also be provided with a storage device, such as a microdrive or other device, to provide additional storage. Each of the components 650, 652, 664, 654, 666, and 668 are interconnected using various buses, some of which are optionally on a common motherboard or otherwise. It may be mounted.

プロセッサ６５２は、メモリ６６４に格納された命令を含む、コンピューティングデバイス６５０内の命令を実行可能である。プロセッサは、別個の複数のアナログおよびデジタルプロセッサを含むチップのチップセットとして実現されてもよい。プロセッサは、たとえば、ユーザインターフェイス、デバイス６５０が実行するアプリケーション、およびデバイス６５０による無線通信の制御といった、デバイス６５０の他のコンポーネント同士の連携を提供してもよい。 The processor 652 can execute instructions within the computing device 650, including instructions stored in the memory 664. The processor may be implemented as a chip set of chips that include separate analog and digital processors. The processor may provide coordination between other components of the device 650, such as, for example, a user interface, applications executed by the device 650, and control of wireless communication by the device 650.

プロセッサ６５２は、ディスプレイ６５４に結合された制御インターフェイス６５８およびディスプレイインターフェイス６５６を介してユーザと通信してもよい。ディスプレイ６５４は、たとえば、ＴＦＴＬＣＤ（Thin-Film-Transistor Liquid Crystal Display：薄膜トランジスタ液晶ディスプレイ）、またはＯＬＥＤ（Organic Light Emitting Diode：有機発光ダイオード）ディスプレイ、または他の適切なディスプレイ技術であってもよい。ディスプレイインターフェイス６５６は、ディスプレイ６５４を駆動してグラフィカル情報および他の情報をユーザに提示するための適切な回路を含んでいてもよい。制御インターフェイス６５８は、ユーザからコマンドを受信し、それらをプロセッサ６５２に送出するために変換してもよい。加えて、デバイス６５０と他のデバイスとの近接エリア通信を可能にするように、外部インターフェイス６６２がプロセッサ６５２と通信した状態で設けられてもよい。外部インターフェイス６６２は、たとえば、ある実現化例では有線通信を提供し、他の実現化例では無線通信を提供してもよく、複数のインターフェイスも使用されてもよい。 The processor 652 may communicate with the user via a control interface 658 and a display interface 656 coupled to the display 654. The display 654 may be, for example, a TFT LCD (Thin-Film-Transistor Liquid Crystal Display), or an OLED (Organic Light Emitting Diode) display, or other suitable display technology. Display interface 656 may include appropriate circuitry for driving display 654 to present graphical information and other information to the user. The control interface 658 may receive commands from the user and convert them for delivery to the processor 652. In addition, an external interface 662 may be provided in communication with the processor 652 to allow near area communication between the device 650 and other devices. The external interface 662 may provide, for example, wired communication in some implementations, wireless communication in other implementations, and multiple interfaces may also be used.

メモリ６６４は、情報をコンピューティングデバイス６５０内に格納する。メモリ６６４は、１つまたは複数のコンピュータ読取可能媒体、１つまたは複数の揮発性メモリユニット、もしくは、１つまたは複数の不揮発性メモリユニットの１つ以上として実現され得る。拡張メモリ６７４も設けられ、拡張インターフェイス６７２を介してデバイス６５０に接続されてもよく、拡張インターフェイス６７２は、たとえばＳＩＭＭ（Single In Line Memory Module）カードインターフェイスを含んでいてもよい。そのような拡張メモリ６７４は、デバイス６５０に余分の格納スペースを提供してもよく、もしくは、デバイス６５０のためのアプリケーションまたは他の情報も格納してもよい。具体的には、拡張メモリ６７４は、上述のプロセスを実行または補足するための命令を含んでいてもよく、安全な情報も含んでいてもよい。このため、たとえば、拡張メモリ６７４はデバイス６５０のためのセキュリティモジュールとして設けられてもよく、デバイス６５０の安全な使用を許可する命令でプログラミングされてもよい。加えて、ハッキング不可能な態様でＳＩＭＭカード上に識別情報を乗せるといったように、安全なアプリケーションが追加情報とともにＳＩＭＭカードを介して提供されてもよい。 Memory 664 stores information within computing device 650. Memory 664 may be implemented as one or more of one or more computer readable media, one or more volatile memory units, or one or more non-volatile memory units. An expansion memory 674 is also provided and may be connected to the device 650 via the expansion interface 672, and the expansion interface 672 may include, for example, a SIMM (Single In Line Memory Module) card interface. Such an extended memory 674 may provide extra storage space for the device 650 or may also store applications or other information for the device 650. Specifically, the expanded memory 674 may include instructions for performing or supplementing the above-described process, and may also include secure information. Thus, for example, the expansion memory 674 may be provided as a security module for the device 650 and may be programmed with instructions that allow the device 650 to be used safely. In addition, a secure application may be provided via the SIMM card with additional information, such as placing identification information on the SIMM card in a non-hackable manner.

メモリはたとえば、以下に説明されるようなフラッシュメモリおよび／またはＮＶＲＡＭメモリを含んでいてもよい。一実現化例では、コンピュータプログラム製品が情報担体において有形に具体化される。コンピュータプログラム製品は、実行されると上述のような１つ以上の方法を行なう命令を含む。情報担体は、メモリ６６４、拡張メモリ６７４、またはプロセッサ６５２上のメモリといった、コンピュータ読取可能媒体または機械読取可能媒体であり、たとえばトランシーバ６６８または外部インターフェイス６６２を通して受信され得る。 The memory may include, for example, flash memory and / or NVRAM memory as described below. In one implementation, the computer program product is tangibly embodied in an information carrier. The computer program product includes instructions that, when executed, perform one or more methods as described above. The information carrier is a computer-readable or machine-readable medium, such as memory 664, expansion memory 674, or memory on processor 652, and may be received through transceiver 668 or external interface 662, for example.

デバイス６５０は、必要に応じてデジタル信号処理回路を含み得る通信インターフェイス６６６を介して無線通信してもよい。通信インターフェイス６６６は、とりわけ、ＧＳＭ（登録商標）音声通話、ＳＭＳ、ＥＭＳ、またはＭＭＳメッセージング、ＣＤＭＡ、ＴＤＭＡ、ＰＤＣ、ＷＣＤＭＡ（登録商標）、ＣＤＭＡ２０００、またはＧＰＲＳといった、さまざまなモードまたはプロトコル下での通信を提供してもよい。そのような通信は、たとえば無線周波数トランシーバ６６８を介して生じてもよい。加えて、ブルートゥース、Ｗｉ−Ｆｉ、または他のそのようなトランシーバ（図示せず）などを使用して、短距離通信が生じてもよい。加えて、ＧＰＳ（Global Positioning System：全地球測位システム）レシーバモジュール６７０が、追加のナビゲーション関連および位置関連無線データをデバイス６５０に提供してもよく、当該データは、デバイス６５０上で実行されるアプリケーションによって適宜使用されてもよい。 Device 650 may communicate wirelessly via communication interface 666, which may include digital signal processing circuitry as required. Communication interface 666 communicates under various modes or protocols such as, among others, GSM® voice calls, SMS, EMS, or MMS messaging, CDMA, TDMA, PDC, WCDMA®, CDMA2000, or GPRS. May be provided. Such communication may occur via radio frequency transceiver 668, for example. In addition, short range communications may occur using Bluetooth, Wi-Fi, or other such transceivers (not shown). In addition, a GPS (Global Positioning System) receiver module 670 may provide additional navigation-related and position-related wireless data to the device 650, which data is applied to the application running on the device 650. May be used as appropriate.

デバイス６５０はまた、ユーザから口頭情報を受信してそれを使用可能なデジタル情報に変換し得る音声コーデック６６０を使用して、音声通信してもよい。音声コーデック６６０はまた、たとえばデバイス６５０のハンドセットにおいて、スピーカを介すなどして、ユーザに聞こえる音を生成してもよい。そのような音は、音声電話からの音を含んでいてもよく、録音された音（たとえば、音声メッセージ、音楽ファイルなど）を含んでいてもよく、デバイス６５０上で動作するアプリケーションが生成する音も含んでいてもよい。 Device 650 may also communicate in voice using an audio codec 660 that can receive verbal information from the user and convert it to usable digital information. Audio codec 660 may also generate sounds audible to the user, such as via a speaker, for example, in the handset of device 650. Such sounds may include sounds from voice calls, may include recorded sounds (eg, voice messages, music files, etc.), and sound generated by applications running on device 650. May also be included.

コンピューティングデバイス６５０は、図に示すように多くの異なる形態で実現されてもよい。たとえばそれは、携帯電話６８０として実現されてもよい。それはまた、スマートフォン６８２、携帯情報端末、または他の同様のモバイルデバイスの一部として実現されてもよい。 The computing device 650 may be implemented in many different forms as shown in the figure. For example, it may be implemented as a mobile phone 680. It may also be implemented as part of a smart phone 682, a personal digital assistant, or other similar mobile device.

Various implementations of the systems and techniques described herein can be realized in digital electronic circuitry, integrated circuitry, specially designed ASICs (application specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.

These computer programs (also known as programs, software, software applications, or code) include machine instructions for a programmable processor and can be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.

To provide for interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer. Other kinds of devices can also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form, including acoustic, speech, or tactile input.

The systems and techniques described herein can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described herein), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.

コンピューティングシステムは、クライアントおよびサーバを含み得る。クライアントおよびサーバは一般に互いにリモートであり、典型的には通信ネットワークを介してやりとりする。クライアントとサーバとの関係は、それぞれのコンピュータ上で実行されて互いにクライアント−サーバ関係を有するコンピュータプログラムによって生じる。 The computing system can include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship between the client and the server is caused by computer programs that are executed on the respective computers and have a client-server relationship with each other.

In some implementations, the computing devices depicted in FIG. 6 can include sensors that interface with a virtual reality headset (VR headset 690). For example, one or more sensors included on the computing device 650 or another computing device depicted in FIG. 6 can provide input to the VR headset 690 or, in general, provide input to a VR space. The sensors can include, but are not limited to, a touchscreen, accelerometers, gyroscopes, pressure sensors, biometric sensors, temperature sensors, humidity sensors, and ambient light sensors. The computing device 650 can use the sensors to determine an absolute position and/or a detected rotation of the computing device in the VR space, which can then be used as input to the VR space. For example, the computing device 650 may be incorporated into the VR space as a virtual object, such as a controller, a laser pointer, a keyboard, or a weapon. Positioning of the computing device/virtual object by the user when incorporated into the VR space can allow the user to position the computing device so as to view the virtual object in certain manners in the VR space. For example, if the virtual object represents a laser pointer, the user can manipulate the computing device as if it were an actual laser pointer. The user can move the computing device left and right, up and down, in a circle, and so on, and use the device in a manner similar to using a laser pointer.
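The laser pointer example can be made concrete with a small sketch of how a detected rotation might be turned into a pointing direction in the VR space. This is a hypothetical illustration: the specification does not state how rotation is represented, so the sketch assumes yaw and pitch angles in degrees derived from the gyroscope and accelerometer, and the function name and axis convention are assumptions.

```python
import math


def pointer_direction(yaw_deg, pitch_deg):
    """Unit vector along which a virtual laser pointer would aim.

    Assumed convention: yaw 0 / pitch 0 aims along +z, positive yaw turns
    toward +x (right), positive pitch tilts toward +y (up).
    """
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (
        math.cos(pitch) * math.sin(yaw),
        math.sin(pitch),
        math.cos(pitch) * math.cos(yaw),
    )
```

Moving the device left and right changes the yaw and moving it up and down changes the pitch, so the rendered pointer follows the motion of the device.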

In some implementations, one or more input devices included on, or connected to, the computing device 650 can be used as input to the VR space. The input devices can include, but are not limited to, a touchscreen, a keyboard, one or more buttons, a trackpad, a touchpad, a pointing device, a mouse, a trackball, a joystick, a camera, a microphone, earphones or earbuds with input functionality, a gaming controller, or another connectable input device. A user interacting with an input device included on the computing device 650 when the computing device is incorporated into the VR space can cause a particular action to occur in the VR space.

いくつかの実現化例では、コンピューティングデバイス６５０のタッチスクリーンは、ＶＲ空間においてタッチパッドとしてレンダリングされ得る。ユーザは、コンピューティングデバイス６５０のタッチスクリーンとやりとりすることができる。やりとりは、たとえばＶＲヘッドセット６９０において、ＶＲ空間におけるレンダリングされたタッチパッド上の動きとしてレンダリングされる。レンダリングされた動きは、ＶＲ空間においてオブジェクトを制御することができる。 In some implementations, the touch screen of computing device 650 may be rendered as a touchpad in VR space. A user can interact with the touch screen of computing device 650. The interaction is rendered as movement on the rendered touchpad in VR space, for example in VR headset 690. The rendered movement can control the object in VR space.

In some implementations, one or more output devices included on the computing device 650 can provide output and/or feedback to a user of the VR headset 690 in the VR space. The output and feedback can be visual, tactile, or audio. The output and/or feedback can include, but is not limited to, vibrations, turning one or more lights or strobes on and off or blinking and/or flashing them, sounding an alarm, playing a chime, playing a song, and playing an audio file. The output devices can include, but are not limited to, vibration motors, vibration coils, piezoelectric devices, electrostatic devices, light emitting diodes (LEDs), strobes, and speakers.

In some implementations, the computing device 650 may appear as another object in a computer-generated 3D environment. Interactions by the user with the computing device 650 (e.g., rotating or shaking the device, touching the touchscreen, swiping a finger across the touchscreen) can be interpreted as interactions with the object in the VR space. In the example of the laser pointer in a VR space, the computing device 650 appears as a virtual laser pointer in the computer-generated 3D environment. As the user manipulates the computing device 650, the user sees the movement of the laser pointer in the VR space. The user receives feedback from interactions with the computing device 650 in the VR space on the computing device 650 or on the VR headset 690.

In some implementations, one or more input devices in addition to the computing device (e.g., a mouse, a keyboard) can be rendered in a computer-generated 3D environment. The rendered input devices (e.g., the rendered mouse, the rendered keyboard) can be used, as rendered in the VR space, to control objects in the VR space.

Computing device 600 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Computing device 650 is intended to represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the inventions described and/or claimed in this document.

A number of embodiments have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the specification.

In addition, the logic flows depicted in the figures do not require the particular order shown, or sequential order, to achieve desirable results. In addition, other steps may be provided, or steps may be eliminated, from the described flows, and other components may be added to, or removed from, the described systems. Accordingly, other embodiments are within the scope of the following claims.

In the following examples, further implementations are summarized.
Example 1: A computer-implemented method comprising: determining a region, within a plurality of images captured with a plurality of cameras, in which to convert two-dimensional data into three-dimensional data; calculating a depth value for a portion of pixels in the region; generating a spherical image that includes image data for the portion of pixels in the region; constructing, using the image data, a three-dimensional surface in a three-dimensional space of a computer graphics object generated by an image processing system; generating, using the image data, a texture mapping to a surface of the computer graphics object, the texture mapping including mapping the image data to the surface of the computer graphics object; and transmitting the spherical image and the texture mapping for display in a head-mounted display device.

Example 2: The method of Example 1, wherein the portion of pixels is represented on the surface of the computer graphics object with a radius equal to a corresponding depth value associated with one or more of the portion of pixels in the region.
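By way of non-limiting illustration, the representation described in Example 2 can be sketched as follows. The sketch assumes that the spherical image is stored in an equirectangular layout and that the x axis points right, the y axis points up, and the z axis points forward; neither assumption is stated in this document. The function name `pixel_to_point` and its parameters are hypothetical.

```python
import math

def pixel_to_point(u, v, width, height, depth):
    """Place pixel (u, v) of an equirectangular image at distance `depth`.

    Assumption (not from the specification): u spans longitude and
    v spans latitude, with the image center looking along +z.
    """
    lon = (u / width) * 2.0 * math.pi - math.pi    # longitude in [-pi, pi)
    lat = math.pi / 2.0 - (v / height) * math.pi   # latitude in [-pi/2, pi/2]
    # The unit viewing direction is scaled by the depth value, so the depth
    # value acts as the radius at which the pixel sits on the surface.
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.sin(lat)
    z = depth * math.cos(lat) * math.cos(lon)
    return (x, y, z)
```

Under these assumptions, every returned point lies at a distance from the origin equal to the depth value supplied for that pixel, which is one possible reading of "a radius equal to a corresponding depth value".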

Example 3: The method of Example 1 or Example 2, further comprising: generating an additional spherical image and texture mapping associated with the region; generating a left eye view by combining a portion of the image data and the spherical image; generating additional image data and generating a right eye view by combining the additional image data and the additional spherical image; and displaying the left eye view and the right eye view in the head-mounted display device, wherein the image data includes depth value data and RGB data for at least some of the portion of pixels in the region.

Example 4: The method of any one of Examples 1 to 3, wherein the plurality of images includes video content and the image data includes RGB data and depth value data associated with the portion of pixels, the method further comprising: converting, using the image data, a two-dimensional version of the region into a three-dimensional version of the region; and providing the three-dimensional version of the region for display in the head-mounted display device.

Example 5: The method of any one of Examples 1 to 4, wherein the plurality of images is captured with a plurality of cameras mounted on a spherically shaped camera rig.

Example 6: The method of any one of Examples 1 to 5, wherein determining the region in which to convert two-dimensional data into three-dimensional data is performed automatically based at least in part on user input detected at the head-mounted display.

Example 7: The method of Example 6, wherein the user input includes a head turn, and the three-dimensional data is used to generate a three-dimensional portion in at least one of the plurality of images corresponding to a view.

Example 8: The method of Example 6 or Example 7, wherein the user input includes a change in gaze direction, and the three-dimensional data is used to generate a three-dimensional portion in at least one of the plurality of images in a line of sight of the user.
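By way of non-limiting illustration, the selection described in Examples 6 to 8 can be sketched as follows. The sketch assumes that each captured image is tagged with the direction of its center and that an image counts as being in the user's line of sight when that direction lies within a fixed angle of the gaze direction. The function names and the 30 degree default are hypothetical and are not taken from this document.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def images_in_line_of_sight(gaze, image_dirs, max_angle_deg=30.0):
    """Return indices of images whose center direction is near the gaze.

    Assumption: `image_dirs` holds one viewing direction per image, and
    `max_angle_deg` is an illustrative threshold.
    """
    g = normalize(gaze)
    cos_limit = math.cos(math.radians(max_angle_deg))
    selected = []
    for index, direction in enumerate(image_dirs):
        d = normalize(direction)
        # The dot product of unit vectors is the cosine of the angle between them.
        cos_angle = sum(a * b for a, b in zip(g, d))
        if cos_angle >= cos_limit:
            selected.append(index)
    return selected
```

The returned indices identify candidate images in which a three-dimensional portion could be generated; a head turn could be handled the same way by passing the head's forward direction as `gaze`.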

Example 9: A computer-implemented system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform operations including: determining a region, within a plurality of images captured with a plurality of cameras, in which to convert two-dimensional data into three-dimensional data; calculating a depth value for a portion of pixels in the region; generating a spherical image that includes image data for the portion of pixels in the region; constructing, using the image data, a three-dimensional surface in a three-dimensional space of a computer graphics object generated by an image processing system; generating, using the image data, a texture mapping to a surface of the computer graphics object, the texture mapping including mapping the image data to the surface of the computer graphics object; and transmitting the spherical image and the texture mapping for display in a head-mounted display device.

Example 10: The system of Example 9, wherein the operations further include: generating an additional spherical image and texture mapping; generating a left eye view by combining a portion of the image data and the spherical image; generating additional image data and generating a right eye view by combining the additional image data and the additional spherical image; and displaying the left eye view and the right eye view in the head-mounted display device, wherein the image data includes depth value data and RGB data for at least some of the portion of pixels in the region.

Example 11: The system of Example 9 or Example 10, wherein the plurality of images includes video content and the image data includes RGB data and depth value data associated with the portion of pixels, the operations further including: converting, using the image data, a two-dimensional version of the region into a three-dimensional version of the region; and providing the three-dimensional version of the region for display in the head-mounted display device.

Example 12: The system of any one of Examples 9 to 11, wherein the plurality of images is captured with a plurality of cameras mounted on a spherically shaped camera rig.

Example 13: The system of any one of Examples 9 to 12, wherein determining the region in which to convert two-dimensional data into three-dimensional data is performed automatically based at least in part on user input detected at the head-mounted display.

Example 14: The system of Example 13, wherein the user input includes a change in gaze direction, and the three-dimensional data is used to generate a three-dimensional portion in at least one of the plurality of images in a line of sight of the user.

Example 15: A computer-implemented method comprising: obtaining a plurality of images with a plurality of cameras; generating at least two updated images for the plurality of images, the at least two updated images being generated by interpolating a viewpoint for at least one virtual camera configured to capture content at a left offset from a predefined centerline and to capture content at a right offset from the predefined centerline; mapping a first image in the at least two updated images to a first spherical surface to generate a first spherical image for provision to a left eyepiece of a head-mounted display; mapping a second image in the at least two updated images to a second spherical surface to generate a second spherical image for provision to a right eyepiece of the head-mounted display; and displaying the first spherical image in the left eyepiece of the head-mounted display and displaying the second spherical image in the right eyepiece of the head-mounted display.

Example 16: The method of Example 15, wherein the at least one virtual camera is configured to use content captured with one or more physical cameras and to adapt the content to be provided from the viewpoint.

Example 17: The method of Example 15 or Example 16, wherein the mapping of the first image includes applying a texture to the first image by assigning pixel coordinates from the first image to the first spherical surface, and the mapping of the second image includes applying a texture to the second image by assigning pixel coordinates from the second image to the second spherical surface.

Example 18: The method of any one of Examples 15 to 17, wherein interpolating the viewpoint includes sampling a plurality of pixels in the plurality of images, generating virtual content using optical flow, and placing the virtual content within at least one of the at least two updated images.

Example 19: The method of Example 18, wherein the at least two spherical images include an RGB image having at least a portion of the plurality of pixels included in the content captured at the left offset and an RGB image having at least a portion of the plurality of pixels included in the content captured at the right offset.

Example 20: The method of any one of Examples 15 to 19, wherein the left offset and the right offset are modifiable and are functional to adapt a display accuracy of the first image and the second image in the head-mounted display.
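By way of non-limiting illustration, the left and right offsets of Examples 15 and 20 can be sketched as follows. The sketch assumes that the predefined centerline is the forward viewing direction through a center point, that the offsets are measured perpendicular to that direction in the horizontal plane, and that the x axis points right, the y axis points up, and the z axis points forward. The function name `virtual_camera_positions` and its parameters are hypothetical; the interpolation of image content for each virtual camera (for example, by optical flow) is not shown.

```python
import math

def virtual_camera_positions(center, yaw_rad, left_offset, right_offset):
    """Return (left, right) virtual camera positions about a centerline.

    Assumptions (not from the specification): yaw rotates about the y axis,
    and yaw 0 looks along +z with +x to the viewer's right.
    """
    # Unit vector pointing to the viewer's right for the given yaw.
    right_axis = (math.cos(yaw_rad), 0.0, -math.sin(yaw_rad))
    left_pos = tuple(c - left_offset * r for c, r in zip(center, right_axis))
    right_pos = tuple(c + right_offset * r for c, r in zip(center, right_axis))
    return left_pos, right_pos
```

Because the two offsets are separate parameters, they can be modified independently, for instance to match a particular viewer's interpupillary distance, which is one way the offsets could be used to adapt how the first and second images are displayed.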

Claims

A computer-implemented method comprising:
Determining a region, within a plurality of images captured using a plurality of cameras, in which to convert two-dimensional data to three-dimensional data;
Calculating a depth value for a portion of the pixels in the region;
Generating a spherical image including image data for the portion of pixels in the region;
Using the image data to construct a three-dimensional surface in a three-dimensional space of a computer graphics object generated by an image processing system;
Generating a texture mapping to a surface of the computer graphics object using the image data, the texture mapping comprising mapping the image data to the surface of the computer graphics object; and
Transmitting the spherical image and the texture mapping for display on a head-mounted display device.

The method of claim 1, wherein the portion of pixels is represented on the surface of the computer graphics object with a radius equal to a corresponding depth value associated with one or more of the portion of pixels in the region.

The method of claim 1, further comprising:
Generating an additional spherical image and texture mapping associated with the region;
Generating a left eye view by combining a portion of the image data and the spherical image;
Generating a right eye view by generating additional image data and combining the additional image data and the additional spherical image; and
Displaying the left eye view and the right eye view on the head-mounted display device,
wherein the image data includes depth value data and RGB data for at least some of the portion of pixels in the region.

The plurality of images include video content, the image data includes RGB data and depth value data associated with the portion of pixels, and the method further includes:
Using the image data to convert a two-dimensional version of the region into a three-dimensional version of the region;
Providing the three-dimensional version of the region for display on the head-mounted display device.

The method of claim 1, wherein the plurality of images are captured using a plurality of cameras mounted on a spherically shaped camera rig.

The method of claim 1, wherein the step of determining the region in which to convert two-dimensional data to three-dimensional data is performed automatically based at least in part on user input detected on the head-mounted display.

The method of claim 6, wherein the user input includes head rotation and the three-dimensional data is used to generate a three-dimensional portion in at least one of the plurality of images corresponding to a view.

The method, wherein the user input includes a change in gaze direction, and the three-dimensional data is used to generate a three-dimensional portion in at least one of the plurality of images on a user's line of sight.

A computer-implemented system comprising:
At least one processor; and
A memory storing instructions that, when executed by the at least one processor, cause the system to perform a plurality of operations, the plurality of operations comprising:
Determining a region, within a plurality of images captured using a plurality of cameras, in which to convert two-dimensional data to three-dimensional data;
Calculating a depth value for a portion of the pixels in the region;
Generating a spherical image including image data for the portion of the pixels in the region;
Using the image data to construct a three-dimensional surface in a three-dimensional space of a computer graphics object generated by an image processing system;
Generating a texture mapping to a surface of the computer graphics object using the image data, the texture mapping comprising mapping the image data to the surface of the computer graphics object; and
Transmitting the spherical image and the texture mapping for display on a head-mounted display device.

The system of claim 9, wherein the plurality of operations further comprises:
Generating an additional spherical image and texture mapping;
Generating a left eye view by combining a portion of the image data and the spherical image;
Generating a right eye view by generating additional image data and combining the additional image data and the additional spherical image; and
Displaying the left eye view and the right eye view on the head-mounted display device,
wherein the image data includes depth value data and RGB data for at least some of the portion of pixels in the region.

The plurality of images include video content, the image data includes RGB data and depth value data associated with the portion of pixels, and the system further includes:
Using the image data to convert a two-dimensional version of the region to a three-dimensional version of the region;
Providing the three-dimensional version of the region for display on the head-mounted display device.

The system of claim 9, wherein the plurality of images are captured using a plurality of cameras mounted on a spherically shaped camera rig.

The system of claim 9, wherein determining the region in which to convert two-dimensional data to three-dimensional data is performed automatically based at least in part on user input detected on the head-mounted display.

The system, wherein the user input includes a change in gaze direction, and the three-dimensional data is used to generate a three-dimensional portion in at least one of the plurality of images on a user's line of sight.

A computer-implemented method comprising:
Acquiring a plurality of images using a plurality of cameras;
Generating at least two updated images for the plurality of images, wherein the at least two updated images are generated by interpolating a viewpoint for at least one virtual camera configured to capture content at a left offset from a predefined centerline and to capture content at a right offset from the predefined centerline;
Mapping a first image in the at least two updated images to a first sphere to generate a first spherical image for provision to a left eyepiece of a head-mounted display;
Mapping a second image in the at least two updated images to a second sphere to generate a second spherical image for provision to a right eyepiece of the head-mounted display; and
Displaying the first spherical image on the left eyepiece of the head-mounted display and displaying the second spherical image on the right eyepiece of the head-mounted display.

The method, wherein the at least one virtual camera is configured to use content captured using one or more physical cameras and to adapt the content to be provided from the viewpoint.

The method of claim 15, wherein mapping the first image comprises applying a texture to the first image by assigning pixel coordinates from the first image to the first sphere, and
mapping the second image comprises applying a texture to the second image by assigning pixel coordinates from the second image to the second sphere.

The method of claim 15, wherein interpolating the viewpoint includes sampling a plurality of pixels in the plurality of images, generating virtual content using optical flow, and placing the virtual content within at least one of the at least two updated images.

The method of claim 18, wherein the at least two spherical images include an RGB image having at least a portion of the plurality of pixels included in the content captured at the left offset and an RGB image having at least a portion of the plurality of pixels included in the content captured at the right offset.

The method, wherein the left offset and the right offset are modifiable and are functional to adapt display accuracy of the first image and the second image on the head-mounted display.