JP2022509731A - Cross reality system

Info

Publication number: JP2022509731A
Application number: JP2021518528A
Authority: JP
Inventors: アヌシュモハン，; ロバートブレイクテイラー，; ジェレミードウェインミランダ，; ラファエルドミンゴストーレス，; ダニエルオルシャンスキー，; アリシャーロクニー，; エランゲンデルマン，; ニッククラマー，; ケントッセル，; サミュエルエー．ミラー，; ジェハンギールタジク，; アシュウィンスワミナサン，; ロメシュアガルワル，; プラティークシンガル，; ジョエルデイビッドホルダー，; シュアンジャオ，; シッダルタチョーダリー，; ヘルダートシロウスズキ，; ヒラルホナルバロット，; クリスチャンイヴァンロバートムーア，
Original assignee: Magic Leap Inc
Current assignee: Magic Leap Inc
Priority date: 2018-10-05
Filing date: 2019-10-04
Publication date: 2022-01-24
Anticipated expiration: 2039-10-04
Also published as: WO2020072972A8; JP7526169B2; WO2020072972A1; JP2024103610A; EP3861533A1; EP3861533A4; CN113544748A

Abstract

A cross-reality system that provides an immersive user experience by storing persistent spatial information about the physical world. One or more user devices can access this information to determine their position within the physical world, and applications can access it to specify the position of virtual objects within the physical world. Persistent spatial information enables users to have a shared virtual, as well as physical, experience when interacting with the cross-reality system. Further, persistent spatial information may be used in maps of the physical world, enabling one or more devices to access and localize into previously stored maps, which reduces the need to map a physical space before using the cross-reality system in it.

Description

(Cross-reference to related applications)
This patent application claims priority to and the benefit of U.S. Provisional Patent Application No. 62/742,237, filed October 5, 2018, and entitled "COORDINATE FRAME PROCESSING AUGMENTED REALITY," which is incorporated herein by reference in its entirety. This patent application also claims priority to and the benefit of U.S. Provisional Patent Application No. 62/812,935, filed March 1, 2019, and entitled "MERGING A PLURALITY OF INDIVIDUALLY MAPPED ENVIRONMENTS," which is incorporated herein by reference in its entirety. This patent application also claims priority to and the benefit of U.S. Provisional Patent Application No. 62/815,955, filed March 8, 2019, and entitled "VIEWING DEVICE OR VIEWING DEVICES HAVING ONE OR MORE COORDINATE FRAME TRANSFORMERS," which is incorporated herein by reference in its entirety. This patent application also claims priority to and the benefit of U.S. Provisional Patent Application No. 62/868,786, filed June 28, 2019, and entitled "RANKING AND MERGING A PLURALITY OF ENVIRONMENT MAPS," which is incorporated herein by reference in its entirety. This patent application also claims priority to and the benefit of U.S. Provisional Patent Application No. 62/870,954, filed July 5, 2019, and entitled "RANKING AND MERGING A PLURALITY OF ENVIRONMENT MAPS," which is incorporated herein by reference in its entirety. This patent application also claims priority to and the benefit of U.S. Provisional Patent Application No. 62/884,109, filed August 7, 2019, and entitled "A VIEWING SYSTEM," which is incorporated herein by reference in its entirety.

This application relates generally to a cross-reality system.

Computers may control human user interfaces to create an X reality (XR or cross-reality) environment in which some or all of the XR environment, as perceived by the user, is generated by a computer. These XR environments may be virtual reality (VR), augmented reality (AR), and mixed reality (MR) environments, in which some or all of the XR environment may be generated by computers using, in part, data that describes the environment. This data may describe, for example, virtual objects that may be rendered in a way that the user senses or perceives them as part of the physical world and can interact with them. The user may experience these virtual objects as a result of the data being rendered and presented through a user interface device, such as a head-mounted display device. The data may be displayed for the user to see, may control audio that is played for the user to hear, or may control a tactile (or haptic) interface, enabling the user to experience touch sensations that the user senses or perceives as feeling the virtual object.

XR systems may be useful for many applications, spanning the fields of scientific visualization, medical training, engineering design and prototyping, tele-manipulation and tele-presence, and personal entertainment. AR and MR, in contrast to VR, include one or more virtual objects in relation to real objects of the physical world. The experience of virtual objects interacting with real objects generally enhances the user's enjoyment in using the XR system, and also opens the door to a variety of applications that present realistic and readily understandable information about how the physical world might be altered.

To realistically render virtual content, an XR system may build a representation of the physical world around a user of the system. This representation may be constructed, for example, by processing images acquired with sensors on a wearable device that forms part of the XR system. In such a system, a user might perform an initialization routine by looking around a room or other physical environment in which the user intends to use the XR system, until the system acquires sufficient information to construct a representation of that environment. As the system operates and the user moves around the environment or to other environments, the sensors on the wearable device might acquire additional information to expand or update the representation of the physical world.

Aspects of the present application relate to methods and apparatus for providing X reality (cross-reality or XR) scenes. The techniques described herein may be used together, separately, or in any suitable combination.

Some embodiments relate to an electronic system including one or more sensors configured to capture information about a three-dimensional (3D) environment. The captured information includes a plurality of images. The electronic system includes at least one processor configured to execute computer-executable instructions to generate a map of at least a portion of the 3D environment based on the plurality of images. The computer-executable instructions further include instructions for: identifying a plurality of features in the plurality of images; selecting a plurality of keyframes from among the plurality of images based, at least in part, on the plurality of features of the selected keyframes; generating one or more coordinate frames based, at least in part, on the identified features of the selected keyframes; and storing, in association with the map of the 3D environment, the one or more coordinate frames as one or more persistent coordinate frames.

In some embodiments, the one or more sensors comprise a plurality of pixel circuits arranged in a two-dimensional array such that each image of the plurality of images comprises a plurality of pixels. Each feature corresponds to a plurality of pixels.

In some embodiments, identifying a plurality of features in the plurality of images comprises selecting, as the identified features, a number of groups of pixels that is less than a predetermined maximum, based on a measure of similarity to groups of pixels depicting portions of persistent objects.
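The capped selection described above can be pictured as a top-k filter over scored pixel groups. The following is a minimal sketch under stated assumptions, not the claimed implementation: the function name `select_features`, the `(patch_id, score)` tuple layout, and the existence of a precomputed similarity score are all hypothetical and introduced only for illustration.

```python
import heapq
from typing import List, Tuple

# Hypothetical candidate: (patch_id, similarity to pixel groups that depict
# persistent objects). How the score is computed is not specified here.
Candidate = Tuple[str, float]


def select_features(candidates: List[Candidate], max_count: int) -> List[str]:
    """Return the best-scoring patch ids, strictly fewer than max_count."""
    keep = max(0, max_count - 1)  # "less than a predetermined maximum"
    best = heapq.nlargest(keep, candidates, key=lambda c: c[1])
    return [patch_id for patch_id, _ in best]


patches = [("wall_corner", 0.9), ("moving_hand", 0.2),
           ("door_frame", 0.7), ("table_edge", 0.5)]
selected = select_features(patches, max_count=3)
```

Capping the count bounds the work done per image, and ranking by similarity to persistent objects favors features that are likely to still be present when the space is revisited.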

In some embodiments, storing the one or more coordinate frames comprises storing, for each of the one or more coordinate frames, a descriptor representing at least a subset of the features in the selected keyframe from which the coordinate frame was generated.

In some embodiments, storing the one or more coordinate frames comprises storing, for each of the one or more coordinate frames, at least a subset of the features in the selected keyframe from which the coordinate frame was generated.

In some embodiments, storing the one or more coordinate frames comprises storing, for each of the one or more coordinate frames, a transformation between a coordinate frame of the map of the 3D environment and the persistent coordinate frame, and geographic information indicating a location within the 3D environment of the selected keyframe from which the coordinate frame was generated.

In some embodiments, the geographic information comprises a WiFi fingerprint of the location.
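The items stored with each persistent coordinate frame in the embodiments above (feature descriptors, a transformation relative to the map, and geographic information such as a WiFi fingerprint) suggest a simple record type. The sketch below is one possible layout, assuming a 4x4 homogeneous matrix for the transformation and a BSSID-to-signal-strength dictionary for the fingerprint; the class and field names are hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Matrix4 = List[List[float]]  # assumed: 4x4 homogeneous transform, row-major

IDENTITY: Matrix4 = [[1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]]


@dataclass
class PersistentCoordinateFrame:
    pcf_id: str
    # Transformation between the map's coordinate frame and this PCF.
    map_to_pcf: Matrix4
    # Descriptors for a subset of the features of the source keyframe.
    feature_descriptors: List[List[float]] = field(default_factory=list)
    # Geographic information, here a WiFi fingerprint: BSSID -> RSSI (dBm).
    wifi_fingerprint: Dict[str, float] = field(default_factory=dict)


@dataclass
class EnvironmentMap:
    map_id: str
    pcfs: Dict[str, PersistentCoordinateFrame] = field(default_factory=dict)

    def store(self, pcf: PersistentCoordinateFrame) -> None:
        """Associate a PCF with this map, keyed by its identifier."""
        self.pcfs[pcf.pcf_id] = pcf


room = EnvironmentMap("living_room")
room.store(PersistentCoordinateFrame(
    pcf_id="pcf-1",
    map_to_pcf=IDENTITY,
    feature_descriptors=[[0.1, 0.8, 0.3]],
    wifi_fingerprint={"aa:bb:cc:dd:ee:ff": -48.0},
))
```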

In some embodiments, the computer-executable instructions comprise instructions for computing feature descriptors for individual features with an artificial neural network.

In some embodiments, the artificial neural network is a first artificial neural network. The computer-executable instructions comprise instructions for implementing a second artificial neural network configured to compute a frame descriptor representing a keyframe based, at least in part, on the feature descriptors computed for the identified features in the keyframe.

In some embodiments, the computer-executable instructions further comprise: an application programming interface configured to provide, to an application executing on the portable electronic system, information characterizing a persistent coordinate frame of the one or more persistent coordinate frames; instructions for refining the map of the 3D environment based on a second plurality of images; instructions for adjusting one or more of the persistent coordinate frames based, at least in part, on the second plurality of images; and instructions for providing notification of the adjusted persistent coordinate frames through the application programming interface.

In some embodiments, adjusting the one or more persistent coordinate frames comprises adjusting a translation and rotation of the one or more persistent coordinate frames relative to an origin of the map of the 3D environment.

In some embodiments, the electronic system comprises a wearable device, and the one or more sensors are mounted on the wearable device. The map is a tracking map computed on the wearable device. The origin of the map is determined based on the location at which the device was powered on.

In some embodiments, the electronic system comprises a wearable device, and the one or more sensors are mounted on the wearable device. The computer-executable instructions further comprise instructions for tracking motion of the portable device, and for controlling the timing of execution of the instructions for generating one or more coordinate frames and/or the instructions for storing one or more persistent coordinate frames based on the tracked motion indicating motion of the wearable device exceeding a threshold distance, wherein the threshold distance is between 2 and 20 meters.
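The motion-based timing control above amounts to a distance gate. A minimal sketch follows, assuming straight-line distance from the position at which a frame was last generated; the class name `PcfTrigger`, the 5-meter default, and the choice to fire on the very first position are illustrative assumptions rather than details from the source.

```python
import math
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


class PcfTrigger:
    """Signals when tracked motion exceeds a threshold distance."""

    def __init__(self, threshold_m: float = 5.0) -> None:
        # The text gives a range of 2 to 20 meters for the threshold.
        if not 2.0 <= threshold_m <= 20.0:
            raise ValueError("threshold must be between 2 and 20 meters")
        self.threshold_m = threshold_m
        self._last: Optional[Vec3] = None

    def update(self, position: Vec3) -> bool:
        """Return True when a new coordinate frame should be generated."""
        if self._last is None:
            # Assumption: the first tracked position seeds an initial frame.
            self._last = position
            return True
        if math.dist(position, self._last) > self.threshold_m:
            self._last = position
            return True
        return False


trigger = PcfTrigger(threshold_m=3.0)
decisions = [trigger.update(p) for p in
             [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0), (5.0, 0.0, 0.0)]]
```

Gating on distance keeps the number of stored frames proportional to the area covered rather than to the time spent in it.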

Some embodiments relate to a method of operating an electronic system, comprising a portable device, to render virtual content in a 3D environment. The method comprises, with one or more processors: maintaining on the portable device a coordinate frame local to the portable device based on output of one or more sensors on the portable device; obtaining a stored coordinate frame from stored spatial information about the 3D environment; computing a transformation between the coordinate frame local to the portable device and the obtained stored coordinate frame; receiving a specification of a virtual object having a coordinate frame local to the virtual object, and a location of the virtual object with respect to the selected stored coordinate frame; and rendering the virtual object on a display of the portable device at a location determined based, at least in part, on the computed transformation and the received location of the virtual object.
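The rendering step above chains two transformations: from the virtual object's local frame to the stored coordinate frame, and from the stored frame to the device's local frame. The sketch below illustrates that composition with 4x4 homogeneous matrices. The matrix representation, the helper names, and the example values are assumptions for illustration; the source does not prescribe a representation.

```python
from typing import List, Tuple

Matrix4 = List[List[float]]  # assumed: row-major homogeneous transform
Vec3 = Tuple[float, float, float]


def matmul(a: Matrix4, b: Matrix4) -> Matrix4:
    """Compose two transforms: the result applies b first, then a."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def transform_point(m: Matrix4, p: Vec3) -> Vec3:
    """Apply a homogeneous transform to a 3D point."""
    x, y, z = p
    return tuple(m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3]
                 for r in range(3))


# Computed transformation: stored frame -> device-local frame.
# Here: rotate 90 degrees about z, then translate 2 m along x.
device_from_pcf: Matrix4 = [[0.0, -1.0, 0.0, 2.0],
                            [1.0, 0.0, 0.0, 0.0],
                            [0.0, 0.0, 1.0, 0.0],
                            [0.0, 0.0, 0.0, 1.0]]

# Received location: the object sits 1 m along x of the stored frame.
pcf_from_object: Matrix4 = [[1.0, 0.0, 0.0, 1.0],
                            [0.0, 1.0, 0.0, 0.0],
                            [0.0, 0.0, 1.0, 0.0],
                            [0.0, 0.0, 0.0, 1.0]]

device_from_object = matmul(device_from_pcf, pcf_from_object)

# A vertex given in the object's own local frame, placed for rendering.
vertex_in_device = transform_point(device_from_object, (0.0, 0.0, 0.5))
```

Because the object's location is expressed relative to the stored frame, a second device with a different `device_from_pcf` would place the same object at the same physical spot.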

In some embodiments, obtaining the stored coordinate frame comprises obtaining the coordinate frame through an application programming interface (API).

In some embodiments, the portable device comprises a first portable device comprising a first processor of the one or more processors. The system further comprises a second portable device comprising a second processor of the one or more processors. The processor on each of the first and second devices obtains the same stored coordinate frame, computes a transformation between a coordinate frame local to the respective device and the obtained same stored coordinate frame, receives the specification of the virtual object, and renders the virtual object on a respective display.

In some embodiments, the first and second devices each comprise: a camera configured to output a plurality of camera images; a keyframe generator configured to transform the plurality of camera images into a plurality of keyframes; a persistent pose calculator configured to generate a persistent pose by averaging the plurality of keyframes; a tracking map and persistent pose transformer configured to transform a tracking map to the persistent pose to determine the persistent pose relative to an origin of the tracking map; a persistent pose and persistent coordinate frame (PCF) transformer configured to transform the persistent pose to a PCF; and a map publisher configured to transmit spatial information, including the PCF, to a server.
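The persistent pose calculator above is described only as averaging a plurality of keyframes. As a rough illustration, the sketch below averages keyframe poses reduced to a position plus a single heading angle, using a circular mean for the heading so that angles near the wrap-around average sensibly. Reducing orientation to one yaw angle is a simplifying assumption made here; the source does not say how the averaging is performed, and the names are hypothetical.

```python
import math
from typing import List, Tuple

# Simplified keyframe pose (assumption): x, y, z in meters, yaw in radians.
Keyframe = Tuple[float, float, float, float]


def average_keyframes(keyframes: List[Keyframe]) -> Keyframe:
    """Average positions arithmetically and headings with a circular mean."""
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    n = len(keyframes)
    x = sum(k[0] for k in keyframes) / n
    y = sum(k[1] for k in keyframes) / n
    z = sum(k[2] for k in keyframes) / n
    # Circular mean: average the unit vectors, then recover the angle.
    yaw = math.atan2(sum(math.sin(k[3]) for k in keyframes),
                     sum(math.cos(k[3]) for k in keyframes))
    return (x, y, z, yaw)


persistent_pose = average_keyframes([
    (0.0, 0.0, 0.0, math.radians(350.0)),
    (2.0, 4.0, 0.0, math.radians(10.0)),
])
```

A plain arithmetic mean of 350 and 10 degrees would give 180 degrees, which points the opposite way; the circular mean gives a heading near 0.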

In some embodiments, the method further comprises executing an application to generate the specification of the virtual object and the location of the virtual object with respect to the selected stored coordinate frame.

In some embodiments, maintaining on the portable device a coordinate frame local to the portable device comprises, for each of the first and second portable devices: capturing a plurality of images of the 3D environment from the one or more sensors of the portable device; computing one or more persistent poses based, at least in part, on the plurality of images; and generating spatial information about the 3D environment based, at least in part, on the computed one or more persistent poses. The method further comprises, for each of the first and second portable devices, transmitting the generated spatial information to a remote server, and obtaining the stored coordinate frame comprises receiving the stored coordinate frame from the remote server.

In some embodiments, computing the one or more persistent poses based, at least in part, on the plurality of images comprises: extracting one or more features from each of the plurality of images; generating a descriptor for each of the one or more features; generating a keyframe for each of the plurality of images based, at least in part, on the descriptors; and generating the one or more persistent poses based, at least in part, on the one or more keyframes.

In some embodiments, generating the one or more persistent poses comprises selectively generating a persistent pose based on the portable device traveling a predetermined distance from the locations of other persistent poses.

In some embodiments, the first and second devices each comprise a download system configured to download the stored coordinate frame from a server.

Some embodiments relate to an electronic system for maintaining persistent spatial information about a 3D environment for rendering virtual content on each of a plurality of portable devices. The electronic system includes a networked computing device. The networked computing device comprises: at least one processor; at least one storage device connected to the processor; a map storing routine, executable with the at least one processor, that receives a plurality of maps from portable devices of the plurality of portable devices and stores map information on the at least one storage device, wherein each of the plurality of received maps comprises at least one coordinate frame; and a map transmitter, executable with the at least one processor, that receives location information from a portable device of the plurality of portable devices, selects one or more maps from among the stored maps, and transmits to the portable device information from the selected one or more maps, wherein the transmitted information comprises a coordinate frame of a map of the selected one or more maps.
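The map storing routine and map transmitter above can be sketched as a small in-memory store that ranks stored maps against location information sent by a device. In this sketch the location information is assumed to be a set of observed Wi-Fi BSSIDs, and the ranking is assumed to be a simple overlap count; the class `MapStore` and its methods are hypothetical names for illustration only.

```python
from typing import Dict, FrozenSet, Iterable, List


class MapStore:
    """Toy server-side store: keeps maps and selects them by location."""

    def __init__(self) -> None:
        self._bssids: Dict[str, FrozenSet[str]] = {}
        self._frames: Dict[str, List[str]] = {}

    def store(self, map_id: str, bssids: Iterable[str],
              coordinate_frames: Iterable[str]) -> None:
        """Map storing routine: record a received map and its frames."""
        self._bssids[map_id] = frozenset(bssids)
        self._frames[map_id] = list(coordinate_frames)

    def select(self, observed_bssids: Iterable[str],
               limit: int = 1) -> List[str]:
        """Pick up to `limit` maps sharing the most BSSIDs with the device."""
        observed = frozenset(observed_bssids)
        scored = [(len(observed & known), map_id)
                  for map_id, known in self._bssids.items()]
        ranked = sorted((s for s in scored if s[0] > 0),
                        key=lambda s: (-s[0], s[1]))
        return [map_id for _, map_id in ranked[:limit]]

    def frames_for(self, map_id: str) -> List[str]:
        """Map transmitter payload: the coordinate frames of a chosen map."""
        return list(self._frames[map_id])


server = MapStore()
server.store("office", {"ap1", "ap2"}, ["pcf-a"])
server.store("home", {"ap3"}, ["pcf-b"])
chosen = server.select({"ap2", "ap9"})
payload = [server.frames_for(m) for m in chosen]
```

Filtering by coarse location first keeps the candidate set small before any more expensive matching against map contents.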

In some embodiments, the coordinate frame comprises a computer data structure. The computer data structure comprises a coordinate frame comprising information characterizing a plurality of features of objects in the 3D environment.

In some embodiments, the information characterizing the plurality of features comprises descriptors characterizing regions of the 3D environment.

In some embodiments, each coordinate frame of the at least one coordinate frame comprises persistent points characterized by features detected in sensor data representing the 3D environment.

In some embodiments, each coordinate frame of the at least one coordinate frame comprises a persistent pose.

In some embodiments, each coordinate frame of the at least one coordinate frame comprises a persistent coordinate frame.

The foregoing summary is provided by way of illustration and is not intended to be limiting.

The accompanying drawings are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in the various figures is represented by a like numeral. For purposes of clarity, not every component is labeled in every drawing.

図１は、いくつかの実施形態による、簡略化された拡張現実（ＡＲ）場面の実施例を図示する、スケッチである。FIG. 1 is a sketch illustrating an example of a simplified augmented reality (AR) scene with some embodiments.

図２は、いくつかの実施形態による、ＸＲシステムの例示的ユースケースを示す、例示的簡略化されたＡＲ場面のスケッチである。FIG. 2 is a sketch of an exemplary simplified AR scene showing an exemplary use case of an XR system, according to some embodiments.

図３は、いくつかの実施形態による、物理的世界と相互作用するＡＲコンテンツの体験をユーザに提供するように構成される、ＡＲシステム内の単一ユーザのためのデータフローを図示する、概略図である。FIG. 3 illustrates a data flow for a single user in an AR system configured to provide the user with an experience of AR content that interacts with the physical world, according to some embodiments. It is a figure.

図４は、いくつかの実施形態による、単一ユーザのための仮想コンテンツを表示する、例示的ＡＲディスプレイシステムを図示する、概略図である。FIG. 4 is a schematic diagram illustrating an exemplary AR display system that displays virtual content for a single user, according to some embodiments.

図５Ａは、いくつかの実施形態による、ユーザが物理的世界環境を通して移動するにつれてＡＲコンテンツをレンダリングする、ＡＲディスプレイシステムを装着しているユーザを図示する、概略図である。FIG. 5A is a schematic diagram illustrating a user wearing an AR display system that renders AR content as the user moves through a physical world environment, according to some embodiments.

図５Ｂは、いくつかの実施形態による、視認光学系アセンブリおよび付帯コンポーネントを図示する、概略図である。FIG. 5B is a schematic diagram illustrating a visual optics assembly and ancillary components according to some embodiments.

図６Ａは、いくつかの実施形態による、世界再構築システムを使用するＡＲシステムを図示する、概略図である。FIG. 6A is a schematic diagram illustrating an AR system using a world reconstruction system according to some embodiments.

図６Ｂは、いくつかの実施形態による、パス可能世界のモデルを維持する、ＡＲシステムのコンポーネントを図示する、概略図である。FIG. 6B is a schematic diagram illustrating the components of an AR system that maintain a passable world model, according to some embodiments.

図７は、物理的世界を通した経路をトラバースするデバイスによって形成される、追跡マップの概略図である。FIG. 7 is a schematic of a tracking map formed by a device traversing a path through the physical world.

図８は、いくつかの実施形態による、仮想コンテンツを知覚する、クロスリアリティ（ＸＲ）システムのユーザを図示する、概略図である。FIG. 8 is a schematic diagram illustrating a user of a cross-reality (XR) system that perceives virtual content, according to some embodiments.

図９は、いくつかの実施形態による、座標系間で変換する、図８のＸＲシステムの第１のＸＲデバイスのコンポーネントのブロック図である。FIG. 9 is a block diagram of the components of the first XR device of the XR system of FIG. 8 that transforms between coordinate systems, according to some embodiments.

図１０は、いくつかの実施形態による、ローカルＸＲコンテンツを正しくレンダリングするための目的地座標フレームへの原点座標フレームの例示的変換を図示する、概略図である。FIG. 10 is a schematic diagram illustrating an exemplary conversion of an origin coordinate frame to a destination coordinate frame for correctly rendering local XR content according to some embodiments.

図１１は、いくつかの実施形態による、瞳孔ベースの座標フレームを図示する、上部平面図である。FIG. 11 is an upper plan view illustrating a pupil-based coordinate frame according to some embodiments.

図１２は、いくつかの実施形態による、全ての瞳孔位置を含む、カメラ座標フレームを図示する、上部平面図である。FIG. 12 is an upper plan view illustrating a camera coordinate frame, including all pupil positions, according to some embodiments.

図１３は、いくつかの実施形態による、図９のディスプレイシステムの概略図である。FIG. 13 is a schematic diagram of the display system of FIG. 9 according to some embodiments.

図１４は、いくつかの実施形態による、持続的座標フレーム（ＰＣＦ）の作成およびＸＲコンテンツのＰＣＦへの結付を図示する、ブロック図である。FIG. 14 is a block diagram illustrating the creation of a persistent coordinate frame (PCF) and the binding of XR content to a PCF according to some embodiments.

図１５は、いくつかの実施形態による、ＰＣＦを確立および使用する方法を図示する、フローチャートである。FIG. 15 is a flow chart illustrating how to establish and use a PCF according to some embodiments.

図１６は、いくつかの実施形態による、第２のＸＲデバイスを含む、図８のＸＲシステムのブロック図である。FIG. 16 is a block diagram of the XR system of FIG. 8 comprising a second XR device, according to some embodiments.

図１７は、いくつかの実施形態による、部屋と、部屋内の種々のエリアのために確立される、キーフレームとを図示する、概略図である。FIG. 17 is a schematic diagram illustrating a room and key frames established for various areas within the room, according to some embodiments.

図１８は、いくつかの実施形態による、キーフレームに基づく、持続的姿勢の確立を図示する、概略図である。FIG. 18 is a schematic diagram illustrating the establishment of a sustained posture based on key frames by some embodiments.

図１９は、いくつかの実施形態による、持続的姿勢に基づく、持続座標フレーム（ＰＣＦ）の確立を図示する、概略図である。FIG. 19 is a schematic diagram illustrating the establishment of a sustained coordinate frame (PCF) based on a sustained posture according to some embodiments.

図２０Ａ－２０Ｃは、いくつかの実施形態による、ＰＣＦを作成する実施例を図示する、概略図である。20A-20C are schematic views illustrating examples of creating a PCF according to some embodiments.

図２１は、いくつかの実施形態による、個々の画像および／またはマップのためのグローバル記述子を生成するためのシステムを図示する、ブロック図である。FIG. 21 is a block diagram illustrating a system for generating global descriptors for individual images and / or maps, according to some embodiments.

図２２は、いくつかの実施形態による、画像記述子を算出する方法を図示する、フローチャートである。FIG. 22 is a flowchart illustrating a method of calculating an image descriptor according to some embodiments.

図２３は、いくつかの実施形態による、画像記述子を使用した位置特定の方法を図示する、フローチャートである。FIG. 23 is a flowchart illustrating a method of locating using an image descriptor according to some embodiments.

図２４は、いくつかの実施形態による、ニューラルネットワークを訓練する方法を図示する、フローチャートである。FIG. 24 is a flow chart illustrating a method of training a neural network according to some embodiments.

図２５は、いくつかの実施形態による、ニューラルネットワークを訓練する方法を図示する、ブロック図である。FIG. 25 is a block diagram illustrating a method of training a neural network according to some embodiments.

図２６は、いくつかの実施形態による、複数の環境マップをランク付けおよびマージするように構成される、ＡＲシステムを図示する、概略図である。FIG. 26 is a schematic diagram illustrating an AR system configured to rank and merge multiple environmental maps, according to some embodiments.

図２７は、いくつかの実施形態による、遠隔記憶媒体上に記憶された複数の規準マップを図示する、簡略化されたブロック図である。FIG. 27 is a simplified block diagram illustrating a plurality of reference maps stored on a remote storage medium, according to some embodiments.

図２８は、いくつかの実施形態による、規準マップを選択し、例えば、１つ以上の規準マップ内で新しい追跡マップを位置特定する、および／またはＰＣＦを規準マップから取得する方法を図示する、概略図である。FIG. 28 illustrates how, according to some embodiments, a canonical map is selected, eg, a new tracking map is located within one or more canonical maps, and / or a PCF is obtained from the canonical map. It is a schematic diagram.

図２９は、いくつかの実施形態による、複数のランク付けされた環境マップを選択する方法を図示する、フローチャートである。FIG. 29 is a flowchart illustrating how to select a plurality of ranked environment maps according to some embodiments.

図３０は、いくつかの実施形態による、図２６のＡＲシステムの例示的マップランク付け部分を図示する、概略図である。FIG. 30 is a schematic diagram illustrating an exemplary map ranking portion of the AR system of FIG. 26, according to some embodiments.

図３１Ａは、いくつかの実施形態による、データベース内の追跡マップ（ＴＭ）および環境マップのエリア属性の実施例を図示する、概略図である。FIG. 31A is a schematic diagram illustrating examples of area attributes of a tracking map (TM) and an environment map in a database according to some embodiments.

図３１Ｂは、いくつかの実施形態による、図２９の地理的場所フィルタリングのための追跡マップ（ＴＭ）の地理的場所を決定する実施例を図示する、概略図である。FIG. 31B is a schematic diagram illustrating an example of determining the geographic location of the tracking map (TM) for geographic location filtering of FIG. 29, according to some embodiments.

図３２は、いくつかの実施形態による、図２９の地理的場所フィルタリングの実施例を図示する、概略図である。FIG. 32 is a schematic diagram illustrating an embodiment of the geographic location filtering of FIG. 29 according to some embodiments.

図３３は、いくつかの実施形態による、図２９のＷｉ－ＦｉＢＳＳＩＤフィルタリングの実施例を図示する、概略図である。FIG. 33 is a schematic diagram illustrating an example of Wi-Fi BSSID filtering of FIG. 29 according to some embodiments.

図３４は、いくつかの実施形態による、図２９の位置特定の実施例を図示する、概略図である。FIG. 34 is a schematic diagram illustrating an embodiment of the positioning of FIG. 29 according to some embodiments.

FIGS. 35 and 36 are block diagrams of an XR system configured to rank and merge a plurality of environment maps, according to some embodiments.

FIG. 37 is a block diagram illustrating a method of creating environment maps of the physical world in a canonical form, according to some embodiments.

FIGS. 38A and 38B are schematic diagrams illustrating an environment map created in a canonical form by updating the tracking map of FIG. 7 with a new tracking map, according to some embodiments.

FIGS. 39A-39F are schematic diagrams illustrating an example of merging maps, according to some embodiments.

FIG. 40 is a two-dimensional representation of a three-dimensional first local tracking map (Map 1), which may be generated by the first XR device of FIG. 9, according to some embodiments.

FIG. 41 is a block diagram illustrating uploading Map 1 from the first XR device to the server of FIG. 9, according to some embodiments.

FIG. 42 is a schematic diagram illustrating the XR system of FIG. 16, showing that a second user has initiated a second session using a second XR device of the XR system after the first user has terminated a first session, according to some embodiments.

FIG. 43A is a block diagram illustrating a new session for the second XR device of FIG. 42, according to some embodiments.

FIG. 43B is a block diagram illustrating the creation of a tracking map for the second XR device of FIG. 42, according to some embodiments.

FIG. 43C is a block diagram illustrating downloading a canonical map from the server to the second XR device of FIG. 42, according to some embodiments.

FIG. 44 is a schematic diagram illustrating a localization attempt to localize, to a canonical map, a second tracking map (Map 2), which may be generated by the second XR device of FIG. 42, according to some embodiments.

FIG. 45 is a schematic diagram illustrating a localization attempt to localize, to a canonical map, the second tracking map (Map 2) of FIG. 44, which may be further developed and may carry XR content associated with PCFs of Map 2, according to some embodiments.

FIGS. 46A-46B are schematic diagrams illustrating a successful localization of Map 2 of FIG. 45 to the canonical map, according to some embodiments.

FIG. 47 is a schematic diagram illustrating a canonical map generated by including one or more PCFs from the canonical map of FIG. 46A into Map 2 of FIG. 45, according to some embodiments.

FIG. 48 is a schematic diagram illustrating the canonical map of FIG. 47 with further expansion of Map 2 on the second XR device, according to some embodiments.

FIG. 49 is a block diagram illustrating uploading Map 2 from the second XR device to the server, according to some embodiments.

FIG. 50 is a block diagram illustrating merging Map 2 with the canonical map, according to some embodiments.

FIG. 51 is a block diagram illustrating transmission of a new canonical map from the server to the first and second XR devices, according to some embodiments.

FIG. 52 is a block diagram illustrating a two-dimensional representation of Map 2 and a head coordinate frame of the second XR device that is referenced to Map 2, according to some embodiments.

FIG. 53 is a block diagram illustrating, in two dimensions, adjustment of the head coordinate frame, which can occur in six degrees of freedom, according to some embodiments.

FIG. 54 is a block diagram illustrating a canonical map on the second XR device in which sound is localized relative to PCFs of Map 2, according to some embodiments.

FIGS. 55 and 56 are a perspective view and a block diagram illustrating use of the XR system when the first user has terminated a first session and the first user has initiated a second session using the XR system, according to some embodiments.

FIGS. 57 and 58 are a perspective view and a block diagram illustrating use of the XR system when three users are simultaneously using the XR system in the same session, according to some embodiments.

FIG. 59 is a flow chart illustrating a method of recovering and resetting a head pose, according to some embodiments.

FIG. 60 is a block diagram of a machine in the form of a computer that can find application in the system of the present invention, according to some embodiments.

Described herein are methods and apparatus for providing X reality (XR or cross reality) scenes. To provide realistic XR experiences to multiple users, an XR system must know the users' physical surroundings in order to correctly correlate the locations of virtual objects with real objects. An XR system may build an environment map of a scene, which may be created from image and/or depth information collected with sensors that are part of XR devices worn by users of the XR system.

The inventors have realized and appreciated that it may be beneficial to have an XR system in which each XR device develops a local map of its physical environment by integrating information from one or more images collected during a scan at a point in time. In some embodiments, the coordinate system of that map is tied to the orientation of the device when the scan was initiated. That orientation may change from instant to instant as a user interacts with the XR system, whether the different instants are associated with different users, each with their own wearable device with sensors that scan the environment, or with the same user using the same device at different times. The inventors have realized and appreciated techniques for operating XR systems based on persistent spatial information. These techniques overcome limitations of an XR system in which each user device relies only on spatial information that it collects relative to an orientation that differs between user instances (e.g., snapshots in time) or sessions (e.g., the time between being turned on and off) of the system. The techniques may, for example, provide XR scenes for a more computationally efficient and immersive experience for a single user or multiple users by enabling persistent spatial information to be created, stored, and retrieved by any of multiple users of an XR system.

The persistent spatial information may be represented by a persistent map, which may enable one or more functions that enhance an XR experience. The persistent map may be stored in a remote storage medium (e.g., a cloud). For example, a wearable device worn by a user may, after being turned on, retrieve from persistent storage, such as cloud storage, an appropriate stored map that was previously created and stored. That previously stored map may have been based on data about the environment collected with sensors on the user's wearable device during prior sessions. Retrieving a stored map may enable use of the wearable device without a scan of the physical world with the sensors on the wearable device. Alternatively or additionally, the system/device may similarly retrieve an appropriate stored map upon entering a new region of the physical world.

The stored map may be represented in a canonical form that each XR device may relate to its local frame of reference. In a multi-device XR system, the stored map accessed by one device may have been created and stored by another device, and/or may have been constructed by aggregating data about the physical world collected by sensors on multiple wearable devices that were previously present in at least a portion of the physical world represented by the stored map.

Further, sharing data about the physical world among multiple devices may enable shared user experiences of virtual content. Two XR devices that have access to the same stored map, for example, may both localize with respect to the stored map. Once localized, a user device may render virtual content whose location is specified by reference to the stored map by translating that location into a frame of reference maintained by the user device. The user device may use this local frame of reference to control the display of the user device so as to render the virtual content in the specified location.

To support these and other functions, the XR system may include components that develop, maintain, and use persistent spatial information, including one or more stored maps, based on data about the physical world collected with sensors on user devices. These components may be distributed across the XR system, with some operating, for example, on a head mounted portion of a user device. Other components may operate on a computer associated with the user and coupled to the head mounted portion over a local or personal area network. Yet others may operate at a remote location, such as at one or more servers accessible over a wide area network.

These components may include, for example, a component that can identify, from information about the physical world collected by one or more user devices, information that is of sufficient quality to be stored as or in a persistent map. An example of such a component, described in greater detail below, is a map merge component. Such a component may, for example, receive inputs from a user device and determine the suitability of parts of the inputs for updating a persistent map. A map merge component may, for example, split a local map created by a user device into parts, determine the mergibility of one or more of the parts with a persistent map, and merge the parts that meet qualified mergibility criteria into the persistent map. A map merge component may also, for example, promote a part that is not merged with a persistent map to be a separate persistent map.
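
The split, qualify, merge, and promote flow described for the map merge component can be pictured with a minimal Python sketch. The names `MapPart`, `PersistentMap`, and `merge_local_map`, the `quality` and `overlap` scores, and both thresholds are hypothetical and chosen only for illustration; the mergibility criteria actually used are described later in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class MapPart:
    """One portion of a device's local map (hypothetical structure)."""
    name: str
    quality: float  # assumed 0..1 score for how reliable the part is
    overlap: float  # assumed 0..1 overlap with the persistent map


@dataclass
class PersistentMap:
    name: str
    parts: list = field(default_factory=list)


def merge_local_map(parts, persistent, min_quality=0.5, min_overlap=0.3):
    """Merge qualifying parts into `persistent`; return promoted maps.

    Parts below `min_quality` are dropped. Parts that are good enough but
    do not overlap the persistent map are promoted to separate maps.
    """
    promoted = []
    for part in parts:
        if part.quality < min_quality:
            continue  # not of sufficient quality to persist
        if part.overlap >= min_overlap:
            persistent.parts.append(part)  # meets the merge criteria
        else:
            promoted.append(PersistentMap(name=part.name, parts=[part]))
    return promoted
```

In this sketch a part is merged only when it is both reliable and overlapping, which mirrors the idea that unmerged but usable parts can still become persistent maps of their own.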

As another example, these components may include a component that may aid in determining an appropriate persistent map to be retrieved and used by a user device. An example of such a component, described in greater detail below, is a map rank component. Such a component may, for example, receive inputs from a user device and identify one or more persistent maps that are likely to represent the region of the physical world in which that device is operating. A map rank component may, for example, aid in selecting a persistent map to be used by that local device as it renders virtual content, gathers data about the environment, or performs other actions. A map rank component may, alternatively or additionally, aid in identifying persistent maps to be updated as additional information about the physical world is collected by one or more user devices.
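
As a rough illustration of map ranking, the Python sketch below scores stored maps by how many Wi-Fi BSSIDs they share with the device, one of the filtering signals named in the description of FIG. 33. The function name `rank_maps`, the dictionary layout, and the use of Jaccard overlap as the score are assumptions made for this example, not the method claimed in the specification.

```python
def rank_maps(device_bssids, stored_maps, top_n=3):
    """Return up to `top_n` map ids, best match first.

    `stored_maps` maps a map id to the BSSIDs recorded with that map.
    The score is the Jaccard overlap between the two BSSID sets (an
    assumed metric); maps with no shared BSSID are filtered out.
    """
    device = set(device_bssids)
    scored = []
    for map_id, bssids in stored_maps.items():
        known = set(bssids)
        union = device | known
        score = len(device & known) / len(union) if union else 0.0
        if score > 0:
            scored.append((score, map_id))
    # Highest score first; ties are broken by map id for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [map_id for _, map_id in scored[:top_n]]
```

A coarse filter like this would typically run before any more expensive comparison, so that only a few candidate maps need to be examined in detail.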

Yet other components may determine transformations that convert information captured or described in relation to one reference frame into another reference frame. For example, sensors may be attached to a head mounted display such that the data read from those sensors indicates locations of objects in the physical world with respect to the head pose of the wearer. One or more transformations may be applied to relate that location information to the coordinate frame associated with a persistent environment map. Similarly, data indicating where a virtual object is to be rendered, when expressed in the coordinate frame of a persistent environment map, may be put through one or more transformations to place it in the frame of reference of the display on the user's head. As described in greater detail below, there may be multiple such transformations. These transformations may be partitioned across the components of an XR system such that they may be efficiently updated and/or applied in a distributed system.
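
The chain of reference-frame transformations can be sketched with 4x4 homogeneous matrices. The following Python example is a simplified illustration, assuming rigid transforms limited to a yaw rotation plus a translation; the helper names (`make_transform`, `compose`, `invert`, `transform_point`) and the frame names in the usage lines are hypothetical.

```python
import math


def make_transform(yaw_rad, tx, ty, tz):
    """Rigid transform: rotation about the z axis, then a translation."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a @ b: applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert(t):
    """Invert a rigid transform using R^T and -R^T * translation."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(r_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [r_t[0] + [trans[0]], r_t[1] + [trans[1]], r_t[2] + [trans[2]],
            [0.0, 0.0, 0.0, 1.0]]


def transform_point(t, point):
    """Apply transform `t` to a 3D point given as (x, y, z)."""
    v = (point[0], point[1], point[2], 1.0)
    return tuple(sum(t[i][k] * v[k] for k in range(4)) for i in range(3))


# Head pose expressed in the persistent map frame (assumed values).
map_from_head = make_transform(math.pi / 2, 1.0, 0.0, 0.0)
# A sensed point in the head frame, related to the map frame.
point_in_map = transform_point(map_from_head, (1.0, 0.0, 0.0))
# Virtual content stored in the map frame, brought back to the head frame.
point_in_head = transform_point(invert(map_from_head), point_in_map)
```

Keeping each transform as a separate matrix is what allows different components to own and update different links of the chain, with `compose` joining them when a point must cross several frames.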

In some embodiments, the persistent maps may be constructed from information collected by multiple user devices. The XR devices may capture local spatial information and construct separate tracking maps using information collected by the sensors of each XR device at various locations and times. Each tracking map may include points, each of which may be associated with a feature of a real object, and a real object may include multiple features. In addition to potentially supplying input to create and maintain persistent maps, the tracking maps may be used to track users' motions in a scene, enabling an XR system to estimate the respective users' head poses based on a tracking map.

This co-dependence between the creation of a map and the estimation of head pose poses significant challenges. Substantial processing may be required to create the map and estimate head poses simultaneously. Because latency makes XR experiences less realistic for users, the processing must be accomplished quickly as objects move in the scene (e.g., a cup being moved on a table) and as users move in the scene. On the other hand, an XR device can provide only limited computational resources, because the device should be lightweight for a user to wear comfortably. The lack of computational resources cannot be compensated for with more sensors, because adding sensors would also undesirably add weight. Further, either more sensors or more computational resources lead to heat, which may cause deformation of an XR device.

The inventors have realized and appreciated techniques for operating XR systems to provide XR scenes for a more immersive user experience, such as estimating head pose at a frequency of 1 kHz with low usage of the computational resources of an XR device. Such a device may be configured with, for example, four video graphic array (VGA) cameras operating at 30 Hz, one inertial measurement unit (IMU) operating at 1 kHz, the compute power of a single advanced RISC machine (ARM) core, less than 1 GB of memory, and less than 100 Mbps of network bandwidth. These techniques relate to reducing the processing required to generate and maintain maps and to estimate head pose, as well as to providing and consuming data with low computational overhead.

These techniques may include hybrid tracking, such that an XR system can leverage both (1) patch-based tracking of distinguishable points between successive images of the environment (e.g., frame-to-frame tracking), and (2) matching of points of interest in a current image against a descriptor-based map of known real-world locations of corresponding points of interest (e.g., map-to-frame tracking). In frame-to-frame tracking, the XR system may track particular points of interest (e.g., salient points), such as corners, between captured images of the real-world environment. For example, the display system may identify the locations, in a current image, of visual points of interest that were included in (e.g., located in) a previous image. This identification may be accomplished using, for example, a photometric error minimization process. In map-to-frame tracking, the XR system may access map information indicating the real-world locations of points of interest, and match points of interest included in a current image to the points of interest indicated in the map information. Information regarding the points of interest may be stored as descriptors in the map database. The XR system may calculate its pose based on the matched visual features. U.S. patent application Ser. No. 16/221,065 describes hybrid tracking and is hereby incorporated herein by reference in its entirety.
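
The map-to-frame half of hybrid tracking can be illustrated with a small Python sketch that matches descriptors from the current image against descriptors stored with mapped points. The binary descriptors, the Hamming distance metric, the `max_distance` threshold, and the function names are assumptions for illustration only; the specification does not fix a descriptor format here, and the pose computation that would follow is omitted.

```python
def hamming(a, b):
    """Number of differing bits between two integer-coded descriptors."""
    return bin(a ^ b).count("1")


def match_to_map(frame_descriptors, map_points, max_distance=2):
    """Match current-image descriptors to mapped points.

    `map_points` is a list of (descriptor, world_location) pairs.
    Returns {frame_index: world_location} for each frame descriptor whose
    nearest map descriptor is within `max_distance` bits.
    """
    matches = {}
    for idx, desc in enumerate(frame_descriptors):
        best = min(map_points, key=lambda mp: hamming(desc, mp[0]),
                   default=None)
        if best is not None and hamming(desc, best[0]) <= max_distance:
            matches[idx] = best[1]
    return matches
```

The resulting 2D-to-3D correspondences are the kind of input from which a pose could then be calculated, while frame-to-frame tracking would carry the same points forward between map lookups.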

These techniques may include reducing the amount of data that is processed when constructing maps, such as by constructing sparse maps from a collection of mapped points and keyframes, and/or by dividing the maps into blocks to enable updates block by block. A mapped point may be associated with a point of interest in the environment. A keyframe may include selected information from camera-captured data. U.S. patent application Ser. No. 16/520,582 describes determining and/or evaluating localization maps and is hereby incorporated herein by reference in its entirety.

In some embodiments, persistent spatial information may be represented in a way that may be readily shared among users and among distributed components, including applications. Information about the physical world may be represented, for example, as persistent coordinate frames (PCFs). A PCF may be defined based on one or more points that represent features recognized in the physical world. The features may be selected such that they are likely to be the same from user session to user session of the XR system. PCFs may exist sparsely, providing less than all of the available information about the physical world, such that they may be efficiently processed and transferred. Techniques for processing persistent spatial information may include creating dynamic maps based on one or more coordinate systems in real space across one or more sessions, and generating persistent coordinate frames (PCFs) over the sparse maps, which may be exposed to XR applications via, for example, an application programming interface (API). These capabilities may be supported by techniques for ranking and merging multiple maps created by one or more XR devices. Persistent spatial information may also enable head poses to be quickly recovered and reset on each of one or more XR devices in a computationally efficient way.
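
To make the role of a PCF concrete, the Python sketch below anchors content to a PCF by an offset and resolves its position from whatever origin the current session has estimated for that PCF. The classes `PCF` and `AnchoredContent`, the function `resolve`, and the translation-only pose (rotation is ignored for brevity) are simplifying assumptions, not the API exposed by the system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PCF:
    """A persistent coordinate frame as seen in one session."""
    pcf_id: str
    origin: tuple  # (x, y, z) of the PCF in the session's local frame


@dataclass(frozen=True)
class AnchoredContent:
    """Virtual content stored relative to a PCF, not to a session."""
    name: str
    pcf_id: str
    offset: tuple  # (x, y, z) offset from the PCF origin


def resolve(content, pcfs):
    """Return the content position in the local frame, or None.

    `pcfs` maps PCF ids to the PCFs recognized in the current session.
    """
    pcf = pcfs.get(content.pcf_id)
    if pcf is None:
        return None  # the anchoring PCF has not been recognized yet
    return tuple(o + d for o, d in zip(pcf.origin, content.offset))
```

Because only the PCF id and the offset are stored, the same content resolves correctly in a later session or on another device even though each session's local origin differs.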

Further, the techniques may enable efficient comparison of spatial information. In some embodiments, an image frame may be represented by a numeric descriptor. That descriptor may be computed via a transformation that maps a set of features identified in the image to the descriptor. That transformation may be performed in a trained neural network. In some embodiments, the set of features supplied as input to the neural network may be a filtered set of features, extracted from the image using techniques that, for example, preferentially select features that are likely to be persistent.

Representing image frames as descriptors enables, for example, efficient matching of new image information to stored image information. An XR system may store, in conjunction with persistent maps, descriptors of one or more frames underlying each persistent map. A local image frame acquired by a user device may similarly be converted to such a descriptor. By selecting stored maps with descriptors similar to that of the local image frame, one or more persistent maps likely to represent the same physical space as the user device may be selected with a relatively small amount of processing. In some embodiments, the descriptors may be computed for keyframes in the local map and the persistent map, further reducing processing when comparing maps. Such an efficient comparison may be used, for example, to simplify finding a persistent map to load into a local device, or finding a persistent map to update, based on image information acquired with the local device.
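
Descriptor-based map selection can be sketched as a nearest-neighbor search. In the Python example below, each stored map carries the numeric descriptors of its keyframes, and candidate maps are ordered by the distance from the local frame's descriptor to the closest stored descriptor. The Euclidean metric, the function names, and the three-element descriptors are assumptions for illustration; in the system described here the descriptors would come from the trained neural network mentioned above.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length numeric descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def candidate_maps(local_descriptor, stored, k=2):
    """Return the ids of the `k` stored maps closest to the local frame.

    `stored` maps a map id to the list of keyframe descriptors saved with
    that map. A map's score is the distance to its closest keyframe.
    """
    scored = []
    for map_id, descriptors in stored.items():
        if descriptors:
            nearest = min(distance(local_descriptor, d) for d in descriptors)
            scored.append((nearest, map_id))
    scored.sort()
    return [map_id for _, map_id in scored[:k]]
```

Comparing a handful of short vectors per map is far cheaper than comparing raw images or full feature sets, which is the efficiency the paragraph above points to.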

The techniques described herein may be used together or separately, with many types of devices and for many types of scenes, including with wearable or portable devices that have limited computational resources and that provide an augmented or mixed reality scene. In some embodiments, the techniques may be implemented by one or more services that form a portion of an XR system.

AR System Overview

FIGS. 1 and 2 illustrate scenes with virtual content displayed in conjunction with a portion of the physical world. For purposes of illustration, an AR system is used as an example of an XR system. FIGS. 3-6B illustrate an exemplary AR system, including one or more processors, memory, sensors, and user interfaces, that may operate according to the techniques described herein.

Referring to FIG. 1, an outdoor AR scene 354 is depicted in which a user of an AR technology sees a physical world park-like setting 356 featuring people, trees, buildings in the background, and a concrete platform 358. In addition to these items, the user of the AR technology also perceives that they "see" a robot statue 357 standing upon the physical world concrete platform 358, and a flying cartoon-like avatar character 352 that appears to be a personification of a bumblebee, even though these elements (e.g., the avatar character 352 and the robot statue 357) do not exist in the physical world. Due to the extreme complexity of human visual perception and the nervous system, it is challenging to produce an AR technology that facilitates a comfortable, natural-feeling, rich presentation of virtual image elements among other virtual or physical world image elements.

Such an AR scene may be achieved with a system that builds maps of the physical world based on tracking information, enables users to place AR content in the physical world, determines the locations in the maps of the physical world where the AR content is placed, preserves the AR scenes such that the placed AR content can be reloaded for display in the physical world during, for example, a different AR experience session, and enables multiple users to share an AR experience. The system may build and update a digital representation of the physical world surfaces around the user. This representation may be used to render virtual content so that it appears fully or partially occluded by physical objects between the user and the rendered location of the virtual content, to place virtual objects, in physics-based interactions, for virtual character path planning and navigation, or for other operations in which information about the physical world is used.

FIG. 2 depicts another example of an indoor AR scene 400, showing exemplary use cases of an XR system, according to some embodiments. The exemplary scene 400 is a living room having walls, a bookshelf on one side of a wall, a floor lamp at a corner of the room, a floor, a sofa, and a coffee table on the floor. In addition to these physical items, the user of the AR technology also perceives virtual objects such as images on the wall behind the sofa, birds flying in through the door, a deer peeking out from the bookshelf, and a decoration in the form of a windmill placed on the coffee table.

For the images on the wall, the AR technology requires information not only about the surfaces of the wall but also about objects and surfaces in the room, such as the shape of the lamp, that occlude the images, in order to render the virtual objects correctly. For the flying birds, the AR technology requires information about all the objects and surfaces around the room in order to render the birds with realistic physics, so that they avoid the objects and surfaces or bounce off them on collision. For the deer, the AR technology requires information about surfaces such as the floor or the coffee table to compute where to place the deer. For the windmill, the system may identify that it is an object separate from the table and may determine that it is movable, whereas corners of shelves or corners of the wall may be determined to be stationary. Such a distinction may be used in determining which portions of the scene are used or updated in each of various operations.

The virtual objects may have been placed in a previous AR experience session. When a new AR experience session starts in the living room, the AR technology requires the virtual objects to be displayed accurately at the locations where they were previously placed and to be realistically visible from different viewpoints. For example, the windmill should be displayed as standing on the books rather than drifting above the table at a different location without the books. Such drifting may occur if the location of the user of the new AR experience session is not accurately localized in the living room. As another example, if a user views the windmill from a viewpoint different from the one used when the windmill was placed, the AR technology requires the corresponding side of the windmill to be displayed.

A scene may be presented to the user via a system that includes multiple components, including a user interface that can stimulate one or more user senses, such as sight, hearing, and/or touch. In addition, the system may include one or more sensors that can measure parameters of the physical portions of the scene, including the position and/or motion of the user within the physical portions of the scene. Further, the system may include one or more computing devices with associated computer hardware, such as memory. These components may be integrated into a single device or may be distributed across multiple interconnected devices. In some embodiments, some or all of these components may be integrated into a wearable device.

FIG. 3 depicts an AR system 502 configured to provide an experience of AR content interacting with a physical world 506, according to some embodiments. The AR system 502 may include a display 508. In the illustrated embodiment, the display 508 may be worn by the user as part of a headset such that the user may wear the display over their eyes like a pair of goggles or glasses. At least a portion of the display may be transparent such that the user may observe a see-through reality 510. The see-through reality 510 may correspond to the portions of the physical world 506 that are within the present viewpoint of the AR system 502, which may correspond to the viewpoint of the user when the user is wearing a headset incorporating both a display and sensors of the AR system to acquire information about the physical world.

AR content may also be presented on the display 508, overlaid on the see-through reality 510. To provide accurate interactions between the AR content and the see-through reality 510 on the display 508, the AR system 502 may include sensors 522 configured to capture information about the physical world 506.

The sensors 522 may include one or more depth sensors that output depth maps 512. Each depth map 512 may have multiple pixels, each of which may represent the distance to a surface in the physical world 506 in a particular direction relative to the depth sensor. Raw depth data may come from a depth sensor to create a depth map. Such depth maps may be updated as fast as the depth sensor can form a new image, which may be hundreds or thousands of times per second. However, that data may be noisy and incomplete, and may have holes, shown as black pixels on the illustrated depth map.
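As an illustration of a pixel that encodes a distance along a particular direction, the following minimal sketch back-projects one depth pixel into a 3D point in the sensor frame. It assumes a pinhole camera model with hypothetical intrinsics `fx`, `fy`, `cx`, `cy` and treats a non-positive value as a hole; none of these names or conventions come from the specification.

```python
import math


def pixel_to_point(u, v, distance, fx, fy, cx, cy):
    """Back-project pixel (u, v) to a 3D point in the sensor frame.

    `distance` is interpreted as range along the viewing ray (an assumption);
    a value <= 0 is treated as a hole and yields None.
    """
    if distance <= 0:
        return None
    # Direction of the ray through the pixel under a pinhole model.
    dx, dy, dz = (u - cx) / fx, (v - cy) / fy, 1.0
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Scale the unit ray by the measured distance.
    return (distance * dx / norm, distance * dy / norm, distance * dz / norm)
```

A pixel at the principal point maps to a point straight ahead of the sensor at the measured distance.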

The system may include other sensors, such as image sensors. The image sensors may acquire monocular or stereoscopic information that may be processed to represent the physical world in other ways. For example, the images may be processed in the world reconstruction component 516 to create a mesh representing connected portions of objects in the physical world. Metadata about such objects, including, for example, color and surface texture, may similarly be acquired with the sensors and stored as part of the world reconstruction.

The system may also acquire information about the head pose (or "pose") of the user with respect to the physical world. In some embodiments, a head pose tracking component of the system may be used to compute head poses in real time. The head pose tracking component may represent the head pose of a user in a coordinate frame with six degrees of freedom including, for example, translation along three perpendicular axes (e.g., forward/backward, up/down, left/right) and rotation about the three perpendicular axes (e.g., pitch, yaw, and roll). In some embodiments, the sensors 522 may include inertial measurement units (IMUs) that may be used to compute and/or determine a head pose 514. A head pose 514 for a depth map may indicate the present viewpoint of the sensor capturing the depth map, with six degrees of freedom, for example, but the head pose 514 may be used for other purposes, such as relating image information to a particular portion of the physical world or relating the position of the display worn on the user's head to the physical world.
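The six-degree-of-freedom representation described above can be sketched as a small data structure. This is a minimal illustration only: the class name `HeadPose`, the axis assignments, the units, and the rotation order are all assumptions made for the example and are not taken from the specification.

```python
import math
from dataclasses import dataclass


def _matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


@dataclass(frozen=True)
class HeadPose:
    # Translation along three perpendicular axes (metres, an assumed unit).
    x: float = 0.0      # left/right
    y: float = 0.0      # up/down
    z: float = 0.0      # forward/backward
    # Rotation about the same axes, in radians.
    pitch: float = 0.0  # about the x axis
    yaw: float = 0.0    # about the y axis
    roll: float = 0.0   # about the z axis

    def rotation(self):
        """Rotation matrix R = Ry(yaw) * Rx(pitch) * Rz(roll) (assumed order)."""
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        rx = [[1, 0, 0], [0, cp, -sp], [0, sp, cp]]
        ry = [[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]]
        rz = [[cr, -sr, 0], [sr, cr, 0], [0, 0, 1]]
        return _matmul(ry, _matmul(rx, rz))

    def to_world(self, point):
        """Map a point from the head frame into the reference coordinate frame."""
        r = self.rotation()
        t = (self.x, self.y, self.z)
        return tuple(sum(r[i][k] * point[k] for k in range(3)) + t[i]
                     for i in range(3))
```

With such a pose, a point measured relative to the headset (for example, one recovered from a depth map) can be expressed in the coordinate frame in which the pose is tracked.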

In some embodiments, the head pose information may be derived in ways other than from an IMU, such as from analyzing objects in an image. For example, the head pose tracking component may compute the relative position and orientation of an AR device with respect to physical objects based on visual information captured by cameras and inertial information captured by IMUs. The head pose tracking component may then compute a head pose of the AR device by, for example, comparing the computed relative position and orientation of the AR device with respect to the physical objects with features of the physical objects. In some embodiments, that comparison may be made by identifying features, in images captured with one or more of the sensors 522, that are stable over time, such that changes in the position of these features in images captured over time can be associated with a change in the head pose of the user.

In some embodiments, an AR device may construct a map from the feature points recognized in successive images in a series of image frames captured as a user moves throughout the physical world with the AR device. Although each image frame may be taken from a different pose as the user moves, the system may adjust the orientation of the features of each successive image frame to match the orientation of the initial image frame by matching features of the successive image frames to previously captured image frames. Translations of the successive image frames, chosen so that points representing the same features match the corresponding feature points from previously collected image frames, can be used to align each successive image frame with the orientation of previously processed image frames. The frames in the resulting map may have a common orientation established when the first image frame was added to the map. This map, with sets of feature points in a common frame of reference, may be used to determine the user's pose within the physical world by matching features from current image frames to the map. In some embodiments, this map may be called a tracking map.

In addition to enabling tracking of the user's pose within the environment, this map may enable other components of the system, such as the world reconstruction component 516, to determine the location of physical objects with respect to the user. The world reconstruction component 516 may receive the depth maps 512 and head poses 514, and any other data from the sensors, and integrate that data into a reconstruction 518. The reconstruction 518 may be more complete and less noisy than the sensor data. The world reconstruction component 516 may update the reconstruction 518 using spatial and temporal averaging of the sensor data from multiple viewpoints over time.
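The effect of averaging noisy, incomplete sensor data can be sketched with a per-pixel temporal average over several depth maps. This is a deliberately simplified illustration: it assumes the depth maps have already been registered to a common viewpoint (in the described system, head poses would be needed for that step), represents each map as a list of rows, and uses a hypothetical `hole` marker value of 0.0.

```python
def fuse_depth_maps(depth_maps, hole=0.0):
    """Average several registered depth maps pixel by pixel, skipping holes.

    A fused pixel stays a hole only if every input map has a hole there.
    """
    rows, cols = len(depth_maps[0]), len(depth_maps[0][0])
    fused = [[hole] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Keep only valid measurements for this pixel.
            samples = [m[r][c] for m in depth_maps if m[r][c] != hole]
            if samples:
                fused[r][c] = sum(samples) / len(samples)
    return fused
```

Averaging reduces noise where several maps agree, and a hole in one map is filled when another map observed that pixel, which is one way a fused result can be both less noisy and more complete than any single input.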

The reconstruction 518 may include representations of the physical world in one or more data formats including, for example, voxels, meshes, and planes. The different formats may represent alternative representations of the same portions of the physical world or may represent different portions of the physical world. In the illustrated example, on the left side of the reconstruction 518, portions of the physical world are presented as a global surface; on the right side of the reconstruction 518, portions of the physical world are presented as meshes.

In some embodiments, the map maintained by the head pose component 514 may be sparse relative to other maps of the physical world that might be maintained. Rather than providing information about the locations, and possibly other characteristics, of surfaces, the sparse map may indicate the locations of points of interest and/or structures, such as corners or edges. In some embodiments, the map may include image frames as captured by the sensors 522. These frames may be reduced to features, which may represent the points of interest and/or structures. In conjunction with each frame, information about the pose of the user from which the frame was acquired may also be stored as part of the map. In some embodiments, every image acquired by the sensors may or may not be stored. In some embodiments, the system may process images as they are collected by the sensors and select subsets of the image frames for further computation. The selection may be based on one or more criteria that limit the addition of information yet ensure that the map contains useful information. The system may add a new image frame to the map, for example, based on its overlap with a prior image frame already added to the map, or based on the image frame containing a sufficient number of features determined to be likely to represent stationary objects. In some embodiments, the selected image frames, or groups of features from the selected image frames, may serve as key frames for the map, which are used to provide spatial information.
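One possible reading of the frame-selection criteria above can be sketched as follows. The sketch assumes that features are represented by hashable identifiers, that a frame is useful when it is not too redundant with the map and contains enough features believed to be stationary, and that the thresholds `max_overlap` and `min_stationary` are hypothetical tuning values. The specification does not state how the criteria are combined, so requiring both here is an assumption.

```python
def should_add_keyframe(frame_features, map_features, stationary_features,
                        max_overlap=0.8, min_stationary=3):
    """Decide whether a new image frame should be added to the sparse map.

    frame_features: feature ids detected in the candidate frame.
    map_features: feature ids already present in the map.
    stationary_features: ids judged likely to belong to stationary objects.
    """
    frame_features = set(frame_features)
    if not frame_features:
        return False
    # Fraction of the frame's features that the map already contains.
    overlap = len(frame_features & set(map_features)) / len(frame_features)
    # Number of features in the frame that are likely stationary.
    stationary = len(frame_features & set(stationary_features))
    return overlap <= max_overlap and stationary >= min_stationary
```

Limiting additions in this way keeps the map sparse while still admitting frames that contribute new, stable features.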

The AR system 502 may integrate sensor data over time from multiple viewpoints of the physical world. The poses of the sensors (e.g., position and orientation) may be tracked as the device including the sensors is moved. Because the sensor's frame pose and how it relates to the other poses are known, each of these multiple viewpoints of the physical world may be fused together into a single, combined reconstruction of the physical world, which may serve as an abstraction layer for the map and provide spatial information. The reconstruction may be more complete and less noisy than the original sensor data through the use of spatial and temporal averaging (i.e., averaging data from multiple viewpoints over time) or any other suitable method.

In the embodiment illustrated in FIG. 3, a map represents the portion of the physical world in which a user of a single wearable device is present. In that scenario, the head pose associated with frames in the map may be represented as a local head pose, indicating an orientation relative to the initial orientation of the single device at the start of a session. For example, the head pose may be tracked relative to the initial head pose when the device was turned on or otherwise operated to scan an environment and build a representation of that environment.

In combination with content characterizing that portion of the physical world, the map may include metadata. The metadata may indicate, for example, the time of capture of the sensor information used to form the map. The metadata may alternatively or additionally indicate the location of the sensors at the time of capture of the information used to form the map. Location may be expressed directly, such as with information from a GPS chip, or indirectly, such as with a Wi-Fi signature indicating the strength of signals received from one or more wireless access points while the sensor data was being collected, and/or with the BSSIDs of the wireless access points to which the user device connected while the sensor data was collected.
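The kinds of metadata listed above can be gathered in a simple record. This is a minimal sketch; the class name `MapMetadata`, the field names, and the choice of units (epoch seconds, latitude/longitude in degrees, signal strength in dBm) are assumptions for illustration and do not describe any actual map format.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class MapMetadata:
    # Time at which the sensor information was captured (seconds since epoch).
    capture_time: float
    # Direct location, e.g. from a GPS chip: (latitude, longitude) in degrees.
    gps: Optional[Tuple[float, float]] = None
    # Indirect location: access point BSSID -> received signal strength (dBm).
    wifi_signature: Dict[str, float] = field(default_factory=dict)
    # BSSIDs of access points the device was connected to during capture.
    connected_bssids: List[str] = field(default_factory=list)

    def location_sources(self):
        """List which of the direct or indirect location hints are present."""
        sources = []
        if self.gps is not None:
            sources.append("gps")
        if self.wifi_signature:
            sources.append("wifi_signature")
        if self.connected_bssids:
            sources.append("connected_bssids")
        return sources
```

Keeping the direct and indirect location hints optional reflects that a given map may carry any combination of them.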

The reconstruction 518 may be used for AR functions, such as producing a surface representation of the physical world for occlusion processing or physics-based processing. This surface representation may change as the user moves or as objects in the physical world change. Aspects of the reconstruction 518 may be used, for example, by a component 520 that produces a changing global surface representation in world coordinates, which may be used by other components.

The AR content may be generated based on this information, such as by AR applications 504. An AR application 504 may be, for example, a game program that performs one or more functions based on information about the physical world, such as visual occlusion, physics-based interactions, and environment reasoning. The application may perform these functions by querying data in different formats from the reconstruction 518 produced by the world reconstruction component 516. In some embodiments, the component 520 may be configured to output updates when a representation in a region of interest of the physical world changes. That region of interest may be set, for example, to approximate a portion of the physical world in the vicinity of the user of the system, such as the portion within the user's field of view or a portion projected (predicted/determined) to come within the user's field of view.

The AR applications 504 may use this information to generate and update the AR content. The virtual portion of the AR content may be presented on the display 508 in combination with the see-through reality 510, creating a realistic user experience.

In some embodiments, an AR experience may be provided to a user through an XR device, which may be a wearable display device. The wearable display device may be part of a system that may include remote processing and/or remote data storage and/or, in some embodiments, other wearable display devices worn by other users. FIG. 4 illustrates an example of a system 580 (hereinafter referred to as "system 580") that includes a single wearable device for simplicity of illustration. The system 580 includes a head-mounted display device 562 (hereinafter referred to as "display device 562") and various mechanical and electronic modules and systems that support the functioning of the display device 562. The display device 562 may be coupled to a frame 564, which is wearable by a display system user or viewer 560 (hereinafter referred to as "user 560") and is configured to position the display device 562 in front of the eyes of the user 560. According to various embodiments, the display device 562 may be a sequential display. The display device 562 may be monocular or binocular. In some embodiments, the display device 562 may be an example of the display 508 in FIG. 3.

In some embodiments, a speaker 566 is coupled to the frame 564 and positioned proximate an ear canal of the user 560. In some embodiments, another speaker, not shown, is positioned adjacent the other ear canal of the user 560 to provide stereo/shapeable sound control. The display device 562 is operatively coupled, such as by a wired lead or wireless connectivity 568, to a local data processing module 570, which may be mounted in a variety of configurations, such as fixedly attached to the frame 564, fixedly attached to a helmet or hat worn by the user 560, embedded in headphones, or otherwise removably attached to the user 560 (e.g., in a backpack-style configuration or in a belt-coupling style configuration).

The local data processing module 570 may include a processor as well as digital memory, such as non-volatile memory (e.g., flash memory), both of which may be utilized to assist in the processing, caching, and storage of data. The data include data a) captured from sensors (which may be, e.g., operatively coupled to the frame 564 or otherwise attached to the user 560), such as image capture devices (e.g., cameras), microphones, inertial measurement units, accelerometers, compasses, GPS units, radio devices, and/or gyroscopes; and/or b) acquired and/or processed using a remote processing module 572 and/or a remote data repository 574, possibly for passage to the display device 562 after such processing or retrieval.

In some embodiments, the wearable device may communicate with remote components. The local data processing module 570 may be operatively coupled by communication links 576, 578, such as wired or wireless communication links, to the remote processing module 572 and the remote data repository 574, respectively, such that these remote modules 572, 574 are operatively coupled to each other and available as resources to the local data processing module 570. In some embodiments, the head pose tracking component described above may be at least partially implemented in the local data processing module 570. In some embodiments, the world reconstruction component 516 in FIG. 3 may be at least partially implemented in the local data processing module 570. For example, the local data processing module 570 may be configured to execute computer-executable instructions to generate the map and/or the physical world representations based at least in part on at least a portion of the data.

In some embodiments, processing may be distributed across local and remote processors. For example, local processing may be used to construct a map (e.g., a tracking map) on a user device based on sensor data collected with sensors on that user's device. Such a map may be used by applications on that user's device. Additionally, previously created maps (e.g., canonical maps) may be stored in the remote data repository 574. Where a suitable stored or persistent map is available, it may be used instead of or in addition to the tracking map created locally on the device. In some embodiments, a tracking map may be localized to the stored map, such that a correspondence is established between the tracking map, which might be oriented relative to the position of the wearable device at the time the user turned the system on, and the canonical map, which may be oriented relative to one or more persistent features. In some embodiments, the persistent map may be loaded on the user device to allow the user device to render virtual content without the delay associated with scanning a location to build a tracking map of the user's full environment from sensor data acquired during the scan. In some embodiments, the user device may access a remote persistent map (e.g., stored in the cloud) without the need to download the persistent map onto the user device.
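The correspondence between a tracking map and a canonical map can be pictured as a rigid transform between their coordinate frames. The sketch below is a simplified two-dimensional illustration only: it assumes the transform (a rotation `theta` plus a translation) has already been estimated by some localization step that is outside the scope of the example, and all function names are hypothetical. A real system would work in three dimensions with six degrees of freedom.

```python
import math


def make_transform(theta, tx, ty):
    """Rigid 2D transform taking tracking-map coordinates to canonical ones."""
    return (math.cos(theta), math.sin(theta), tx, ty)


def tracking_to_canonical(transform, point):
    """Rotate, then translate, a tracking-map point into the canonical frame."""
    c, s, tx, ty = transform
    x, y = point
    return (c * x - s * y + tx, s * x + c * y + ty)


def canonical_to_tracking(transform, point):
    """Inverse mapping: undo the translation, then apply the inverse rotation."""
    c, s, tx, ty = transform
    x, y = point[0] - tx, point[1] - ty
    return (c * x + s * y, -s * x + c * y)
```

Under this picture, content stored at a location in the canonical map can be converted into the device's tracking-map frame for rendering, and locations observed by the device can be converted the other way.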

Alternatively or additionally, the tracking map may be merged with previously stored maps to extend those maps or improve their quality. The processing to determine whether a suitable previously created environment map is available, and/or whether to merge a tracking map with one or more stored environment maps, may be done in the local data processing module 570 or the remote processing module 572.

In some embodiments, the local data processing module 570 may include one or more processors (e.g., a graphics processing unit (GPU)) configured to analyze and process data and/or image information. In some embodiments, the local data processing module 570 may include a single processor (e.g., a single-core or multi-core ARM processor), which would limit the compute budget of the local data processing module 570 but enable a more miniature device. In some embodiments, the world reconstruction component 516 may use a compute budget of less than a single Advanced RISC Machine (ARM) core to generate physical world representations in real time over a non-predefined space, such that the remaining compute budget of the single ARM core can be used for other purposes, such as extracting meshes.

In some embodiments, the remote data repository 574 may include a digital data storage facility, which may be available through the Internet or another networking configuration in a "cloud" resource configuration. In some embodiments, all data is stored and all computations are performed in the local data processing module 570, allowing fully autonomous use independent of any remote module. In some embodiments, all data is stored and all or most computations are performed in the remote data repository 574, allowing for a smaller device. A world reconstruction, for example, may be stored in whole or in part in this repository 574.

In embodiments in which data is stored remotely and is accessible over a network, the data may be shared by multiple users of an augmented reality system. For example, user devices may upload their tracking maps to augment a database of environment maps. In some embodiments, the tracking map upload occurs at the end of a user session with a wearable device. In some embodiments, tracking map uploads may occur continuously, semi-continuously, intermittently, at a predefined time, after a predefined period from the previous upload, or when triggered by an event. A tracking map uploaded by any user device may be used to extend or improve a previously stored map, whether that map is based on data from that user device or from any other user device. Likewise, a persistent map downloaded to a user device may be based on data from that user device or from any other user device. In this way, high-quality environment maps may be readily available to users to improve their experiences with the AR system.
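Several of the upload triggers listed above (end of session, a predefined period since the previous upload, or an event) can be combined in a small decision function. This is a minimal sketch under stated assumptions: the function name, its parameters, and the use of seconds for timestamps are hypothetical, and the specification does not prescribe how triggers are evaluated.

```python
def should_upload(now, last_upload, *, session_ended=False,
                  event_triggered=False, period_s=None):
    """Return True when any configured upload trigger fires.

    now, last_upload: timestamps in seconds.
    period_s: minimum interval since the previous upload, or None to disable.
    """
    # Upload at the end of a user session or when an event requests it.
    if session_ended or event_triggered:
        return True
    # Upload once the predefined period since the previous upload has elapsed.
    if period_s is not None and now - last_upload >= period_s:
        return True
    return False
```

Setting a very small `period_s` approximates continuous upload, while leaving it as `None` restricts uploads to session end or explicit events.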

In some embodiments, the local data processing module 570 is operatively coupled to a battery 582. In some embodiments, the battery 582 is a removable power source, such as an over-the-counter battery. In other embodiments, the battery 582 is a lithium-ion battery. In some embodiments, the battery 582 includes both an internal lithium-ion battery chargeable by the user 560 during non-operation times of the system 580 and removable batteries, such that the user 560 may operate the system 580 for longer periods of time without having to be tethered to a power source to charge the lithium-ion battery or having to shut the system 580 off to replace batteries.

FIG. 5A illustrates a user 530 wearing an AR display system that renders AR content as the user 530 moves through a physical world environment 532 (hereinafter "environment 532"). The information captured by the AR system along the user's movement path may be processed into one or more tracking maps. The user 530 positions the AR display system at positions 534, and the AR display system records ambient information of a passable world (e.g., a digital representation of the real objects in the physical world that can be stored and updated as those real objects change) relative to the positions 534. That information may be stored as poses in combination with images, features, directional audio inputs, or other desired data. The positions 534 are aggregated to data inputs 536, for example as part of a tracking map, and processed at least by a passable world module 538, which may be implemented, for example, by processing on the remote processing module 572 of FIG. 4. In some embodiments, the passable world module 538 may include a head pose component 514 and a world reconstruction component 516, such that the processed information may indicate the location of objects in the physical world in combination with other information about physical objects used in rendering virtual content.

The passable world module 538 determines, at least in part, where and how AR content 540 can be placed in the physical world as determined from the data inputs 536. The AR content is "placed" in the physical world by presenting, via the user interface, both a representation of the physical world and the AR content, with the AR content rendered as if it were interacting with objects in the physical world and the objects in the physical world presented as if the AR content were, when appropriate, obscuring the user's view of those objects. In some embodiments, the AR content may be placed by appropriately selecting portions of a fixed element 542 (e.g., a table) from a reconstruction (e.g., the reconstruction 518) to determine the shape and position of the AR content 540. As an example, the fixed element may be a table, and the virtual content may be positioned such that it appears to be on that table. In some embodiments, the AR content may be placed within structures in a field of view 544, which may be a present field of view or an estimated future field of view. In some embodiments, the AR content may be persisted relative to a model 546 of the physical world (e.g., a mesh).

As depicted, the fixed element 542 serves as a proxy (e.g., a digital copy) for any fixed element within the physical world, which may be stored in the passable world module 538 so that the user 530 can perceive content on the fixed element 542 without the system having to map the fixed element 542 each time the user 530 sees it. The fixed element 542 may therefore be a mesh model from a previous modeling session, or one determined from a separate user, but in either case stored by the passable world module 538 for future reference by a plurality of users. Therefore, the passable world module 538 may recognize the environment 532 from a previously mapped environment and display AR content without a device of the user 530 first mapping all or part of the environment 532, saving computation processes and cycles and avoiding latency of any rendered AR content.

The mesh model 546 of the physical world may be created by the AR display system, and appropriate surfaces and metrics for interacting with and displaying the AR content 540 can be stored by the passable world module 538 for future retrieval by the user 530 or other users without the need to completely or partially recreate the model. In some embodiments, the data inputs 536 are inputs such as geolocation, user identification, and current activity that indicate to the passable world module 538 which fixed element 542 of one or more fixed elements is available, which AR content 540 was last placed on the fixed element 542, and whether to display that same content (such AR content being "persistent" content regardless of whether the user is viewing a particular passable world model).

Even in embodiments in which objects are considered to be fixed (e.g., a kitchen table), the passable world module 538 may update those objects in a model of the physical world from time to time to account for the possibility of changes in the physical world. The model of fixed objects may be updated with a very low frequency. Other objects in the physical world may be moving or otherwise not regarded as fixed (e.g., kitchen chairs). To render an AR scene with a realistic feel, the AR system may update the position of these non-fixed objects with a much higher frequency than is used to update fixed objects. To enable accurate tracking of all of the objects in the physical world, the AR system may draw information from multiple sensors, including one or more image sensors.

FIG. 5B is a schematic illustration of a viewing optics assembly 548 and attendant components. In some embodiments, two eye tracking cameras 550, directed toward the user's eyes 549, detect metrics of the user's eyes 549, such as eye shape, eyelid occlusion, pupil direction, and glints on the user's eyes 549.

In some embodiments, one of the sensors may be a depth sensor 551, such as a time-of-flight sensor, which emits signals into the world and detects reflections of those signals from nearby objects to determine the distance to a given object. A depth sensor, for example, may quickly determine whether objects have entered the user's field of view, either as a result of motion of those objects or a change in the user's pose. However, information about the position of objects in the user's field of view may alternatively or additionally be collected with other sensors. Depth information, for example, may be obtained from stereoscopic image sensors or plenoptic sensors.
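The time-of-flight principle mentioned above reduces to a single relation: the emitted signal travels to the object and back, so the distance is half the round-trip time multiplied by the propagation speed. The sketch below assumes an optical signal travelling at the speed of light in vacuum; the function name is hypothetical, and a real sensor would add calibration and noise handling that the text does not describe.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tof_distance_m(round_trip_s: float) -> float:
    """Distance to a reflecting object from the measured round-trip time."""
    if round_trip_s < 0:
        raise ValueError("round-trip time must be non-negative")
    # The signal covers the sensor-to-object distance twice.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_s / 2.0
```

Under this relation, an object one meter away corresponds to a round trip of roughly 6.7 nanoseconds.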

In some embodiments, world cameras 552 record a greater-than-peripheral view to map and/or otherwise create a model of the environment 532 and detect inputs that may affect AR content. In some embodiments, the world camera 552 and/or camera 553 may be grayscale and/or color image sensors, which may output grayscale and/or color image frames at fixed time intervals. The camera 553 may further capture physical world images within the user's field of view at a specific time. Pixels of a frame-based image sensor may be sampled repetitively even if their values are unchanged. The world cameras 552, the camera 553, and the depth sensor 551 have respective fields of view 554, 555, and 556 to collect data from and record a physical world scene, such as the physical world environment 532 depicted in FIG. 34A.

An inertial measurement unit 557 may determine movement and orientation of the viewing optics assembly 548. In some embodiments, each component is operatively coupled to at least one other component. For example, the depth sensor 551 is operatively coupled to the eye tracking cameras 550 to confirm measured accommodation against the actual distance at which the user's eyes 549 are looking.

It should be appreciated that the viewing optics assembly 548 may include only some of the components illustrated in FIG. 34B and may include components instead of or in addition to the components illustrated. In some embodiments, for example, the viewing optics assembly 548 may include two world cameras 552 instead of four. Alternatively or additionally, the cameras 552 and 553 need not capture a visible light image of their full field of view. The viewing optics assembly 548 may include other types of components. In some embodiments, the viewing optics assembly 548 may include one or more dynamic vision sensors (DVS), whose pixels may respond asynchronously to relative changes in light intensity exceeding a threshold.

In some embodiments, the viewing optics assembly 548 may not include the depth sensor 551 based on time-of-flight information. In some embodiments, for example, the viewing optics assembly 548 may include one or more plenoptic cameras, whose pixels may capture light intensity and the angle of the incoming light, from which depth information can be determined. For example, a plenoptic camera may include an image sensor overlaid with a transmissive diffraction mask (TDM). Alternatively or additionally, a plenoptic camera may include an image sensor containing angle-sensitive pixels and/or phase-detection autofocus pixels (PDAF) and/or a micro-lens array (MLA). Such a sensor may serve as a source of depth information instead of or in addition to the depth sensor 551.

It should also be appreciated that the configuration of the components in FIG. 5B is provided as an example. The viewing optics assembly 548 may include components with any suitable configuration, which may be set to provide the user with the largest field of view practical for a particular set of components. For example, if the viewing optics assembly 548 has one world camera 552, the world camera may be placed in a center region of the viewing optics assembly instead of at a side.

Information from the sensors in the viewing optics assembly 548 may be coupled to one or more of the processors in the system. The processors may generate data that may be rendered so as to cause the user to perceive virtual content interacting with objects in the physical world. That rendering may be implemented in any suitable way, including generating image data that depicts both physical and virtual objects. In other embodiments, physical and virtual content may be depicted in one scene by modulating the opacity of a display device that the user looks through at the physical world. The opacity may be controlled so as to create the appearance of the virtual object and also to block the user from seeing objects in the physical world that are occluded by the virtual objects. In some embodiments, the image data may include only virtual content, which may be modified (e.g., by clipping content to account for occlusions) such that, when viewed through the user interface, it is perceived by the user as realistically interacting with the physical world.

The location on the viewing optics assembly 548 at which content is displayed to create the impression of an object at a particular location may depend on the physics of the viewing optics assembly. Additionally, the pose of the user's head with respect to the physical world and the direction in which the user's eyes are looking may impact where in the physical world content displayed at a particular location on the viewing optics assembly will appear. Sensors as described above may collect this information, and/or supply information from which this information may be calculated, such that a processor receiving sensor inputs may compute where objects should be rendered on the viewing optics assembly 548 to create a desired appearance for the user.

Regardless of how content is presented to a user, a model of the physical world may be used so that characteristics of the virtual objects that can be impacted by physical objects, including the shape, position, motion, and visibility of the virtual object, can be correctly computed. In some embodiments, the model may include a reconstruction of the physical world, for example, the reconstruction 518.

That model may be created from data collected from sensors on a wearable device of the user. However, in some embodiments, the model may be created from data collected by multiple users, which may be aggregated in a computing device remote from all of the users (and which may be "in the cloud").

The model may be created, at least in part, by a world reconstruction system, such as the world reconstruction component 516 of FIG. 3, depicted in more detail in FIG. 6A. The world reconstruction component 516 may include a perception module 660 that may generate, update, and store representations for a portion of the physical world. In some embodiments, the perception module 660 may represent the portion of the physical world within a reconstruction range of the sensors as multiple voxels. Each voxel may correspond to a 3D cube of a predetermined volume in the physical world and include surface information indicating whether there is a surface in the volume represented by the voxel. Voxels may be assigned values indicating whether their corresponding volumes have been determined to include surfaces of physical objects, have been determined to be empty, or have not yet been measured with a sensor and so have unknown values. It should be appreciated that values indicating voxels determined to be empty or unknown need not be explicitly stored, as the values of voxels may be stored in computer memory in any suitable way, including storing no information for voxels determined to be empty or unknown.
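One way to picture the voxel representation described above is a sparse map keyed by integer voxel indices, in which voxels that have never been measured simply have no entry. The sketch below is illustrative only: `VoxelState`, `SparseVoxelGrid`, and the 0.1 m voxel size used in the usage note are hypothetical. It leaves only unknown voxels implicit, although, as the text notes, empty voxels could be left out as well.

```python
import math
from enum import Enum


class VoxelState(Enum):
    UNKNOWN = 0   # not yet measured by any sensor
    EMPTY = 1     # measured, no surface inside
    SURFACE = 2   # measured, contains a physical surface


class SparseVoxelGrid:
    """Hypothetical sparse voxel store; unknown voxels take no memory."""

    def __init__(self, voxel_size_m: float):
        self.voxel_size_m = voxel_size_m
        self._cells = {}  # (i, j, k) -> VoxelState

    def index_of(self, point):
        # Map a 3D point in meters to the index of the cube containing it.
        return tuple(math.floor(c / self.voxel_size_m) for c in point)

    def mark(self, point, state: VoxelState) -> None:
        key = self.index_of(point)
        if state is VoxelState.UNKNOWN:
            self._cells.pop(key, None)  # unknown is the implicit default
        else:
            self._cells[key] = state

    def state_at(self, point) -> VoxelState:
        return self._cells.get(self.index_of(point), VoxelState.UNKNOWN)

    def stored_count(self) -> int:
        return len(self._cells)
```

For example, after `grid = SparseVoxelGrid(0.1)` and `grid.mark((0.25, 0.0, -0.05), VoxelState.SURFACE)`, any point inside the same 10 cm cube reports `SURFACE`, while every other point reports `UNKNOWN` without taking up storage.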

In addition to generating information for a persisted world representation, the perception module 660 may identify and output indications of changes in a region around a user of the AR system. Indications of such changes may trigger updates to volumetric data stored as part of the persisted world, or trigger other functions, such as triggering components 604 that generate AR content to update the AR content.

In some embodiments, the perception module 660 may identify changes based on a signed distance function (SDF) model. The perception module 660 may be configured to receive sensor data such as, for example, depth maps 660a and head poses 660b, and then fuse the sensor data into an SDF model 660c. The depth maps 660a may provide SDF information directly, and images may be processed to arrive at SDF information. The SDF information represents distance from the sensors used to capture that information. As those sensors may be part of a wearable unit, the SDF information may represent the physical world from the perspective of the wearable unit and therefore the perspective of the user. The head poses 660b may enable the SDF information to be related to a voxel in the physical world.
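The text does not say how depth maps and head poses are fused into the SDF model. One common approach, shown here purely as an assumption, is a truncated signed distance update with a running weighted average. The head pose moves a voxel center into the sensor frame, the depth measured along that viewing direction is compared with the voxel's own depth, and the clamped difference is averaged into the stored value. All names and the 0.1 m truncation distance are hypothetical.

```python
def world_to_sensor(point_w, rotation_ws, translation_ws):
    """Express a world point in the sensor frame.

    rotation_ws: 3x3 sensor-to-world rotation as a list of rows (assumed input).
    translation_ws: sensor position in world coordinates (assumed input).
    """
    d = [p - t for p, t in zip(point_w, translation_ws)]
    # Apply the inverse rotation, i.e. the transpose of rotation_ws.
    return tuple(sum(rotation_ws[r][c] * d[r] for r in range(3))
                 for c in range(3))


def fuse_sdf(prev_sdf, prev_weight, voxel_depth_m, measured_depth_m,
             truncation_m=0.1):
    """Fold one depth observation into a voxel's truncated signed distance."""
    # Positive in front of the observed surface, negative behind it.
    sdf = measured_depth_m - voxel_depth_m
    sdf = max(-truncation_m, min(truncation_m, sdf))
    new_weight = prev_weight + 1.0
    fused = (prev_sdf * prev_weight + sdf) / new_weight
    return fused, new_weight
```

In this sketch the voxel's depth would be the z component returned by `world_to_sensor`, and `measured_depth_m` would be read from the depth map at the pixel the voxel projects to. The projection step is omitted because the text gives no camera model.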

In some embodiments, the perception module 660 may generate, update, and store representations for the portion of the physical world that is within a perception range. The perception range may be determined based, at least in part, on a sensor's reconstruction range, which may be determined based, at least in part, on the limits of a sensor's observation range. As a specific example, an active depth sensor that operates using active IR pulses may operate reliably over a range of distances, creating the observation range of the sensor, which may be from a few centimeters or tens of centimeters to a few meters.

The world reconstruction component 516 may include additional modules that may interact with the perception module 660. In some embodiments, a persisted world module 662 may receive representations for the physical world based on data acquired by the perception module 660. The persisted world module 662 may also include various formats of representations of the physical world. For example, volumetric metadata 662b such as voxels may be stored as well as meshes 662c and planes 662d. In some embodiments, other information, such as depth maps, could be saved.

In some embodiments, representations of the physical world, such as those illustrated in FIG. 6A, may provide relatively dense information about the physical world in comparison to sparse maps, such as a tracking map based on feature points as described above.

In some embodiments, the perception module 660 may include modules that generate representations for the physical world in various formats including, for example, meshes 660d, planes, and semantics 660e. The representations for the physical world may be stored across local and remote storage mediums. The representations for the physical world may be described in different coordinate frames depending on, for example, the location of the storage medium. For example, a representation for the physical world stored in the device may be described in a coordinate frame local to the device. The representation for the physical world may have a counterpart stored in a cloud. The counterpart in the cloud may be described in a coordinate frame shared by all devices in an XR system.

In some embodiments, these modules may generate representations based on data within the perception range of one or more sensors at the time the representation is generated, as well as data captured at prior times and information in the persisted world module 662. In some embodiments, these components may operate on depth information captured with a depth sensor. However, the AR system may include vision sensors and may generate such representations by analyzing monocular or binocular vision information.

In some embodiments, these modules may operate on regions of the physical world. Those modules may be triggered to update a subregion of the physical world when the perception module 660 detects a change in the physical world in that subregion. Such a change, for example, may be detected by detecting a new surface in the SDF model 660c or by other criteria, such as changing the value of a sufficient number of voxels representing the subregion.
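The "sufficient number of voxels" criterion can be sketched as a per-subregion count of voxels whose value differs between two snapshots. This is an illustrative assumption rather than the described implementation: the function name, the cubic subregions of `region_size` voxels per side, and the `min_changed` threshold are all hypothetical.

```python
from collections import Counter


def regions_to_update(old_voxels, new_voxels, region_size=8, min_changed=4):
    """Return subregions in which enough voxel values changed.

    old_voxels / new_voxels: dicts mapping (i, j, k) voxel indices to values;
    a missing key is treated as an unknown voxel.
    """
    changed = Counter()
    for index in old_voxels.keys() | new_voxels.keys():
        if old_voxels.get(index) != new_voxels.get(index):
            # Integer-divide the voxel index to find its enclosing subregion.
            region = tuple(i // region_size for i in index)
            changed[region] += 1
    return {region for region, count in changed.items()
            if count >= min_changed}
```

Only the returned subregions would then be re-meshed or otherwise regenerated, which keeps updates local to where the physical world actually changed.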

The world reconstruction component 516 may include components 664 that may receive representations of the physical world from the perception module 660. Information about the physical world may be pulled by these components according to, for example, a use request from an application. In some embodiments, information may be pushed to the use components, such as via an indication of a change in a pre-identified region or a change of the physical world representation within the perception range. The components 664 may include, for example, game programs and other components that perform processing for visual occlusion, physics-based interactions, and environment reasoning.

Responding to queries from the components 664, the perception module 660 may send representations for the physical world in one or more formats. For example, when the component 664 indicates that the use is for visual occlusion or physics-based interactions, the perception module 660 may send a representation of surfaces. When the component 664 indicates that the use is for environmental reasoning, the perception module 660 may send meshes, planes, and semantics of the physical world.
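The mapping from declared use to returned formats described above can be written as a small lookup table. The sketch below only restates that mapping. The string keys, the `formats_for` function, and the error handling are hypothetical and do not correspond to any real API.

```python
# Declared use of the querying component -> representation formats to send.
FORMATS_BY_USE = {
    "visual_occlusion": ("surfaces",),
    "physics_interaction": ("surfaces",),
    "environmental_reasoning": ("meshes", "planes", "semantics"),
}


def formats_for(use: str):
    """Return the representation formats for a component's declared use."""
    try:
        return FORMATS_BY_USE[use]
    except KeyError:
        raise ValueError(f"unknown use: {use!r}") from None
```

Keeping the mapping in data rather than in branching code makes it straightforward to add further uses or formats without touching the query path.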

In some embodiments, the perception module 660 may include components that format information to provide to the component 664. An example of such a component may be a raycasting component 660f. A use component (e.g., component 664), for example, may query for information about the physical world from a particular point of view. The raycasting component 660f may select, from one or more representations of the physical world, data within a field of view from that point of view.

As should be appreciated from the foregoing description, the perception module 660, or another component of an AR system, may process data to create 3D representations of portions of the physical world. Data to be processed may be reduced by culling parts of a 3D reconstruction volume based at least in part on a camera frustum and/or depth image; extracting and persisting plane data; capturing, persisting, and updating 3D reconstruction data in blocks that allow local update while maintaining neighbor consistency; providing occlusion data to applications generating such scenes, where the occlusion data is derived from a combination of one or more depth data sources; and/or performing a multi-stage mesh simplification. The reconstruction may contain data of different levels of sophistication including, for example, raw data such as live depth data, fused volumetric data such as voxels, and computed data such as meshes.

In some embodiments, components of a passable world model may be distributed, with some portions executing locally on an XR device and some portions executing remotely, such as on a network-connected server or otherwise in the cloud. The allocation of the processing and storage of information between the local XR device and the cloud may impact functionality and user experience of an XR system. For example, reducing processing on a local device by allocating processing to the cloud may enable longer battery life and reduce heat generated on the local device. But allocating too much processing to the cloud may create undesirable latency that causes an unacceptable user experience.

FIG. 6B depicts a distributed component architecture 600 configured for spatial computing, according to some embodiments. The distributed component architecture 600 may include a passable world component 602 (e.g., PW 538 in FIG. 5A), a Lumin OS 604, APIs 606, an SDK 608, and an application 610. The Lumin OS 604 may include a Linux®-based kernel with custom drivers compatible with an XR device. The APIs 606 may include application programming interfaces that grant XR applications (e.g., the application 610) access to the spatial computing features of an XR device. The SDK 608 may include a software development kit that allows the creation of XR applications.

One or more components in the architecture 600 may create and maintain a model of a passable world. In this example, sensor data is collected on a local device. Processing of that sensor data may be performed in part locally on the XR device and in part in the cloud. The PW 538 may include environment maps created based, at least in part, on data captured by AR devices worn by multiple users. During sessions of an AR experience, individual AR devices (such as the wearable devices described above in connection with FIG. 4) may create tracking maps, which are one type of map.

In some embodiments, the device may include components that construct both sparse maps and dense maps. A tracking map may serve as a sparse map and may include head poses of the AR device scanning an environment, as well as information about objects detected within that environment at each head pose. Those head poses may be maintained locally for each device. For example, the head pose on each device may be relative to an initial head pose when the device was turned on for its session. As a result, each tracking map may be local to the device creating it. The dense map may include surface information, which may be represented by a mesh or depth information. Alternatively or additionally, a dense map may include higher-level information derived from surface or depth information, such as the location and/or characteristics of planes and/or other objects.

Creation of the dense maps may be independent of the creation of sparse maps, in some embodiments. The creation of dense maps and sparse maps, for example, may be performed in separate processing pipelines within an AR system. Separating processing, for example, may enable generation or processing of different types of maps to be performed at different rates. Sparse maps, for example, may be refreshed at a faster rate than dense maps. In some embodiments, however, the processing of dense and sparse maps may be related, even if performed in different pipelines. Changes in the physical world revealed in a sparse map, for example, may trigger updates of a dense map, or vice versa. Further, even if independently created, the maps might be used together. For example, a coordinate system derived from a sparse map may be used to define the position and/or orientation of objects in a dense map.

The sparse map and/or dense map may be persisted for re-use by the same device and/or sharing with other devices. Such persistence may be achieved by storing information in the cloud. The AR device may send the tracking map to a cloud to, for example, merge with environment maps selected from persisted maps previously stored in the cloud. In some embodiments, the selected persisted maps may be sent from the cloud to the AR device for merging. In some embodiments, the persisted maps may be oriented with respect to one or more persistent coordinate frames. Such maps may serve as canonical maps, as they can be used by any of multiple devices. In some embodiments, a model of a passable world may comprise or be created from one or more canonical maps. Devices, even though they perform some operations based on a coordinate frame local to the device, may nonetheless use the canonical map by determining a transformation between their coordinate frame local to the device and the canonical map.
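The transformation between a device-local coordinate frame and a canonical map can be illustrated with a minimal sketch. It assumes, purely for illustration, that both frames are gravity-aligned so that the transform reduces to a rotation about the vertical axis plus a translation; the `RigidTransform` class and its fields are hypothetical names, not part of the described system.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RigidTransform:
    """Hypothetical yaw-only rigid transform: rotate about z, then translate."""
    yaw: float  # rotation about the vertical (z) axis, in radians
    tx: float
    ty: float
    tz: float

    def apply(self, point):
        """Map a point (x, y, z) from the source frame into the target frame."""
        x, y, z = point
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        return (c * x - s * y + self.tx, s * x + c * y + self.ty, z + self.tz)

    def inverse(self):
        """Return the transform that maps target-frame points back to the source."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        # If q = R p + t, then p = R^T q - R^T t.
        itx = -(c * self.tx + s * self.ty)
        ity = -(-s * self.tx + c * self.ty)
        return RigidTransform(-self.yaw, itx, ity, -self.tz)


# Assumed example values: the canonical map is rotated 90 degrees and offset
# from the device-local frame.
canonical_from_local = RigidTransform(math.pi / 2, 1.0, 2.0, 0.0)
local_from_canonical = canonical_from_local.inverse()
```

With such a pair of transforms, a device could keep operating in its local frame while expressing shared object locations in canonical-map coordinates, and convert back when rendering.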

A canonical map may originate as a tracking map (TM) (e.g., TM 1102 in FIG. 31A), which may be promoted to a canonical map. The canonical map may be persisted such that devices that access the canonical map may, once they determine a transformation between their local coordinate system and a coordinate system of the canonical map, use the information in the canonical map to determine locations of objects represented in the canonical map in the physical world around the device. In some embodiments, a TM may be a head pose sparse map created by an XR device. In some embodiments, the canonical map may be created when an XR device sends one or more TMs to a cloud server for merging with additional TMs captured by the XR device at a different time or by other XR devices.

The canonical maps, or other maps, may provide information about the portions of the physical world represented by the data processed to create the respective maps. FIG. 7 depicts an exemplary tracking map 700, according to some embodiments. The tracking map 700 may provide a plan view 706 of physical objects in a corresponding physical world, represented by points 702. In some embodiments, a map point 702 may represent a feature of a physical object, and a physical object may include multiple features. For example, each corner of a table may be a feature that is represented by a point on the map. The features may be derived from processing images, such as may be acquired with the sensors of a wearable device in an augmented reality system. The features, for example, may be derived by processing an image frame output by a sensor to identify features based on large gradients in the image or other suitable criteria. Further processing may limit the number of features in each frame. For example, processing may select features that likely represent persistent objects. One or more heuristics may be applied for this selection.

The tracking map 700 may include data on the points 702 collected by a device. For each image frame with data points included in a tracking map, a pose may be stored. The pose may represent the orientation from which the image frame was captured, such that the feature points within each image frame may be spatially correlated. The pose may be determined by positioning information, such as may be derived from the sensors, such as an IMU sensor, on the wearable device. Alternatively or additionally, the pose may be determined by matching image frames to other image frames that depict overlapping portions of the physical world. By finding such positional correlation, which may be accomplished by matching subsets of feature points in two frames, the relative pose between the two frames may be computed. A relative pose may be adequate for a tracking map, as the map may be relative to a coordinate system local to a device, established based on the initial pose of the device when construction of the tracking map was initiated.
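One way to compute a relative pose from matched feature points can be sketched as follows. The sketch assumes, for simplicity, matched 2D points in a plan view and uses a closed-form least-squares rigid alignment; the source does not specify an estimation method, and the function name `estimate_relative_pose_2d` is hypothetical.

```python
import math


def estimate_relative_pose_2d(points_a, points_b):
    """Estimate (theta, tx, ty) such that b ~= R(theta) @ a + t.

    points_a and points_b are equal-length lists of matched (x, y) points
    observed in two frames. This is a least-squares rigid fit in 2D.
    """
    n = len(points_a)
    if n != len(points_b) or n < 2:
        raise ValueError("need at least two matched point pairs")

    # Centroids of both point sets.
    ax = sum(p[0] for p in points_a) / n
    ay = sum(p[1] for p in points_a) / n
    bx = sum(p[0] for p in points_b) / n
    by = sum(p[1] for p in points_b) / n

    # Accumulate dot and cross products of the centered point pairs.
    dot = 0.0
    cross = 0.0
    for (pax, pay), (pbx, pby) in zip(points_a, points_b):
        cax, cay = pax - ax, pay - ay
        cbx, cby = pbx - bx, pby - by
        dot += cax * cbx + cay * cby
        cross += cax * cby - cay * cbx

    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation carries the rotated centroid of A onto the centroid of B.
    tx = bx - (c * ax - s * ay)
    ty = by - (s * ax + c * ay)
    return theta, tx, ty
```

A full system would typically work with 3D points and reject mismatched features before fitting; those steps are omitted here.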

Not all of the feature points and image frames collected by a device may be retained as part of the tracking map, as much of the information collected with the sensors is likely to be redundant. Rather, only certain frames may be added to the map. Those frames may be selected based on one or more criteria, such as the degree of overlap with image frames already in the map, the number of new features they contain, or a quality metric for the features in the frame. Image frames not added to the tracking map may be discarded or may be used to revise the location of features. As a further alternative, all or most of the image frames, represented as a set of features, may be retained, but a subset of those frames may be designated as key frames, which are used for further processing.
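The frame-selection criteria above can be sketched as a simple filter. The thresholds, the representation of a frame as a set of feature identifiers, and the function name `select_keyframes` are all assumptions made for illustration; quality metrics are omitted.

```python
def select_keyframes(frames, min_new_features=3, max_overlap=0.8):
    """Return IDs of frames worth adding to the map.

    frames: iterable of (frame_id, set_of_feature_ids) in capture order.
    A frame is kept when it contributes enough new features and does not
    overlap too heavily with features already in the map.
    """
    map_features = set()
    keyframes = []
    for frame_id, features in frames:
        if not features:
            continue  # nothing to contribute
        new_features = features - map_features
        overlap = 1.0 - len(new_features) / len(features)
        if len(new_features) >= min_new_features and overlap <= max_overlap:
            keyframes.append(frame_id)
            map_features |= features
    return keyframes


# Hypothetical capture sequence: the second frame is nearly redundant.
frames = [
    ("f0", {1, 2, 3, 4}),
    ("f1", {1, 2, 3, 4, 5}),
    ("f2", {4, 5, 6, 7, 8}),
]
selected = select_keyframes(frames)
```

In this sketch the rejected frame is simply skipped; as the text notes, it could instead be used to refine feature locations.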

The key frames may be processed to produce keyrigs 704. The key frames may be processed to produce three-dimensional sets of feature points and saved as keyrigs 704. Such processing may entail, for example, comparing image frames derived simultaneously from two cameras to stereoscopically determine the 3D position of feature points. Metadata, such as poses, may be associated with these key frames and/or keyrigs.
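Stereoscopic determination of a feature point's 3D position can be sketched under the common assumption of a rectified pinhole camera pair, where depth follows from disparity as Z = f * B / d. The camera model, parameter names, and the function `triangulate_rectified` are illustrative assumptions rather than details from the source.

```python
def triangulate_rectified(u_left, v_left, u_right, focal_px, baseline_m, cx, cy):
    """Recover (x, y, z) in the left camera frame from one matched feature.

    u_left, v_left: pixel coordinates of the feature in the left image.
    u_right: horizontal pixel coordinate of the same feature in the right image.
    focal_px: focal length in pixels; baseline_m: camera separation in meters.
    cx, cy: principal point in pixels.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal_px * baseline_m / disparity  # depth from disparity
    x = (u_left - cx) * z / focal_px       # back-project through the pinhole model
    y = (v_left - cy) * z / focal_px
    return (x, y, z)
```

Repeating this for every matched feature in a key frame would yield the kind of three-dimensional point set that could be stored as a keyrig.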

The environment maps may have any of multiple formats depending on, for example, the storage locations of an environment map, including, for example, local storage of AR devices and remote storage. For example, a map in remote storage may have higher resolution than a map in local storage on a wearable device, where memory is limited. To send a higher resolution map from remote storage to local storage, the map may be down-sampled or otherwise converted to an appropriate format, such as by reducing the number of poses per area of the physical world stored in the map and/or the number of feature points stored for each pose. In some embodiments, a slice or portion of a high resolution map from remote storage may be sent to local storage, where the slice or portion is not down-sampled.
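A minimal sketch of such down-sampling follows. It assumes a map stored as a list of pose records, uses a fixed stride as a stand-in for "fewer poses per area," and caps the feature points kept per pose; the record layout and the function name `downsample_map` are hypothetical.

```python
def downsample_map(pose_records, pose_stride=2, max_points_per_pose=100):
    """Return a reduced copy of a map for transfer to memory-limited storage.

    pose_records: list of dicts with keys "pose" and "points".
    pose_stride: keep every pose_stride-th pose record.
    max_points_per_pose: keep at most this many feature points per kept pose.
    """
    if pose_stride < 1 or max_points_per_pose < 0:
        raise ValueError("pose_stride must be >= 1 and max_points_per_pose >= 0")
    reduced = []
    for record in pose_records[::pose_stride]:
        reduced.append({
            "pose": record["pose"],
            "points": list(record["points"][:max_points_per_pose]),
        })
    return reduced
```

A real system would more likely select poses by spatial density and points by quality rather than by list order; the stride and truncation here only illustrate the two reduction knobs the text names.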

A database of environment maps may be updated as new tracking maps are created. To determine which of a potentially very large number of environment maps in the database is to be updated, updating may include efficiently selecting one or more environment maps stored in the database that are relevant to the new tracking map. The selected one or more environment maps may be ranked by relevance, and one or more of the highest-ranking maps may be selected for processing to merge the higher-ranked selected environment maps with the new tracking map to create one or more updated environment maps. When a new tracking map represents a portion of the physical world for which there is no preexisting environment map to update, that tracking map may be stored in the database as a new environment map.
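The select, rank, and merge-or-store flow described above can be sketched as follows. The sketch assumes each map is summarized by a set of feature identifiers, uses Jaccard similarity as a stand-in relevance score, and treats merging as a set union; none of these choices come from the source, and the function names are hypothetical.

```python
def rank_environment_maps(tracking_features, database, top_k=1, min_score=0.1):
    """Return up to top_k map IDs from database, most relevant first."""
    scored = []
    for map_id, features in database.items():
        union = tracking_features | features
        score = len(tracking_features & features) / len(union) if union else 0.0
        if score >= min_score:
            scored.append((score, map_id))
    # Highest score first; ties broken by map ID for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [map_id for _, map_id in scored[:top_k]]


def update_database(tracking_id, tracking_features, database):
    """Merge the tracking map into the best match, or store it as a new map."""
    selected = rank_environment_maps(tracking_features, database)
    if not selected:
        # No relevant environment map exists: keep the tracking map as a new one.
        database[tracking_id] = set(tracking_features)
        return tracking_id
    best = selected[0]
    database[best] = database[best] | tracking_features  # placeholder merge
    return best
```

Calling `update_database` repeatedly as devices upload tracking maps would grow existing environment maps where coverage overlaps and add new ones where it does not.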

View Independent Display

Described herein are methods and apparatus for providing virtual content using an XR system, independent of the locations of the eyes viewing the virtual content. Conventionally, virtual content is re-rendered upon any motion of the display system. For example, if a user wearing a display system views a virtual representation of a three-dimensional (3D) object on the display and walks around the area where the 3D object appears, the 3D object should be re-rendered for each viewpoint such that the user has the perception of walking around an object that occupies real space. However, the re-rendering consumes significant computational resources of the system and causes artifacts due to latency.

The inventors have recognized and appreciated that head pose (e.g., the location and orientation of a user wearing an XR system) may be used to render virtual content independent of eye rotations within the head of the user. In some embodiments, dynamic maps of a scene may be generated based on multiple coordinate frames in real space across one or more sessions, such that virtual content interacting with the dynamic maps may be rendered robustly, independent of eye rotations within the head of the user and/or independent of sensor deformations caused by, for example, heat generated during high-speed, computation-intensive operation. In some embodiments, the configuration of multiple coordinate frames may enable a first XR device worn by a first user and a second XR device worn by a second user to recognize a common location in a scene. In some embodiments, the configuration of multiple coordinate frames may enable users wearing XR devices to view virtual content in the same location of a scene.

In some embodiments, a tracking map may be built in a world coordinate frame, which may have a world origin. The world origin may be the first pose of an XR device when the XR device is powered on. The world origin may be aligned to gravity such that a developer of an XR application can obtain gravity alignment without extra work. Different tracking maps may be built in different world coordinate frames because the tracking maps may be captured by the same XR device in different sessions and/or by different XR devices worn by different users. In some embodiments, a session of an XR device may start when the device is powered on and last until it is powered off. In some embodiments, an XR device may have a head coordinate frame, which may have a head origin. The head origin may be the current pose of the XR device when an image is taken. The difference between the head poses of the world coordinate frame and of the head coordinate frame may be used to estimate a tracking route.
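The relationship between the world origin and later head poses can be sketched in a simplified planar form. The sketch assumes poses are given as (x, y, yaw) in some arbitrary sensor frame and re-expresses each one relative to the first pose of the session; the 2D simplification and the function names are assumptions for illustration only.

```python
import math


def relative_pose(origin, pose):
    """Express pose (x, y, yaw) in the frame whose origin is the pose `origin`."""
    ox, oy, oyaw = origin
    px, py, pyaw = pose
    dx, dy = px - ox, py - oy
    c, s = math.cos(oyaw), math.sin(oyaw)
    # Rotate the offset by -oyaw so it is measured along the origin's axes.
    rel_x = c * dx + s * dy
    rel_y = -s * dx + c * dy
    # Wrap the heading difference into (-pi, pi].
    rel_yaw = math.atan2(math.sin(pyaw - oyaw), math.cos(pyaw - oyaw))
    return (rel_x, rel_y, rel_yaw)


def tracking_route(raw_poses):
    """Return each head pose relative to the first pose (the world origin)."""
    if not raw_poses:
        return []
    world_origin = raw_poses[0]
    return [relative_pose(world_origin, pose) for pose in raw_poses]
```

Because the world origin is simply the first pose of the session, two sessions started in different places would produce routes in different world coordinate frames, which is why a further transformation is needed before their maps can be combined.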

In some embodiments, an XR device may have a camera coordinate frame, which may have a camera origin. The camera origin may be the current pose of one or more sensors of the XR device. The inventors have recognized and appreciated that the configuration of a camera coordinate frame enables robust display of virtual content independent of eye rotation within the head of the user. This configuration also enables robust display of virtual content independent of sensor deformation due to, for example, heat generated during operation.

In some embodiments, an XR device may have a head unit with a head-mountable frame that a user can secure to their head, and the head unit may include two waveguides, one in front of each eye of the user. The waveguides may be transparent, so that ambient light from real-world objects can transmit through the waveguides and the user can see the real-world objects. Each waveguide may transmit projected light from a projector to a respective eye of the user. The projected light may form an image on the retina of the eye. The retina of the eye thus receives both the ambient light and the projected light. The user may simultaneously see real-world objects and one or more virtual objects that are created by the projected light. In some embodiments, XR devices may have sensors that detect real-world objects around a user. These sensors may be, for example, cameras that capture images that may be processed to identify the locations of real-world objects.

In some embodiments, an XR system may assign a coordinate frame to virtual content, as opposed to attaching the virtual content to a world coordinate frame. Such a configuration enables virtual content to be described without regard to where it is rendered for a user, while still allowing it to be attached to a more persistent frame position, such as a persistent coordinate frame (PCF) described in relation to, for example, FIGS. 14-20C, and rendered in a specified location. When the locations of objects change, the XR device may detect the changes in the environment map and determine movement of the head unit worn by the user relative to real-world objects.

FIG. 8 illustrates a user experiencing virtual content, as rendered by an XR system 10, in a physical environment, according to some embodiments. The XR system may include a first XR device 12.1 that is worn by a first user 14.1, a network 18, and a server 20. The user 14.1 is in a physical environment with a real object in the form of a table 16.

In the illustrated example, the first XR device 12.1 includes a head unit 22, a belt pack 24, and a cable connection 26. The first user 14.1 secures the head unit 22 to their head and the belt pack 24, remote from the head unit 22, on their waist. The cable connection 26 connects the head unit 22 to the belt pack 24. The head unit 22 includes technologies that are used to display a virtual object or objects to the first user 14.1 while the first user 14.1 is still permitted to see real objects such as the table 16. The belt pack 24 primarily includes the processing and communications capabilities of the first XR device 12.1. In some embodiments, the processing and communication capabilities may reside entirely or partially in the head unit 22, such that the belt pack 24 may be removed or may be located in another device such as a backpack.

In the illustrated example, the belt pack 24 is connected via a wireless connection to the network 18. The server 20 is connected to the network 18 and holds data representative of local content. The belt pack 24 downloads the data representing local content from the server 20 via the network 18. The belt pack 24 provides the data via the cable connection 26 to the head unit 22. The head unit 22 may include a display that has a light source, for example, a laser light source or a light emitting diode (LED) light source, and a waveguide that guides the light.

In some embodiments, the first user 14.1 may mount the head unit 22 to their head and the belt pack 24 to their waist. The belt pack 24 may download image data representing virtual content over the network 18 from the server 20. The first user 14.1 may see the table 16 through a display of the head unit 22. A projector forming part of the head unit 22 may receive the image data from the belt pack 24 and generate light based on the image data. The light may travel through one or more of the waveguides forming part of the display of the head unit 22. The light may then leave the waveguide and propagate onto a retina of an eye of the first user 14.1. The projector may generate the light in a pattern that is replicated on the retina of the eye of the first user 14.1. The light that falls on the retina of the eye of the first user 14.1 may have a selected depth of field, so that the first user 14.1 perceives an image at a preselected depth behind the waveguide. In addition, both eyes of the first user 14.1 may receive slightly different images, so that the brain of the first user 14.1 perceives a three-dimensional image or images at selected distances from the head unit 22. In the illustrated example, the first user 14.1 perceives virtual content 28 above the table 16. The proportions of the virtual content 28 and its location and distance from the first user 14.1 are determined by the data representing the virtual content 28 and by the various coordinate frames that are used to display the virtual content 28 to the first user 14.1.

In the illustrated example, the virtual content 28 is not visible from the perspective of the drawing and is visible to the first user 14.1 through the use of the first XR device 12.1. The virtual content 28 may initially reside as data structures within vision data and algorithms in the belt pack 24. The data structures may then manifest themselves as light when the projectors of the head unit 22 generate light based on the data structures. It should be appreciated that although the virtual content 28 has no existence in three-dimensional space in front of the first user 14.1, the virtual content 28 is still represented in FIG. 1 in three-dimensional space for purposes of illustrating what a wearer of the head unit 22 perceives. The visualization of computer data in three-dimensional space may be used in this description to illustrate how the data structures that facilitate the renderings perceived by one or more users relate to one another within the data structures in the belt pack 24.

FIG. 9 illustrates components of the first XR device 12.1, according to some embodiments. The first XR device 12.1 may include the head unit 22 and various components forming part of the vision data and algorithms, including, for example, a rendering engine 30, various coordinate systems 32, various origin and destination coordinate frames 34, and various origin/destination coordinate frame transformers 36. The various coordinate systems may be based on intrinsic properties of the XR device, or may be determined by reference to other information, such as a persistent pose or a persistent coordinate system, as described herein.

The head unit 22 may include a head-mountable frame 40, a display system 42, a real object detection camera 44, a movement tracking camera 46, and an inertial measurement unit 48.

The head-mountable frame 40 may have a shape that is securable to the head of the first user 14.1 in FIG. 8. The display system 42, the real object detection camera 44, the movement tracking camera 46, and the inertial measurement unit 48 may be mounted to the head-mountable frame 40 and therefore move together with the head-mountable frame 40.

The coordinate systems 32 may include a local data system 52, a world frame system 54, a head frame system 56, and a camera frame system 58.

The local data system 52 may include a data channel 62, a local frame determining routine 64, and a local frame storing instruction 66. The data channel 62 may be an internal software routine, a hardware component such as an external cable or a radio frequency receiver, or a hybrid component such as a port that is opened up. The data channel 62 may be configured to receive image data 68 representing virtual content.

The local frame determining routine 64 may be connected to the data channel 62. The local frame determining routine 64 may be configured to determine a local coordinate frame 70. In some embodiments, the local frame determining routine may determine the local coordinate frame based on real-world objects or real-world locations. In some embodiments, the local coordinate frame may be based on a top edge relative to a bottom edge of a browser window, the head or feet of a character, a node on an outer surface of a prism or bounding box that encloses the virtual content, or any other suitable location to place a coordinate frame that defines a facing direction of the virtual content and a location (e.g., a node, such as a placement node or anchor node) with which to place the virtual content.

The local frame storing instruction 66 may be connected to the local frame determining routine 64. One skilled in the art will understand that software modules and routines are "connected" to one another through subroutines, calls, and the like. The local frame storing instruction 66 may store the local coordinate frame 70 as a local coordinate frame 72 within the origin and destination coordinate frames 34. In some embodiments, the origin and destination coordinate frames 34 may be one or more coordinate frames that may be manipulated or transformed in order for virtual content to persist between sessions. In some embodiments, a session may be the period of time between a boot-up and a shut-down of an XR device. Two sessions may be two start-up and shut-down periods for a single XR device, or may be a start-up and shut-down for each of two different XR devices.

In some embodiments, the origin and destination coordinate frames 34 may be the coordinate frames involved in one or more transformations required in order for a first user's XR device and a second user's XR device to recognize a common location. In some embodiments, the destination coordinate frame may be the output of a series of computations and transformations applied to the target coordinate frame in order for a first user and a second user to view virtual content in the same location.

The rendering engine 30 may be connected to the data channel 62. The rendering engine 30 may receive the image data 68 from the data channel 62 such that the rendering engine 30 may render virtual content based, at least in part, on the image data 68.

The display system 42 may be connected to the rendering engine 30. The display system 42 may include components that transform the image data 68 into visible light. The visible light may form two patterns, one for each eye. The visible light may enter the eyes of the first user 14.1 in FIG. 8 and may be detected on the retinas of the eyes of the first user 14.1.

The real object detection camera 44 may include one or more cameras that may capture images from different sides of the head-mountable frame 40. The movement tracking camera 46 may include one or more cameras that capture images on the sides of the head-mountable frame 40. One set of one or more cameras may be used instead of the two sets of one or more cameras representing the real object detection camera 44 and the movement tracking camera 46. In some embodiments, the cameras 44, 46 may capture images. As described above, these cameras may collect data that is used to construct a tracking map.

The inertial measurement unit 48 may include a number of devices that are used to detect movement of the head unit 22. The inertial measurement unit 48 may include a gravity sensor, one or more accelerometers, and one or more gyroscopes. The sensors of the inertial measurement unit 48, in combination, track movement of the head unit 22 in at least three orthogonal directions and about at least three orthogonal axes.

In the illustrated example, the world frame system 54 includes a world surface determination routine 78, a world frame determination routine 80, and a world frame storing instruction 82. The world surface determination routine 78 is connected to the real object detection camera 44. The world surface determination routine 78 receives images and/or key frames based on the images captured by the real object detection camera 44 and processes the images to identify surfaces in the images. A depth sensor (not shown) may determine distances to the surfaces. The surfaces are thus represented by data in three dimensions, including their sizes, shapes, and distances from the real object detection camera.

In some embodiments, the world coordinate frame 84 may be based on the origin at the initialization of a head pose session. In some embodiments, the world coordinate frame may be located where the device was booted up, or may be somewhere new if head pose was lost during the boot session. In some embodiments, the world coordinate frame may be the origin at the start of a head pose session.

In the illustrated example, the world frame determination routine 80 is connected to the world surface determination routine 78 and determines a world coordinate frame 84 based on the locations of the surfaces as determined by the world surface determination routine 78. The world frame storing instruction 82 is connected to the world frame determination routine 80 and receives the world coordinate frame 84 from the world frame determination routine 80. The world frame storing instruction 82 stores the world coordinate frame 84 as a world coordinate frame 86 within the origin and destination coordinate frames 34.

The head frame system 56 may include a head frame determination routine 90 and a head frame storing instruction 92. The head frame determination routine 90 may be connected to the movement tracking camera 46 and the inertial measurement unit 48. The head frame determination routine 90 may use data from the movement tracking camera 46 and the inertial measurement unit 48 to calculate a head coordinate frame 94. For example, the inertial measurement unit 48 may have a gravity sensor that determines the direction of gravitational force relative to the head unit 22. The movement tracking camera 46 may continually capture images that are used by the head frame determination routine 90 to refine the head coordinate frame 94. The head unit 22 moves when the first user 14.1 in FIG. 8 moves their head. The movement tracking camera 46 and the inertial measurement unit 48 may continuously provide data to the head frame determination routine 90 so that the head frame determination routine 90 can update the head coordinate frame 94.

The head frame storing instruction 92 may be connected to the head frame determination routine 90 and may receive the head coordinate frame 94 from the head frame determination routine 90. The head frame storing instruction 92 may store the head coordinate frame 94 as a head coordinate frame 96 among the origin and destination coordinate frames 34. The head frame storing instruction 92 may repeatedly store the updated head coordinate frame 94 as the head coordinate frame 96 whenever the head frame determination routine 90 recalculates the head coordinate frame 94. In some embodiments, the head coordinate frame may be the location of the wearable XR device 12.1 relative to the local coordinate frame 72.

The camera frame system 58 may include camera intrinsics 98. The camera intrinsics 98 may include dimensions of the head unit 22 that are features of its design and manufacture. The camera intrinsics 98 may be used to calculate a camera coordinate frame 100 that is stored within the origin and destination coordinate frames 34.

In some embodiments, the camera coordinate frame 100 may include all pupil positions of the left eye of the first user 14.1 in FIG. 8. When the left eye moves from left to right or up and down, the pupil positions of the left eye remain within the camera coordinate frame 100. In addition, the pupil positions of the right eye are located within a camera coordinate frame 100 for the right eye. In some embodiments, the camera coordinate frame 100 may include the location of the camera relative to the local coordinate frame when an image is taken.

The origin-to-destination coordinate frame transformer 36 may include a local-to-world coordinate transformer 104, a world-to-head coordinate transformer 106, and a head-to-camera coordinate transformer 108. The local-to-world coordinate transformer 104 may receive the local coordinate frame 72 and transform the local coordinate frame 72 to the world coordinate frame 86. The result of transforming the local coordinate frame 72 to the world coordinate frame 86 may be represented as a local coordinate frame transformed to world coordinate frame 110 within the world coordinate frame 86.

The world-to-head coordinate transformer 106 may transform from the world coordinate frame 86 to the head coordinate frame 96. The world-to-head coordinate transformer 106 may transform the local coordinate frame transformed to world coordinate frame 110 to the head coordinate frame 96. The result may be represented as a local coordinate frame transformed to head coordinate frame 112 within the head coordinate frame 96.

The head-to-camera coordinate transformer 108 may transform from the head coordinate frame 96 to the camera coordinate frame 100. The head-to-camera coordinate transformer 108 may transform the local coordinate frame transformed to head coordinate frame 112 to a local coordinate frame transformed to camera coordinate frame 114 within the camera coordinate frame 100. The local coordinate frame transformed to camera coordinate frame 114 may be passed into the rendering engine 30. The rendering engine 30 may render the image data 68 representing the local content 28 based on the local coordinate frame transformed to camera coordinate frame 114.
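The chain of transformers 104, 106, and 108 can be sketched as a composition of homogeneous transforms. The following is a minimal illustration only: the 4x4 matrix representation, the helper names (`translation`, `compose`, `apply`), and the numeric offsets are assumptions made for the example and are not taken from the description.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # 4x4 homogeneous transform, row-major


def translation(tx: float, ty: float, tz: float) -> Matrix:
    """Build a 4x4 transform that only translates (rotation omitted for brevity)."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Return the matrix product a * b, i.e. apply b first and then a."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
        for i in range(4)
    ]


def apply(m: Matrix, point: Sequence[float]) -> List[float]:
    """Transform a 3D point by a 4x4 homogeneous matrix."""
    x, y, z = point
    vec = [x, y, z, 1.0]
    out = [sum(m[i][k] * vec[k] for k in range(4)) for i in range(4)]
    return out[:3]


# Hypothetical values standing in for the three transformers.
world_from_local = translation(2.0, 0.0, 0.0)    # role of transformer 104
head_from_world = translation(0.0, -1.5, 0.0)    # role of transformer 106
camera_from_head = translation(-0.03, 0.0, 0.0)  # role of transformer 108

# Chain the transforms: local -> world -> head -> camera.
camera_from_local = compose(
    camera_from_head, compose(head_from_world, world_from_local)
)

# A content point at the origin of its local frame, in camera coordinates.
point_in_camera = apply(camera_from_local, (0.0, 0.0, 0.0))
```

In this sketch the composed matrix plays the role of the local coordinate frame transformed to camera coordinate frame 114 that is handed to the rendering engine; a real system would also carry rotation in each transform.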

FIG. 10 is a spatial representation of the various origin and destination coordinate frames 34. The local coordinate frame 72, the world coordinate frame 86, the head coordinate frame 96, and the camera coordinate frame 100 are represented in the figure. In some embodiments, the local coordinate frame associated with the XR content 28 may have a position and rotation (e.g., may provide a node and facing direction) relative to a local and/or world coordinate frame and/or a PCF when the virtual content is placed in the real world so that the virtual content may be viewed by the user. Each camera may have its own camera coordinate frame 100 encompassing all pupil positions of one eye. Reference numerals 104A and 106A represent the transformations performed by the local-to-world coordinate transformer 104, the world-to-head coordinate transformer 106, and the head-to-camera coordinate transformer 108 of FIG. 9, respectively.

FIG. 11 depicts a camera render protocol for transforming from a head coordinate frame to a camera coordinate frame, according to some embodiments. In the illustrated example, the pupil of a single eye moves from position A to position B. A virtual object that is meant to appear stationary will project onto a depth plane at one of the two positions A or B, depending on the position of the pupil (assuming that the camera is configured to use a pupil-based coordinate frame). As a result, using a pupil coordinate frame transformed to a head coordinate frame will cause jitter in a stationary virtual object as the eye moves from position A to position B. This situation is referred to as view-dependent display or projection.

As depicted in FIG. 12, a camera coordinate frame (e.g., CR) is positioned so that it encompasses all pupil positions, and the object projection will now be consistent regardless of pupil positions A and B. The head coordinate frame transforms to the CR frame, which is referred to as view-independent display or projection. An image reprojection may be applied to the virtual content to account for a change in eye position; however, because the rendering remains in the same position, jitter is minimized.

FIG. 13 illustrates the display system 42 in more detail. The display system 42 includes a stereoscopic analyzer 144 that is connected to the rendering engine 30 and forms part of the vision data and algorithms.

The display system 42 further includes left and right projectors 166A and 166B and left and right waveguides 170A and 170B. The left and right projectors 166A and 166B are connected to a power supply. Each projector 166A and 166B has a respective input through which image data is provided to that projector. Each projector 166A or 166B, when powered, generates light in a two-dimensional pattern and emits the light. The left and right waveguides 170A and 170B are positioned to receive light from the left and right projectors 166A and 166B, respectively. The left and right waveguides 170A and 170B are transparent waveguides.

In use, a user mounts the head-mountable frame 40 to their head. Components of the head-mountable frame 40 may include, for example, a strap (not shown) that wraps around the back of the user's head. The left and right waveguides 170A and 170B are then located in front of the user's left and right eyes 220A and 220B.

The rendering engine 30 passes the image data that it receives into the stereoscopic analyzer 144. The image data is three-dimensional image data of the local content 28 in FIG. 8. The image data is projected onto a plurality of virtual planes. The stereoscopic analyzer 144 analyzes the image data and, based on the image data, determines left and right image data sets for projection onto each depth plane. The left and right image data sets are data sets representing two-dimensional images that are projected in three dimensions to give the user a perception of depth.

The stereoscopic analyzer 144 passes the left and right image data sets to the left and right projectors 166A and 166B. The left and right projectors 166A and 166B then create left and right light patterns. The components of the display system 42 are shown in plan view, although it should be understood that the left and right patterns are two-dimensional patterns when shown in front elevation view. Each light pattern includes a plurality of pixels. For purposes of illustration, light rays 224A and 226A from two of the pixels are shown leaving the left projector 166A and entering the left waveguide 170A. The light rays 224A and 226A reflect from the sides of the left waveguide 170A. The light rays 224A and 226A are shown propagating through internal reflection from left to right within the left waveguide 170A, although it should be understood that the light rays 224A and 226A also propagate in a direction into the paper using refractory and reflective systems.

The light rays 224A and 226A exit the left waveguide 170A through a pupil 228A and then enter the left eye 220A through a pupil 230A of the left eye 220A. The light rays 224A and 226A then fall on a retina 232A of the left eye 220A. In this manner, the left light pattern falls on the retina 232A of the left eye 220A. The user is given the perception that the pixels formed on the retina 232A are pixels 234A and 236A, which the user perceives to be at some distance on the side of the left waveguide 170A opposite the left eye 220A. Depth perception is created by manipulating the focal length of the light.

In a similar manner, the stereoscopic analyzer 144 passes the right image data set to the right projector 166B. The right projector 166B transmits the right light pattern, which is represented by pixels in the form of light rays 224B and 226B. The light rays 224B and 226B reflect within the right waveguide 170B and exit through a pupil 228B. The light rays 224B and 226B then enter through a pupil 230B of the right eye 220B and fall on a retina 232B of the right eye 220B. The pixels of the light rays 224B and 226B are perceived as pixels 134B and 236B behind the right waveguide 170B.

The patterns created on the retinas 232A and 232B are individually perceived as left and right images. The left and right images differ slightly from one another due to the functioning of the stereoscopic analyzer 144. The left and right images are perceived in the mind of the user as a three-dimensional rendering.
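Why the left and right images differ can be illustrated with a textbook pinhole projection. This is only a sketch of horizontal disparity under assumed values: the function names, the 64 mm interpupillary distance, and the unit focal length are hypothetical, and the description does not state that the stereoscopic analyzer 144 works this way.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


def project_for_eye(point: Vec3, eye_offset_x: float, focal_length: float) -> Vec2:
    """Pinhole-project a 3D point (z pointing away from the viewer) for one eye."""
    x, y, z = point
    if z <= 0.0:
        raise ValueError("point must be in front of the viewer")
    return (focal_length * (x - eye_offset_x) / z, focal_length * y / z)


def stereo_pair(
    point: Vec3, ipd: float = 0.064, focal_length: float = 1.0
) -> Tuple[Vec2, Vec2]:
    """Return (left, right) image coordinates for eyes separated by ipd meters."""
    left = project_for_eye(point, -ipd / 2.0, focal_length)
    right = project_for_eye(point, +ipd / 2.0, focal_length)
    return left, right


# A point 2 m straight ahead lands at slightly different horizontal positions.
left_uv, right_uv = stereo_pair((0.0, 0.0, 2.0))
disparity_near = left_uv[0] - right_uv[0]

# A point 4 m away produces a smaller difference between the two images.
far_left, far_right = stereo_pair((0.0, 0.0, 4.0))
disparity_far = far_left[0] - far_right[0]
```

Under this model the difference between the two images shrinks as the point moves away, which is one cue the brain uses to fuse the pair into a single three-dimensional percept.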

As mentioned, the left and right waveguides 170A and 170B are transparent. Light from a real-life object, such as the table 16, on the side of the left and right waveguides 170A and 170B opposite the eyes 220A and 220B can pass through the left and right waveguides 170A and 170B and fall on the retinas 232A and 232B.

Persistent Coordinate Frame (PCF)

Described herein are methods and apparatus for providing spatial persistence across user instances within a shared space. Without spatial persistence, virtual content placed in the physical world by a user in one session may not exist, or may be misplaced, in the user's view in a different session. Likewise, without spatial persistence, virtual content placed in the physical world by one user may not exist, or may be out of place, in a second user's view, even if the second user intends to share an experience of the same physical space with the first user.

The inventors have recognized and appreciated that spatial persistence may be provided through persistent coordinate frames (PCFs). A PCF may be defined based on one or more points representing features (e.g., corners, edges) recognized in the physical world. The features may be selected such that they are likely to be the same from one user instance to another user instance of an XR system.

Further, drift during tracking, which causes the computed tracking path (e.g., the camera trajectory) to deviate from the actual tracking path, can cause the location of virtual content to appear out of place when it is rendered with respect to a local map that is based solely on a tracking map. A tracking map for a space may be refined to correct the drift as an XR device collects more information about the scene over time. However, if virtual content is placed on a real object before a map refinement and saved with respect to the world coordinate frame of the device derived from the tracking map, the virtual content may appear displaced, as if the real object had moved during the map refinement. PCFs may be updated according to map refinement because the PCFs are defined based on features and are updated as those features move during map refinement.

A PCF may comprise six degrees of freedom, with translations and rotations relative to a map coordinate system. A PCF may be stored in a local and/or remote storage medium. The translations and rotations of a PCF may be computed relative to a map coordinate system depending on, for example, the storage location. For example, a PCF used locally by a device may have translations and rotations relative to the world coordinate frame of the device. A PCF in the cloud may have translations and rotations relative to the canonical coordinate frame of a canonical map.
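The benefit of anchoring content to a PCF rather than directly to the world frame can be sketched with a small pose type. The `Pose` class, the `rot_z` helper, and all numeric values below are assumptions chosen for illustration; the description does not prescribe a particular representation of the six degrees of freedom.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Mat3 = List[List[float]]


@dataclass
class Pose:
    """A 6-DoF pose: 3x3 rotation plus translation, relative to a parent frame."""

    rotation: Mat3
    translation: Vec3

    def transform(self, p: Vec3) -> Vec3:
        """Map a point from this pose's frame into the parent frame."""
        r, t = self.rotation, self.translation
        return (
            r[0][0] * p[0] + r[0][1] * p[1] + r[0][2] * p[2] + t[0],
            r[1][0] * p[0] + r[1][1] * p[1] + r[1][2] * p[2] + t[1],
            r[2][0] * p[0] + r[2][1] * p[1] + r[2][2] * p[2] + t[2],
        )


def rot_z(angle_rad: float) -> Mat3:
    """Rotation about the z axis (one of the three rotational degrees of freedom)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


# Content is stored as an offset from the PCF, not from the world frame.
content_offset_in_pcf: Vec3 = (0.5, 0.0, 0.0)

# Hypothetical PCF pose in the device's world frame before map refinement.
pcf_before = Pose(rot_z(0.0), (1.0, 2.0, 0.0))
# After refinement the underlying features moved, so the PCF pose is updated.
pcf_after = Pose(rot_z(math.pi / 2.0), (1.1, 2.0, 0.0))

# The same stored offset yields a corrected world position after the update.
content_before = pcf_before.transform(content_offset_in_pcf)
content_after = pcf_after.transform(content_offset_in_pcf)
```

Because only the PCF pose changes, the stored offset never needs to be rewritten, so the content stays attached to the features that define the PCF.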

PCFs may provide a sparse representation of the physical world, providing less than all of the available information about the physical world, such that they may be efficiently processed and transferred. Techniques for processing persistent spatial information may include creating dynamic maps based on one or more coordinate systems in real space across one or more sessions, and generating persistent coordinate frames (PCFs) over the sparse maps, which may be exposed to XR applications via, for example, an application programming interface (API).

FIG. 14 is a block diagram illustrating the creation of a persistent coordinate frame (PCF) and the attachment of XR content to the PCF, according to some embodiments. Each block may represent digital information stored in a computer memory. In the case of applications 1180, the data may represent computer-executable instructions. In the case of virtual content 1170, the digital information may define a virtual object, as specified by the application 1180, for example. In the case of the other boxes, the digital information may characterize some aspect of the physical world.

In the illustrated embodiment, one or more PCFs are created from images captured with sensors on a wearable device. In the embodiment of FIG. 14, the sensors are visual image cameras. These cameras may be the same cameras used for forming a tracking map. Accordingly, some of the processing suggested by FIG. 14 may be performed as part of updating a tracking map. However, FIG. 14 illustrates that information that provides persistence is generated in addition to the tracking map.

To derive a 3D PCF, two images 1110 from two cameras mounted to a wearable device in a configuration that enables stereoscopic image analysis are processed together. FIG. 14 illustrates an Image 1 and an Image 2, each derived from one of the cameras. A single image from each camera is illustrated for simplicity. However, each camera may output a stream of image frames, and the processing illustrated in FIG. 14 may be performed for multiple image frames in the stream.

Accordingly, Image 1 and Image 2 may each be one frame in a sequence of image frames. The processing depicted in FIG. 14 may be repeated on successive image frames in the sequence until image frames are processed that contain feature points providing a suitable image from which to form persistent spatial information. Alternatively or additionally, the processing of FIG. 14 may be repeated as a user moves such that the user is no longer close enough to a previously identified PCF to reliably use that PCF for determining positions with respect to the physical world. For example, an XR system may maintain a current PCF for a user. When the distance between the user and the current PCF exceeds a threshold, the system may switch to a new current PCF, closer to the user, which may be generated according to the process of FIG. 14 using image frames acquired at the user's current location.
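The threshold-based switch of the current PCF can be sketched as follows. The function name `select_current_pcf`, the dictionary of PCF positions, the 3 m threshold, and the convention of returning `None` to signal that a new PCF should be generated are all assumptions for illustration, not details from the description.

```python
import math
from typing import Dict, Optional, Tuple

Vec3 = Tuple[float, float, float]


def select_current_pcf(
    user_pos: Vec3,
    current_id: Optional[str],
    pcfs: Dict[str, Vec3],
    threshold_m: float,
) -> Optional[str]:
    """Keep the current PCF while it is within range, else pick the nearest one.

    Returns None when no known PCF is within the threshold, which a caller
    could treat as a cue to generate a new PCF at the user's location.
    """
    if current_id is not None and current_id in pcfs:
        if math.dist(user_pos, pcfs[current_id]) <= threshold_m:
            return current_id
    if not pcfs:
        return None
    nearest = min(pcfs, key=lambda pcf_id: math.dist(user_pos, pcfs[pcf_id]))
    if math.dist(user_pos, pcfs[nearest]) <= threshold_m:
        return nearest
    return None


# Hypothetical PCF positions in a shared map frame, in meters.
known_pcfs = {"pcf_a": (0.0, 0.0, 0.0), "pcf_b": (10.0, 0.0, 0.0)}

stay = select_current_pcf((1.0, 0.0, 0.0), "pcf_a", known_pcfs, 3.0)
switch = select_current_pcf((9.0, 0.0, 0.0), "pcf_a", known_pcfs, 3.0)
need_new = select_current_pcf((5.0, 0.0, 0.0), "pcf_a", known_pcfs, 3.0)
```

Keeping the current PCF while it remains in range avoids needless switching when two PCFs are at similar distances from the user.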

Even when generating a single PCF, a stream of image frames may be processed to identify image frames depicting content in the physical world that is likely to be stable and can be readily identified by a device in the vicinity of the region of the physical world depicted in the image frame. In the embodiment of FIG. 14, this processing begins with the identification of features 1120 in the image. Features may be identified, for example, by finding locations of gradients in the image above a threshold, or other characteristics, which may correspond, for example, to a corner of an object. In the illustrated embodiment, the features are points, but other recognizable features, such as edges, may alternatively or additionally be used.
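Finding image locations whose gradient exceeds a threshold can be sketched on a tiny grayscale grid. This is a deliberately simplified stand-in using central differences, not a production corner detector; the function name `find_features`, the test image, and the threshold value are assumptions made for the example.

```python
import math
from typing import List, Tuple

Image = List[List[float]]
Feature = Tuple[int, int, float]  # (row, col, gradient magnitude)


def find_features(image: Image, threshold: float) -> List[Feature]:
    """Return interior pixels whose central-difference gradient exceeds threshold."""
    features: List[Feature] = []
    rows, cols = len(image), len(image[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude > threshold:
                features.append((r, c, magnitude))
    return features


# A bright square occupying the lower-right of a dark image.
test_image: Image = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]

# Edge pixels have a gradient of 4.5 along one axis only; the corner pixel has
# 4.5 along both axes, so a threshold of 5.0 keeps just the corner.
corner_features = find_features(test_image, threshold=5.0)
```

With this threshold the straight edges of the square are rejected and only the pixel at the square's corner survives, which matches the intuition that corners are more distinctive than edges.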

In the illustrated embodiment, a fixed number, N, of features 1120 are selected for further processing. Those feature points may be selected based on one or more criteria, such as the magnitude of the gradient or proximity to other feature points. Alternatively or additionally, the feature points may be selected heuristically, such as based on characteristics that suggest the feature points are persistent. For example, heuristics may be defined based on the characteristics of feature points that are likely to correspond to a corner of a window or a door or a large piece of furniture. Such heuristics may take into account the feature point itself and what surrounds it. As a specific example, the number of feature points per image may be between 100 and 500, or between 150 and 250, such as 200.
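One way to pick a fixed number N of feature points using both criteria named above (gradient magnitude and proximity to other feature points) is a greedy pass over candidates sorted by strength. The greedy strategy, the function name `select_features`, and the minimum-spacing parameter are assumptions for illustration only.

```python
import math
from typing import List, Tuple

Feature = Tuple[float, float, float]  # (row, col, gradient magnitude)


def select_features(
    candidates: List[Feature], n: int, min_spacing: float
) -> List[Feature]:
    """Greedily keep up to n strongest features that are not crowded together."""
    selected: List[Feature] = []
    if n <= 0:
        return selected
    # Visit the strongest gradients first.
    for cand in sorted(candidates, key=lambda f: f[2], reverse=True):
        far_enough = all(
            math.hypot(cand[0] - s[0], cand[1] - s[1]) >= min_spacing
            for s in selected
        )
        if far_enough:
            selected.append(cand)
            if len(selected) == n:
                break
    return selected


# Hypothetical candidates; two pairs sit almost on top of each other.
candidates = [
    (10.0, 10.0, 9.0),
    (11.0, 10.0, 8.5),
    (40.0, 40.0, 7.0),
    (80.0, 20.0, 6.0),
    (81.0, 21.0, 5.5),
]

chosen = select_features(candidates, n=3, min_spacing=5.0)
```

Enforcing a minimum spacing spreads the N points across the image, which tends to make a later alignment better conditioned than N points clustered on one object.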

Regardless of the number of feature points selected, descriptors 1130 may be computed for the feature points. In this example, a descriptor is computed for each selected feature point, but a descriptor may be computed for groups of feature points, for a subset of the feature points, or for all features within an image. The descriptor characterizes a feature point such that feature points representing the same object in the physical world are assigned similar descriptors. The descriptors may facilitate alignment of two frames, such as may occur when one map is localized with respect to another. Rather than searching for a relative orientation of the frames that minimizes the distance between feature points of the two images, an initial alignment of the two frames may be made by identifying feature points with similar descriptors. Alignment of the image frames may be based on aligning points with similar descriptors, which may entail less processing than computing an alignment of all the feature points in the images.
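Identifying feature points with similar descriptors can be sketched as a nearest-neighbor search in descriptor space. The two-dimensional descriptors, the Euclidean distance metric, and the `max_distance` cutoff below are assumptions for the example; real descriptors are typically much higher-dimensional.

```python
import math
from typing import List, Sequence, Tuple

Descriptor = Sequence[float]


def match_descriptors(
    desc_a: List[Descriptor], desc_b: List[Descriptor], max_distance: float
) -> List[Tuple[int, int]]:
    """For each descriptor in A, find the closest descriptor in B within range.

    Returns (index_in_a, index_in_b) pairs; points with no close match are skipped.
    """
    matches: List[Tuple[int, int]] = []
    for i, da in enumerate(desc_a):
        best_j = None
        best_d = max_distance
        for j, db in enumerate(desc_b):
            d = math.dist(da, db)
            if d <= best_d:
                best_j, best_d = j, d
        if best_j is not None:
            matches.append((i, best_j))
    return matches


# Hypothetical descriptors for feature points in two frames.
frame_a = [(0.1, 0.9), (0.8, 0.2), (0.5, 0.5)]
frame_b = [(0.82, 0.18), (0.12, 0.88), (0.0, 0.0)]

matches = match_descriptors(frame_a, frame_b, max_distance=0.1)
```

The matched pairs could then seed an initial alignment of the two frames, so that only a handful of correspondences, rather than every feature point, has to be considered.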

The descriptors may be computed as a mapping of the feature points, or, in some embodiments, a mapping of a patch of an image around a feature point, to a descriptor. The descriptor may be a numeric quantity. U.S. patent application Ser. No. 16/190,948 describes computing descriptors for feature points and is incorporated herein by reference in its entirety.

In the example of FIG. 14, a descriptor 1130 is computed for each feature point in each image frame. Based on the descriptors and/or the feature points and/or the image itself, the image frame may be identified as a keyframe 1140. In the illustrated embodiment, a keyframe is an image frame meeting certain criteria that is then selected for further processing. In making a tracking map, for example, image frames that add meaningful information to the map may be selected as keyframes that are integrated into the map. On the other hand, image frames that substantially overlap a region for which an image frame has already been integrated into the map may be discarded such that they do not become keyframes. Alternatively or additionally, keyframes may be selected based on the number and/or type of feature points in the image frame. In the embodiment of FIG. 14, keyframes 1150 selected for inclusion in a tracking map may also be treated as keyframes for determining a PCF, but different or additional criteria for selecting keyframes for generation of a PCF may be used.
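
One way to picture the overlap criterion is the sketch below. It assumes, hypothetically, that each image frame is summarized as a set of feature identifiers and that "substantially overlap" means that more than `max_overlap` of a frame's features are already in the map; both the representation and the thresholds are illustrative assumptions, not values from this description.

```python
def select_keyframes(frames, max_overlap=0.7, min_features=3):
    """Return indices of frames that add enough new features to the map.

    frames: list of sets of feature identifiers, in capture order.
    """
    mapped = set()      # features already integrated into the map
    keyframes = []
    for index, features in enumerate(frames):
        if len(features) < min_features:
            continue    # too few feature points to be useful
        overlap = len(features & mapped) / len(features)
        if overlap <= max_overlap:
            keyframes.append(index)
            mapped |= features
    return keyframes


frames = [{1, 2, 3, 4}, {1, 2, 3, 5}, {6, 7, 8, 9}, {6, 7}]
selected = select_keyframes(frames)
```

Here the second frame is discarded because three of its four features are already mapped, and the last frame is discarded for having too few features.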

Though FIG. 14 shows that keyframes are used for further processing, information acquired from an image may be processed in other forms. For example, the feature points, such as in a keyrig, may alternatively or additionally be processed. Moreover, though a keyframe is described as being derived from a single image frame, it is not necessary that there be a one-to-one relationship between a keyframe and an acquired image frame. A keyframe may, for example, be acquired from multiple image frames, such as by stitching together or aggregating the image frames such that only features appearing in multiple images are retained in the keyframe.

A keyframe may include image information and/or metadata associated with the image information. In some embodiments, images captured by the cameras 44, 46 (FIG. 9) may be computed into one or more keyframes (e.g., keyframes 1, 2). In some embodiments, a keyframe may include a camera pose. In some embodiments, a keyframe may include one or more camera images captured at the camera pose. In some embodiments, an XR system may determine that a portion of the camera images captured at the camera pose is not useful and thus not include that portion in a keyframe. Therefore, using keyframes to align new images with earlier knowledge of a scene reduces the use of computational resources of the XR system. In some embodiments, a keyframe may include an image and/or image data at a location with a direction/angle. In some embodiments, a keyframe may include a location and a direction from which one or more map points may be observed. In some embodiments, a keyframe may include a coordinate frame with an ID. U.S. Patent Application No. 15/877,359 describes keyframes and is hereby incorporated herein by reference in its entirety.

Some or all of the keyframes 1140 may be selected for further processing, such as the generation of a persistent pose 1150 for the keyframe. The selection may be based on the characteristics of all, or a subset of, the feature points in the image frame. Those characteristics may be determined from processing the descriptors, the features, and/or the image frame itself. As a specific example, the selection may be based on a cluster of feature points identified as likely to relate to a persistent object.

Each keyframe is associated with a pose of the camera at which that keyframe was acquired. For keyframes selected for processing into a persistent pose, that pose information may be saved along with other metadata about the keyframe, such as a WiFi fingerprint and/or GPS coordinates at the time of acquisition and/or at the location of acquisition.

The persistent poses are a source of information that a device may use to orient itself relative to previously acquired information about the physical world. For example, if the keyframe from which a persistent pose was created is incorporated into a map of the physical world, a device may orient itself relative to that persistent pose using a sufficient number of feature points in the keyframe that are associated with the persistent pose. The device may align a current image that it takes of its surroundings to the persistent pose. This alignment may be based on matching the current image to the image 1110, the features 1120, and/or the descriptors 1130 that gave rise to the persistent pose, or to any subset of that image or those features or descriptors. In some embodiments, the current image frame that is matched to the persistent pose may be another keyframe that has been incorporated into the device's tracking map.

Information about a persistent pose may be stored in a format that facilitates sharing among multiple applications, which may be executing on the same or different devices. In the example of FIG. 14, some or all of the persistent poses may be reflected as a persistent coordinate frame (PCF) 1160. Like a persistent pose, a PCF may be associated with a map and may comprise a set of features, or other information, that a device can use to determine its orientation with respect to that PCF. The PCF may include a transformation that defines its transformation with respect to the origin of its map, such that, by correlating its position to a PCF, the device can determine its position with respect to any objects in the physical world reflected in the map.
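
The role of the PCF's transformation can be illustrated with homogeneous transforms. The sketch below is an assumption-laden example, not the described implementation: it restricts rotation to yaw about the vertical axis for brevity, and the names `map_from_pcf` and `pcf_from_device` are hypothetical. A device that knows its pose relative to a PCF, together with the PCF's transform to the map origin, can express any mapped object in its own frame.

```python
import math


def rigid(yaw, tx, ty, tz):
    """4x4 rigid transform: rotation by yaw about z plus a translation."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a @ b for 4x4 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert(t):
    """Inverse of a rigid transform: rotation R^T, translation -R^T t."""
    rot_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def apply(t, point):
    """Apply a 4x4 transform to a 3D point."""
    return tuple(sum(t[i][k] * point[k] for k in range(3)) + t[i][3]
                 for i in range(3))


map_from_pcf = rigid(math.pi / 2, 2.0, 0.0, 0.0)  # stored with the PCF
pcf_from_device = rigid(0.0, 1.0, 0.0, 0.0)       # found by localizing to the PCF
map_from_device = compose(map_from_pcf, pcf_from_device)

object_in_map = (2.0, 3.0, 0.0)
object_in_device = apply(invert(map_from_device), object_in_map)
```

In this example the device sits at (2, 1, 0) in the map and faces along the map's y axis, so the object two meters further along y appears two meters straight ahead in the device frame.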

As the PCF provides a mechanism for determining locations with respect to physical objects, an application, such as applications 1180, may define positions of virtual objects with respect to one or more PCFs, which serve as anchors for the virtual content 1170. FIG. 14 illustrates, for example, that App 1 has associated its virtual content 2 with PCF 1,2. Likewise, App 2 has associated its virtual content 3 with PCF 1,2. App 1 is also shown associating its virtual content 1 with PCF 4,5, and App 2 is shown associating its virtual content 4 with PCF 3. In some embodiments, PCF 3 may be based on image 3 (not shown), and PCF 4,5 may be based on image 4 and image 5 (not shown), analogously to how PCF 1,2 is based on image 1 and image 2. When rendering this virtual content, a device may apply one or more transformations to compute information, such as the location of the virtual content with respect to the display of the device and/or the location of physical objects with respect to the desired location of the virtual content. Using the PCFs as a reference may simplify such computations.
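
The associations drawn in FIG. 14 can be written down as a simple lookup table. In this sketch the labels are copied from the paragraph above and each PCF label (for example "PCF 1,2") is treated as a single opaque key; the table layout and the helper name are illustrative assumptions.

```python
# Associations between applications, virtual content, and PCFs as in FIG. 14.
anchors = {
    ("App 1", "virtual content 1"): "PCF 4,5",
    ("App 1", "virtual content 2"): "PCF 1,2",
    ("App 2", "virtual content 3"): "PCF 1,2",
    ("App 2", "virtual content 4"): "PCF 3",
}


def content_anchored_to(pcf_label):
    """List the (application, content) pairs anchored to the given PCF."""
    return sorted(key for key, label in anchors.items() if label == pcf_label)


shared = content_anchored_to("PCF 1,2")
```

A lookup of this kind shows which content from which applications would be affected if a given PCF were adjusted.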

In some embodiments, a persistent pose may be a coordinate location and/or direction that has one or more associated keyframes. In some embodiments, a persistent pose may be automatically created after the user has traveled a certain distance, e.g., three meters. In some embodiments, the persistent poses may act as reference points during localization. In some embodiments, the persistent poses may be stored in a passable world (e.g., the passable world module 538).

In some embodiments, a new PCF may be determined based on a pre-defined distance allowed between adjacent PCFs. In some embodiments, one or more persistent poses may be computed into a PCF when a user travels a pre-determined distance, e.g., five meters. In some embodiments, PCFs may be associated with one or more world coordinate frames and/or canonical coordinate frames, e.g., in the passable world. In some embodiments, PCFs may be stored in a local and/or remote database depending on, for example, security settings.
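
The distance-triggered creation of persistent poses (e.g., every three meters) and PCFs (e.g., every five meters) might be sketched as below. This is a simplified illustration: it measures straight-line distance from the most recently created anchor and creates one at the first sample, both of which are assumptions rather than details from this description.

```python
import math


def spawn_indices(path, spacing):
    """Return indices of path samples at which a new anchor is created.

    path: list of (x, y, z) user positions in meters, in travel order.
    spacing: minimum distance from the last created anchor, in meters.
    """
    if not path:
        return []
    created = [0]
    for index in range(1, len(path)):
        if math.dist(path[index], path[created[-1]]) >= spacing:
            created.append(index)
    return created


# A user walking ten meters in a straight line, sampled every meter.
walk = [(float(x), 0.0, 0.0) for x in range(11)]
persistent_pose_samples = spawn_indices(walk, spacing=3.0)
pcf_samples = spawn_indices(walk, spacing=5.0)
```

With these example spacings, persistent poses are created more densely than PCFs along the same walk.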

FIG. 15 illustrates a method 4700 of establishing and using a persistent coordinate frame, according to some embodiments. The method 4700 may start from capturing (Act 4702) images (e.g., image 1 and image 2 in FIG. 14) of a scene using one or more sensors of an XR device. Multiple cameras may be used, and one camera may generate multiple images, for example, in a stream.

The method 4700 may include extracting (Act 4704) interest points (e.g., map points 702 in FIG. 7, features 1120 in FIG. 14) from the captured images, generating (Act 4706) descriptors (e.g., descriptors 1130 in FIG. 14) for the extracted interest points, and generating (Act 4708) keyframes (e.g., keyframes 1140) based on the descriptors. In some embodiments, the method may compare interest points in the keyframes and form pairs of keyframes that share a predetermined amount of interest points. The method may reconstruct parts of the physical world using individual pairs of keyframes. Mapped parts of the physical world may be saved as 3D features (e.g., keyrig 704 in FIG. 7). In some embodiments, a selected portion of the pairs of keyframes may be used to build 3D features. In some embodiments, results of the mapping may be selectively saved. Keyframes not used for building 3D features may be associated with the 3D features through poses, for example, by representing distances between keyframes with a covariance matrix between poses of the keyframes. In some embodiments, pairs of keyframes may be selected to build the 3D features such that distances between each of the built 3D features are within a predetermined distance, which may be determined to balance the amount of computation needed and the level of accuracy of the resulting model. Such approaches make it possible to provide a model of the physical world with an amount of data that is suitable for efficient and accurate computation with an XR system. In some embodiments, a covariance matrix of two images may include covariances between poses of the two images (e.g., six degrees of freedom).
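
Forming pairs of keyframes that share a predetermined amount of interest points can be sketched as follows. The representation of a keyframe as a set of interest-point identifiers and the `min_shared` threshold are hypothetical choices made for this illustration.

```python
from itertools import combinations


def pair_keyframes(keyframes, min_shared=3):
    """Return pairs of keyframe names sharing at least min_shared points.

    keyframes: dict mapping a keyframe name to its set of interest-point ids.
    """
    pairs = []
    for (name_a, points_a), (name_b, points_b) in combinations(keyframes.items(), 2):
        if len(points_a & points_b) >= min_shared:
            pairs.append((name_a, name_b))
    return pairs


keyframes = {
    "kf1": {1, 2, 3, 4},
    "kf2": {2, 3, 4, 5},
    "kf3": {7, 8, 9},
}
pairs = pair_keyframes(keyframes)
```

Each resulting pair could then be passed to whatever reconstruction step builds the 3D features; that step is outside the scope of this sketch.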

The method 4700 may include generating (Act 4710) persistent poses based on the keyframes. In some embodiments, the method may include generating the persistent poses based on the 3D features reconstructed from pairs of keyframes. In some embodiments, a persistent pose may be attached to a 3D feature. In some embodiments, the persistent pose may include a pose of a keyframe used to construct the 3D feature. In some embodiments, the persistent pose may include an average pose of the keyframes used to construct the 3D feature. In some embodiments, persistent poses may be generated such that distances between neighboring persistent poses are within a predetermined value, for example, in the range of one meter to five meters, any value in between, or any other suitable value. In some embodiments, the distances between neighboring persistent poses may be represented by a covariance matrix of the neighboring persistent poses.
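
The description does not say how an average pose is computed. One common approximation, shown here only as an assumption, averages the positions directly and averages the orientations by summing sign-aligned unit quaternions and renormalizing, which is reasonable when the keyframe orientations are close to one another.

```python
import math


def average_pose(poses):
    """Average a non-empty list of poses.

    poses: list of (position, quaternion) with position (x, y, z) and
    unit quaternion (w, x, y, z). Quaternions are sign-aligned with the
    first one before summing, since q and -q encode the same rotation.
    """
    count = len(poses)
    position = tuple(sum(p[0][i] for p in poses) / count for i in range(3))
    reference = poses[0][1]
    total = [0.0, 0.0, 0.0, 0.0]
    for _, quat in poses:
        dot = sum(a * b for a, b in zip(reference, quat))
        sign = 1.0 if dot >= 0.0 else -1.0
        for i in range(4):
            total[i] += sign * quat[i]
    norm = math.sqrt(sum(c * c for c in total))
    return position, tuple(c / norm for c in total)


identity = (1.0, 0.0, 0.0, 0.0)
keyframe_poses = [((0.0, 0.0, 0.0), identity),
                  ((2.0, 0.0, 0.0), identity),
                  ((4.0, 0.0, 0.0), (-1.0, 0.0, 0.0, 0.0))]
mean_position, mean_orientation = average_pose(keyframe_poses)
```

For widely differing orientations a more careful rotation average would be needed; this sketch is not meant to cover that case.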

The method 4700 may include generating (Act 4712) PCFs based on the persistent poses. In some embodiments, a PCF may be attached to a 3D feature. In some embodiments, a PCF may be associated with one or more persistent poses. In some embodiments, a PCF may include a pose of one of the associated persistent poses. In some embodiments, a PCF may include an average pose of the poses of the associated persistent poses. In some embodiments, PCFs may be generated such that distances between neighboring PCFs are within a predetermined value, for example, in the range of three meters to ten meters, any value in between, or any other suitable value. In some embodiments, the distances between neighboring PCFs may be represented by a covariance matrix of the neighboring PCFs. In some embodiments, PCFs may be exposed to XR applications via, for example, an application programming interface (API) such that the XR applications can access a model of the physical world through the PCFs without accessing the model itself.

The method 4700 may include associating (Act 4714) image data of a virtual object to be displayed by the XR device with at least one of the PCFs. In some embodiments, the method may include computing translations and orientations of the virtual object with respect to the associated PCF. It should be appreciated that it is not necessary to associate a virtual object with a PCF generated by the device placing the virtual object. For example, a device may retrieve saved PCFs in a canonical map in a cloud and associate a virtual object with a retrieved PCF. It should be appreciated that the virtual object may move with the associated PCF as the PCF is adjusted over time.
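
The statement that a virtual object moves with its associated PCF follows from storing the object's translation and orientation relative to the PCF rather than relative to the map. The sketch below illustrates this with planar poses (x, y, yaw) for brevity; the `AnchoredObject` class and the dictionary of PCF poses are hypothetical.

```python
import math


def compose(parent, child):
    """Express a planar pose (x, y, yaw) given in parent's frame in the map frame."""
    px, py, pyaw = parent
    cx, cy, cyaw = child
    c, s = math.cos(pyaw), math.sin(pyaw)
    return (px + c * cx - s * cy, py + s * cx + c * cy, pyaw + cyaw)


class AnchoredObject:
    """Virtual object stored as an offset relative to a named PCF."""

    def __init__(self, pcf_id, offset):
        self.pcf_id = pcf_id
        self.offset = offset  # (x, y, yaw) relative to the PCF

    def pose_in_map(self, pcf_poses):
        return compose(pcf_poses[self.pcf_id], self.offset)


pcf_poses = {"pcf1": (1.0, 0.0, 0.0)}
lamp = AnchoredObject("pcf1", (0.5, 0.0, 0.0))
before = lamp.pose_in_map(pcf_poses)

# The PCF is adjusted (for example after a map update); the offset is unchanged.
pcf_poses["pcf1"] = (1.0, 0.2, 0.0)
after = lamp.pose_in_map(pcf_poses)
```

Because only the PCF pose changed, the object's map pose shifts by the same adjustment without the application having to update anything.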

FIG. 16 illustrates the first XR device 12.1 and vision data and algorithms of a second XR device 12.2 and the server 20, according to some embodiments. The components illustrated in FIG. 16 may operate to perform some or all of the operations associated with generating, updating, and/or using spatial information, such as persistent poses, persistent coordinate frames, tracking maps, or canonical maps, as described herein. Although not illustrated, the first XR device 12.1 may be configured the same as the second XR device 12.2. The server 20 may have a map storing routine 118, a canonical map 120, a map transmitter 122, and a map merge algorithm 124.

The second XR device 12.2, which may be in the same scene as the first XR device 12.1, may include a persistent coordinate frame (PCF) integration unit 1300, an application 1302 that generates the image data 68 that may be used to render a virtual object, and a frame embedding generator 308 (see FIG. 21). In some embodiments, a map download system 126, a PCF identification system 128, Map 2, a localization module 130, a canonical map incorporator 132, a canonical map 133, and a map publisher 136 may be grouped into a passable world unit 1304. The PCF integration unit 1300 may be connected to the passable world unit 1304 and other components of the second XR device 12.2 to allow for the retrieval, generation, use, upload, and download of PCFs.

A map comprising PCFs may enable more persistence in a changing world. In some embodiments, localizing a tracking map, including, for example, matching features for images, may include selecting features that represent persistent content from the map constituted by PCFs, which enables fast matching and/or localizing. For example, in a world where people move into and out of the scene and objects such as doors move relative to the scene, mapping the scene with individual PCFs and their relationships relative to one another (e.g., an integrated constellation of PCFs) requires less storage space and lower transmission rates.

In some embodiments, the PCF integration unit 1300 may include PCFs 1306 that were previously stored in a data store on a storage unit of the second XR device 12.2, a PCF tracker 1308, a persistent pose acquirer 1310, a PCF checker 1312, a PCF generation system 1314, a coordinate frame calculator 1316, a persistent pose calculator 1318, and three transformers, including a tracking map and persistent pose transformer 1320, a persistent pose and PCF transformer 1322, and a PCF and image data transformer 1324.

In some embodiments, the PCF tracker 1308 may have an on-prompt and an off-prompt that are selectable by the application 1302. The application 1302 may be executable by a processor of the second XR device 12.2 to, for example, display virtual content. The application 1302 may have a call that switches the PCF tracker 1308 on via the on-prompt. The PCF tracker 1308 may generate PCFs when the PCF tracker 1308 is switched on. The application 1302 may have a subsequent call that can switch the PCF tracker 1308 off via the off-prompt. The PCF tracker 1308 terminates PCF generation when the PCF tracker 1308 is switched off.
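
The on-prompt and off-prompt behavior amounts to a switch that gates PCF generation. A minimal sketch, with a hypothetical class and method names that are not part of any described interface, could look like this:

```python
class PCFTracker:
    """Toy tracker that records candidate PCFs only while switched on."""

    def __init__(self):
        self.enabled = False
        self.generated = []

    def on_prompt(self):
        self.enabled = True

    def off_prompt(self):
        self.enabled = False

    def update(self, candidate_pcf):
        """Record the candidate if the tracker is on; report whether it was kept."""
        if self.enabled:
            self.generated.append(candidate_pcf)
        return self.enabled


tracker = PCFTracker()
tracker.update("pcf-a")   # ignored: tracker is off by default in this sketch
tracker.on_prompt()       # call made by the application
tracker.update("pcf-b")   # recorded
tracker.off_prompt()      # subsequent call made by the application
tracker.update("pcf-c")   # ignored again
```

Starting in the off state is a choice made for this example; the description does not state the tracker's default.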

In some embodiments, the server 20 may include a plurality of persistent poses 1332 and a plurality of PCFs 1330 that have previously been saved in association with the canonical map 120. The map transmitter 122 may transmit the canonical map 120 together with the persistent poses 1332 and/or the PCFs 1330 to the second XR device 12.2. The persistent poses 1332 and PCFs 1330 may be stored in association with the canonical map 133 on the second XR device 12.2. When Map 2 localizes to the canonical map 133, the persistent poses 1332 and the PCFs 1330 may be stored in association with Map 2.

In some embodiments, the persistent pose acquirer 1310 may acquire the persistent poses for Map 2. The PCF checker 1312 may be connected to the persistent pose acquirer 1310. The PCF checker 1312 may retrieve PCFs from the PCFs 1306 based on the persistent poses retrieved by the persistent pose acquirer 1310. The PCFs retrieved by the PCF checker 1312 may form an initial group of PCFs that are used for PCF-based image display.

In some embodiments, the application 1302 may require additional PCFs to be generated. For example, if a user moves to an area that has not previously been mapped, the application 1302 may switch the PCF tracker 1308 on. The PCF generation system 1314 may be connected to the PCF tracker 1308 and begin to generate PCFs based on Map 2 as Map 2 begins to expand. The PCFs generated by the PCF generation system 1314 may form a second group of PCFs that may be used for PCF-based image display.

The coordinate frame calculator 1316 may be connected to the PCF checker 1312. After the PCF checker 1312 has retrieved PCFs, the coordinate frame calculator 1316 may invoke the head coordinate frame 96 to determine a head pose of the second XR device 12.2. The coordinate frame calculator 1316 may also invoke the persistent pose calculator 1318. The persistent pose calculator 1318 may be directly or indirectly connected to the frame embedding generator 308. In some embodiments, an image/frame may be designated a keyframe after a threshold distance from the previous keyframe, e.g., three meters, has been traveled. The persistent pose calculator 1318 may generate a persistent pose based on a plurality of keyframes, for example, three keyframes. In some embodiments, the persistent pose may be essentially an average of the coordinate frames of the plurality of keyframes.

The tracking map and persistent pose transformer 1320 may be connected to Map 2 and the persistent pose calculator 1318. The tracking map and persistent pose transformer 1320 may transform Map 2 to the persistent pose to determine the persistent pose at an origin relative to Map 2.

The persistent pose and PCF transformer 1322 may be connected to the tracking map and persistent pose transformer 1320 and further to the PCF checker 1312 and the PCF generation system 1314. The persistent pose and PCF transformer 1322 may transform the persistent pose (to which the tracking map has been transformed) to the PCFs from the PCF checker 1312 and the PCF generation system 1314 to determine the PCFs relative to the persistent pose.

The PCF and image data transformer 1324 may be connected to the persistent pose and PCF transformer 1322 and to the data channel 62. The PCF and image data transformer 1324 transforms the PCFs to the image data 68. The rendering engine 30 may be connected to the PCF and image data transformer 1324 to display the image data 68 to the user relative to the PCFs.

The PCF integration unit 1300 may store the additional PCFs that are generated with the PCF generation system 1314 within the PCFs 1306. The PCFs 1306 may be stored relative to persistent poses. The map publisher 136 may retrieve the PCFs 1306 and the persistent poses associated with the PCFs 1306 when the map publisher 136 transmits Map 2 to the server 20, so that the map publisher 136 also transmits the PCFs and persistent poses associated with Map 2 to the server 20. When the map storing routine 118 of the server 20 stores Map 2, the map storing routine 118 may also store the persistent poses and PCFs generated by the second viewing device 12.2. The map merge algorithm 124 may create the canonical map 120 with the persistent poses and PCFs of Map 2 associated with the canonical map 120 and stored within the persistent poses 1332 and PCFs 1330, respectively.

The first XR device 12.1 may include a PCF integration unit similar to the PCF integration unit 1300 of the second XR device 12.2. When the map transmitter 122 transmits the canonical map 120 to the first XR device 12.1, the map transmitter 122 may transmit the persistent poses 1332 and PCFs 1330 associated with the canonical map 120 and originating from the second XR device 12.2. The first XR device 12.1 may store the PCFs and the persistent poses within a data store on a storage device of the first XR device 12.1. The first XR device 12.1 may then make use of the persistent poses and the PCFs originating from the second XR device 12.2 for image display relative to the PCFs. Additionally or alternatively, the first XR device 12.1 may retrieve, generate, make use of, upload, and download PCFs and persistent poses in a manner similar to the second XR device 12.2, as described above.

In the illustrated example, the first XR device 12.1 generates a local tracking map (referred to hereinafter as "Map 1"), and the map storing routine 118 receives Map 1 from the first XR device 12.1. The map storing routine 118 then stores Map 1 on a storage device of the server 20 as the canonical map 120.

The second XR device 12.2 includes a map download system 126, an anchor identification system 128, a localization module 130, a canonical map incorporator 132, a local content position system 134, and a map publisher 136.

In use, the map transmitter 122 sends the canonical map 120 to the second XR device 12.2, and the map download system 126 downloads the canonical map 120 from the server 20 and stores it as a canonical map 133.

The anchor identification system 128 is connected to the world surface determining routine 78. The anchor identification system 128 identifies anchors based on objects detected by the world surface determining routine 78. The anchor identification system 128 generates a second map (Map 2) using the anchors. As indicated by the cycle 138, the anchor identification system 128 continues to identify anchors and continues to update Map 2. The locations of the anchors are recorded as three-dimensional data based on data provided by the world surface determining routine 78. The world surface determining routine 78 receives images from the real object detection camera 44 and depth data from the depth sensors 135 to determine the locations of surfaces and their relative distance from the depth sensors 135.

The localization module 130 is connected to the canonical map 133 and Map 2. The localization module 130 repeatedly attempts to localize Map 2 to the canonical map 133. The canonical map incorporator 132 is also connected to the canonical map 133 and Map 2. When the localization module 130 localizes Map 2 to the canonical map 133, the canonical map incorporator 132 incorporates the canonical map 133 into the anchors of Map 2. Map 2 is then updated with missing data that is included in the canonical map.

The local content position system 134 is connected to Map 2. The local content position system 134 may be, for example, a system by which a user can place local content at a particular location within a world coordinate frame. The local content itself is then attached to one anchor of Map 2. The local-to-world coordinate transformer 104 transforms the local coordinate frame to the world coordinate frame based on the settings of the local content position system 134. The functions of the rendering engine 30, the display system 42, and the data channel 62 have been described with reference to FIG. 2.

The map publisher 136 uploads Map 2 to the server 20. The map storage routine 118 of the server 20 then stores Map 2 in a storage medium of the server 20.

The map merge algorithm 124 merges Map 2 with the canonical map 120. When more than two maps, for example three or four maps, relating to the same or adjacent regions of the physical world have been stored, the map merge algorithm 124 merges all the maps into the canonical map 120 to render a new canonical map 120. The map transmitter 122 then transmits the new canonical map 120 to any and all devices 12.1 and 12.2 that are in an area represented by the new canonical map 120. When the devices 12.1 and 12.2 localize their respective maps to the canonical map 120, the canonical map 120 becomes the promoted map.

FIG. 17 illustrates an example of generating keyframes for a map of a scene, according to some embodiments. In the illustrated example, a first keyframe KF1 is generated for a door on the left wall of the room. A second keyframe KF2 is generated for an area in the corner where the floor, the left wall, and the right wall of the room meet. A third keyframe KF3 is generated for an area of a window on the right wall of the room. A fourth keyframe KF4 is generated for an area at the edge of a rug on the floor. A fifth keyframe KF5 is generated for the area of the rug closest to the user.

FIG. 18 illustrates an example of generating persistent poses for the map of FIG. 17, according to some embodiments. In some embodiments, a new persistent pose (PP) is created when the device measures that it has traveled a threshold distance and/or when an application requests a new persistent pose. In some embodiments, the threshold distance may be 3 meters, 5 meters, 20 meters, or any other suitable distance. Selecting a smaller threshold distance (e.g., 1 m) may increase the compute load, because more PPs may be created and managed than with a larger threshold distance. Selecting a larger threshold distance (e.g., 40 m) may increase virtual content placement error: fewer PPs are created, which results in fewer PCFs, so virtual content attached to a PCF may be a relatively large distance (e.g., 30 m) away from the PCF, and the error may grow as the distance from the PCF to the virtual content increases.

In some embodiments, a PP may be created at the start of a new session. This initial PP may be thought of as zero and can be visualized as the center of a circle whose radius equals the threshold distance. When the device reaches the perimeter of the circle and, in some embodiments, an application requests a new PP, a new PP may be placed at the current location of the device (at the threshold distance). In some embodiments, a new PP will not be created at the threshold distance if the device is able to find an existing PP within the threshold distance of the device's new position. In some embodiments, when a new PP (e.g., PP 1150 in FIG. 14) is created, the device attaches one or more of the closest keyframes to the PP. In some embodiments, the location of the PP relative to the keyframes may be based on the location of the device at the time the PP is created. In some embodiments, a PP will not be created when the device travels the threshold distance unless an application requests a PP.
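The creation rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class name `PersistentPoseTracker`, its fields, and the use of straight-line Euclidean distance between 3D positions are all hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass, field


@dataclass
class PersistentPoseTracker:
    """Hypothetical helper that decides when to create a persistent pose (PP)."""
    threshold_m: float = 3.0                     # threshold distance (assumed value)
    poses: list = field(default_factory=list)    # PP positions as (x, y, z) tuples

    def start_session(self, position):
        # An initial PP is placed where the session starts.
        self.poses.append(tuple(position))

    def update(self, position, app_requested=True):
        """Return the new PP position if one is created here, else None."""
        # In some embodiments no PP is created unless an application asks for one.
        if not app_requested:
            return None
        # Reuse an existing PP if one lies within the threshold distance.
        if any(math.dist(position, pp) < self.threshold_m for pp in self.poses):
            return None
        new_pp = tuple(position)
        self.poses.append(new_pp)
        return new_pp


tracker = PersistentPoseTracker(threshold_m=3.0)
tracker.start_session((0.0, 0.0, 0.0))
inside = tracker.update((1.0, 0.0, 0.0))       # still inside the circle
at_edge = tracker.update((3.0, 0.0, 0.0))      # reached the perimeter
nearby = tracker.update((4.0, 0.0, 0.0))       # an existing PP is close enough
```

A smaller `threshold_m` makes `update` create PPs more often, which mirrors the compute-load versus placement-error tradeoff discussed for FIG. 18.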

In some embodiments, an application may request a PCF from the device when the application has virtual content to display to the user. The PCF request from the application may trigger a PP request, and a new PP will be created after the device travels the threshold distance. FIG. 18 illustrates a first persistent pose PP1, which may have the closest keyframes (e.g., KF1, KF2, and KF3) attached to it, for example by computing the relative poses between the keyframes and the persistent pose. FIG. 18 also illustrates a second persistent pose PP2, which may have the closest keyframes (e.g., KF4 and KF5) attached to it.

FIG. 19 illustrates an example of generating a PCF for the map of FIG. 17, according to some embodiments. In the illustrated example, PCF 1 may include PP1 and PP2. As described above, a PCF may be used for displaying image data relative to the PCF. In some embodiments, each PCF may have coordinates in another coordinate frame (e.g., a world coordinate frame) and a PCF descriptor that, for example, uniquely identifies the PCF. In some embodiments, the PCF descriptor may be computed based on feature descriptors of features in frames associated with the PCF. In some embodiments, various constellations of PCFs may be combined to represent the real world in a persistent manner that requires less data and less data transmission.

FIGS. 20A-20C are schematic diagrams illustrating an example of establishing and using a persistent coordinate frame. FIG. 20A shows two users 4802A, 4802B with respective local tracking maps 4804A, 4804B that have not been localized to a canonical map. The origins 4806A, 4806B for the individual users are depicted by the coordinate system (e.g., a world coordinate system) in their respective areas. These origins of each tracking map may be local to each user, because the origins depend on the orientation of the respective device when tracking was initiated.

As the sensors of the user device scan the environment, the device may capture images that, as described above in connection with FIG. 14, contain features representing persistent objects, such that those images may be classified as keyframes from which a persistent pose may be created. In this example, the tracking map 4804A includes a persistent pose (PP) 4808A, and the tracking map 4804B includes a PP 4808B.

Also as described above in connection with FIG. 14, some of the PPs may be classified as PCFs, which are used to determine the orientation of virtual content for rendering it to the user. FIG. 20B shows that XR devices worn by the respective users 4802A, 4802B may create local PCFs 4810A, 4810B based on the PPs 4808A, 4808B. FIG. 20C shows that persistent content 4812A, 4812B (e.g., virtual content) may be attached to the PCFs 4810A, 4810B by the respective XR devices.

In this example, virtual content may have a virtual content coordinate frame, which may be used by an application generating the virtual content regardless of how the virtual content is to be displayed. The virtual content may be specified, for example, as surfaces, such as triangles of a mesh, at particular locations and angles with respect to the virtual content coordinate frame. To render that virtual content to a user, the locations of those surfaces may be determined with respect to the user who is to perceive the virtual content.

Attaching virtual content to PCFs may simplify the computation involved in determining the location of the virtual content with respect to the user. That location may be determined by applying a series of transformations. Some of those transformations may change and may be updated frequently. Others may be stable and may be updated infrequently or not at all. Regardless, the transformations may be applied with a relatively low computational burden, so that the location of the virtual content can be updated frequently with respect to the user, providing a realistic appearance to the rendered virtual content.

In the example of FIGS. 20A-20C, the device of user 1 has a coordinate system that can be related, by the transformation rig1_T_w1, to the coordinate system that defines the origin of the map. The device of user 2 has a similar transformation rig2_T_w2. These transformations may be expressed as six-degree-of-freedom transformations, specifying the translation and rotation that align the device coordinate system with the map coordinate system. In some embodiments, the transformation may be expressed as two separate transformations, one specifying translation and the other specifying rotation. Accordingly, it should be appreciated that the transformations may be expressed in a form that simplifies computation or otherwise provides an advantage.

The transformations between the origins of the tracking maps and the PCFs identified by the respective user devices are expressed as pcf1_T_w1 and pcf2_T_w2. In this example, the PCF and the PP are identical, so the same transformation also characterizes the PP.

The location of the user device relative to the PCF can therefore be computed by the serial application of these transformations, such as rig1_T_pcf1 = (rig1_T_w1) * (pcf1_T_w1).

As shown in FIG. 20C, the virtual content is located with respect to the PCF by a transformation obj1_T_pcf1. This transformation may be set by the application generating the virtual content, which may receive information from a world reconstruction system describing physical objects with respect to the PCF. To render the virtual content to the user, a transformation to the coordinate system of the user's device is computed. This may be done by relating the virtual content coordinate frame to the origin of the tracking map through the transformation obj1_T_w1 = (obj1_T_pcf1) * (pcf1_T_w1). That transformation may then be related to the user's device through the further transformation rig1_T_w1.

The location of the virtual content may change based on output from the application generating the virtual content. When it changes, the end-to-end transformation from the source coordinate system to the destination coordinate system may be recomputed. In addition, the location and/or head pose of the user may change as the user moves. As a result, the transformation rig1_T_w1 may change, as would any end-to-end transformation that depends on the location or head pose of the user.

The transformation rig1_T_w1 may be updated with motion of the user, based on tracking the position of the user with respect to stationary objects in the physical world. Such tracking may be performed by a head pose tracking component that processes a sequence of images, as described above, or by another component of the system. Such updates may be made by determining the pose of the user with respect to a stationary frame of reference, such as a PP.

In some embodiments, the location and orientation of a user device may be determined relative to the nearest persistent pose or, in this example, relative to a PCF, because the PP is used as a PCF. Such a determination may be made by identifying, in current images captured with sensors on the device, feature points that characterize the PP. Using image processing techniques such as stereoscopic image analysis, the location of the device with respect to those feature points may be determined. From this data, the system can calculate the change in the transformation associated with the user's motion based on the relationship rig1_T_pcf1 = (rig1_T_w1) * (pcf1_T_w1).

A system may determine and apply transformations in an order that is computationally efficient. For example, the need to compute rig1_T_w1 from a measurement yielding rig1_T_pcf1 may be avoided by both tracking user pose relative to, and defining the location of virtual content relative to, the PP or a PCF built on the persistent pose. In this way, the transformation from the source coordinate system of the virtual content to the destination coordinate system of the user's device may be based on the measured transformation according to the expression (rig1_T_pcf1) * (obj1_T_pcf1), where the first transformation is measured by the system and the second is supplied by the application specifying the virtual content for rendering. In embodiments in which the virtual content is positioned with respect to the origin of the map, the end-to-end transformation may relate the virtual object coordinate system to the PCF coordinate system based on a further transformation between the map coordinates and the PCF coordinates. In embodiments in which the virtual content is positioned with respect to a different PP or PCF than the one against which user position is being tracked, a transformation between the two may be applied. Such a transformation may be fixed and may be determined, for example, from a map in which both appear.
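The transformation chains discussed for FIGS. 20A-20C can be illustrated with a short Python sketch. It rests on assumptions that the text does not state: a name of the form a_T_b is taken to be a rigid transform that maps coordinates in frame b into frame a, rotations are limited to yaw for brevity, and the helper names (`make_transform`, `compose`, `invert`, `apply`) are hypothetical. Under that convention obj1_T_w1 = (obj1_T_pcf1) * (pcf1_T_w1) composes directly, whereas the products written for rig1_T_pcf1 need one factor inverted, which the sketch makes explicit.

```python
import math


def make_transform(yaw_rad, translation):
    """Build a 4x4 rigid transform (rotation about z, then translation)."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def compose(a, b):
    """Matrix product a * b: apply b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert(t):
    """Invert a rigid transform: rotation R^T, translation -R^T p."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [t[i][3] for i in range(3)]
    inv_p = [-sum(r_t[i][k] * p[k] for k in range(3)) for i in range(3)]
    return [r_t[0] + [inv_p[0]], r_t[1] + [inv_p[1]], r_t[2] + [inv_p[2]],
            [0.0, 0.0, 0.0, 1.0]]


def apply(t, point):
    """Map a 3D point through a 4x4 transform."""
    v = [point[0], point[1], point[2], 1.0]
    return tuple(sum(t[i][k] * v[k] for k in range(4)) for i in range(3))


# Illustrative values only (assumed, not taken from the figures).
rig1_T_w1 = make_transform(0.3, (1.0, 2.0, 0.0))     # map origin -> device
pcf1_T_w1 = make_transform(-0.5, (0.5, -1.0, 0.2))   # map origin -> PCF
obj1_T_pcf1 = make_transform(0.1, (0.0, 0.3, 0.0))   # PCF -> virtual content

# Route 1: go through the map origin.
obj1_T_w1 = compose(obj1_T_pcf1, pcf1_T_w1)
rig1_T_obj1_via_map = compose(rig1_T_w1, invert(obj1_T_w1))

# Route 2: use the device pose measured directly against the PCF.
rig1_T_pcf1 = compose(rig1_T_w1, invert(pcf1_T_w1))
rig1_T_obj1_via_pcf = compose(rig1_T_pcf1, invert(obj1_T_pcf1))

vertex_in_content_frame = (0.2, -0.1, 0.4)
vertex_via_map = apply(rig1_T_obj1_via_map, vertex_in_content_frame)
vertex_via_pcf = apply(rig1_T_obj1_via_pcf, vertex_in_content_frame)
```

Both routes place the vertex at the same device-frame location. In the second route, only rig1_T_pcf1 changes as the user moves, and it can be measured against the PCF without first computing rig1_T_w1.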

A transformation-based approach may be implemented, for example, in a device with components that process sensor data to build a tracking map. As part of that process, those components may identify feature points that may be used as persistent poses, which in turn may be turned into PCFs. As described above in connection with FIGS. 17-19, those components may limit the number of persistent poses generated for the map, providing a suitable spacing between persistent poses while allowing the user, regardless of location in the physical environment, to be close enough to a persistent pose location to accurately compute the user's pose. As the persistent pose closest to the user is updated, as a result of user movement, refinements to the tracking map, or other causes, any of the transformations that are used to compute the location of virtual content relative to the user and that depend on the location of the PP (or PCF, if one is being used) may be updated and stored for use, at least until the user moves away from that persistent pose. By computing and storing transformations in this way, the computational burden each time the location of virtual content is updated may be low enough that the update can be performed with relatively low latency.

FIGS. 20A-20C illustrate positioning with respect to a tracking map, with each device having its own tracking map. However, transformations may be generated with respect to any map coordinate system. Persistence of content across user sessions of an XR system may be achieved by using a persistent map. Shared experiences of users may also be facilitated by using a map to which multiple user devices may be oriented.

In some embodiments, described in greater detail below, the location of virtual content may be specified in relation to coordinates in a canonical map, which is formatted such that any of multiple devices may use the map. Each device may maintain a tracking map and may determine the change in the user's pose with respect to the tracking map. In this example, a transformation between the tracking map and the canonical map may be determined through a process of "localization," which may be performed by matching structures in the tracking map (such as one or more persistent poses) to one or more structures of the canonical map (such as one or more PCFs).

Described further below are techniques for creating and using canonical maps in this way.

Deep Keyframes

Techniques as described herein rely on the comparison of image frames. For example, to establish the position of a device with respect to a tracking map, a new image may be captured with sensors worn by the user, and the XR system may search, in the set of images that were used to create the tracking map, for images that share at least a predetermined amount of interest points with the new image. As an example of another scenario involving comparisons of image frames, a tracking map might be localized to a canonical map by first finding image frames associated with a persistent pose in the tracking map that are similar to an image frame associated with a PCF in the canonical map. Alternatively, a transformation between two canonical maps may be computed by first finding similar image frames in the two maps.

Deep keyframes provide a way to reduce the amount of processing required to identify similar image frames. For example, in some embodiments, the comparison may be between image features in a new 2D image (e.g., "2D features") and 3D features in the map. Such a comparison may be made in any suitable way, such as by projecting the 3D images into a 2D plane. A conventional method such as Bag of Words (BoW) searches for the 2D features of a new image in a database containing all the 2D features in a map, which may require significant computing resources, especially when the map represents a large area. The conventional method then locates the images that share at least one of the 2D features with the new image, which may include images that are not useful for locating meaningful 3D features in the map. The conventional method then locates 3D features that are not meaningful with respect to the 2D features in the new image.

The inventors have recognized and appreciated techniques for retrieving images in a map that use fewer memory resources (e.g., a quarter of the memory resources used by BoW), achieve higher efficiency (e.g., 2.5 ms of processing time per keyframe, and 100 μs for comparison against 500 keyframes), and achieve higher accuracy (e.g., 20% better retrieval recall than BoW for a 1,024-dimensional model, and 5% better retrieval recall than BoW for a 256-dimensional model).

To reduce computation, a descriptor may be computed for an image frame and used to compare that image frame to other image frames. The descriptors may be stored instead of, or in addition to, the image frames and feature points. In a map in which persistent poses and/or PCFs may be generated from image frames, the descriptor of the image frame or frames from which each persistent pose or PCF was generated may be stored as part of the persistent pose and/or PCF.

In some embodiments, the descriptor may be computed as a function of feature points in the image frame. In some embodiments, a neural network is configured to compute a unique frame descriptor to represent an image. The image may have a resolution higher than 1 megabyte, such that enough detail of the 3D environment within the field of view of the device worn by the user is captured in the image. The frame descriptor may be much smaller, such as a string of numbers of, for example, 128 bytes to 512 bytes, or any number in between.

In some embodiments, the neural network is trained such that the computed frame descriptors indicate similarity between images. Images in a map may be located by identifying, in a database comprising the images used to generate the map, the nearest images, which may have frame descriptors within a predetermined distance of the frame descriptor for a new image. In some embodiments, the distance between images may be represented by the difference between the frame descriptors of the two images.
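The descriptor-based lookup described above might look like the following Python sketch. The function names, the dictionary used as the "database," and the choice of Euclidean distance as the measure of difference between frame descriptors are assumptions made for illustration; the text only says that distance may be represented by the difference between descriptors.

```python
import math


def descriptor_distance(a, b):
    """Euclidean distance between two frame descriptors (assumed metric)."""
    return math.dist(a, b)


def find_nearest_frames(query, database, max_distance):
    """Return ids of stored frames whose descriptors lie within
    max_distance of the query descriptor, nearest first."""
    scored = [(descriptor_distance(query, desc), frame_id)
              for frame_id, desc in database.items()]
    return [frame_id for dist, frame_id in sorted(scored) if dist <= max_distance]


# Toy 4-number descriptors; real frame descriptors would be far longer.
database = {
    "kf1": [0.0, 0.0, 0.0, 0.0],
    "kf2": [1.0, 0.0, 0.0, 0.0],
    "kf3": [5.0, 5.0, 5.0, 5.0],
}
matches = find_nearest_frames([0.9, 0.0, 0.0, 0.0], database, max_distance=1.0)
```

Because each comparison touches only one short vector per stored frame, the cost of the search grows with the number of keyframes rather than with the total number of 2D features in the map.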

FIG. 21 is a block diagram illustrating a system for generating a descriptor for an individual image, according to some embodiments. In the illustrated example, a frame embedding generator 308 is shown. In some embodiments, the frame embedding generator 308 may be used with the server 20, but it may alternatively or additionally execute, in whole or in part, within one of the XR devices 12.1 and 12.2, or within any other device that processes images for comparison to other images.

In some embodiments, the frame embedding generator may be configured to generate a reduced data representation of an image, from an initial size (e.g., 76,800 bytes) to a final size (e.g., 256 bytes), that is nonetheless indicative of the content in the image despite the reduced size. In some embodiments, the frame embedding generator may be used to generate a data representation for an image that may be a keyframe or a frame used in other ways. In some embodiments, the frame embedding generator 308 may be configured to convert an image at a particular location and orientation into a unique string of numbers (e.g., 256 bytes). In the illustrated example, an image 320 taken by an XR device may be processed by a feature extractor 324 to detect interest points 322 in the image 320. Interest points may or may not be derived from feature points identified as described above for features 1120 (FIG. 14) or as otherwise described herein. In some embodiments, interest points may be represented by descriptors, as described above for descriptors 1130 (FIG. 14), which may be generated using a deep sparse feature method. In some embodiments, each interest point 322 may be represented by a string of numbers (e.g., 32 bytes). For example, there may be n features (e.g., 100), with each feature represented by a string of 32 bytes.

In some embodiments, the frame embedding generator 308 may include a neural network 326. The neural network 326 may include a multi-layer perceptron unit 312 and a maximum (max) pool unit 314. In some embodiments, the multi-layer perceptron (MLP) unit 312 may comprise a multi-layer perceptron, which may be trained. In some embodiments, the interest points 322 (e.g., descriptors for the interest points) may be reduced by the multi-layer perceptron 312 and output as weighted combinations 310 of the descriptors. For example, the MLP may reduce n features to m features, where m is less than n.

In some embodiments, the MLP unit 312 may be configured to perform matrix multiplication. The multi-layer perceptron unit 312 receives the plurality of interest points 322 of the image 320 and converts each interest point into a respective string of numbers (e.g., 256). For example, there may be 100 features, and each feature may be represented by a string of 256 numbers. In this example, a matrix may be created having 100 horizontal rows and 256 vertical columns. Each row may hold a series of 256 numbers that vary in magnitude, some smaller and some larger. In some embodiments, the output of the MLP may be an n × 256 matrix, where n represents the number of features extracted from the image. In some embodiments, the output of the MLP may be an m × 256 matrix, where m is the number of interest points reduced from n.
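The data flow through the MLP unit and the max pool unit can be sketched in plain Python. Everything beyond the shapes given in the text is an assumption: the two-layer structure (32 to 64 to 256), the ReLU activation, the random untrained weights, and a max pool that takes the column-wise maximum over the n rows are illustrative guesses, and `dense_relu` and `max_pool` are hypothetical names.

```python
import random


def dense_relu(rows, weights, biases):
    """Apply one fully connected layer with ReLU to every row independently."""
    n_out = len(biases)
    return [[max(0.0, sum(row[i] * weights[i][j] for i in range(len(row))) + biases[j])
             for j in range(n_out)]
            for row in rows]


def make_layer(n_in, n_out, rng):
    """Random (untrained) weights, used only to demonstrate the shapes."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_out)] for _ in range(n_in)]
    biases = [0.0] * n_out
    return weights, biases


def max_pool(rows):
    """Column-wise maximum over all rows (assumed pooling behavior)."""
    return [max(column) for column in zip(*rows)]


rng = random.Random(0)
n_points, point_dim, out_dim = 100, 32, 256

# n interest-point descriptors, each a string of 32 numbers.
interest_points = [[rng.random() for _ in range(point_dim)] for _ in range(n_points)]

hidden_w, hidden_b = make_layer(point_dim, 64, rng)
output_w, output_b = make_layer(64, out_dim, rng)

hidden = dense_relu(interest_points, hidden_w, hidden_b)
mlp_out = dense_relu(hidden, output_w, output_b)   # n x 256 matrix

frame_descriptor = max_pool(mlp_out)               # single string of 256 numbers
```

Because the pooling step collapses the row dimension, the resulting descriptor has the same length no matter how many interest points were extracted from the image.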

In some embodiments, the MLP 312 may have a training phase, during which model parameters for the MLP are determined, and a use phase. In some embodiments, the MLP may be trained as illustrated in FIG. 25. The input training data may comprise data in sets of three, each set comprising 1) a query image, 2) a positive sample, and 3) a negative sample. The query image may be considered the reference image.

In some embodiments, the positive sample may comprise an image that is similar to the query image. For example, in some embodiments, similar may mean having the same object in both the query image and the positive sample image, but viewed from a different angle. In some embodiments, similar may mean having the same object in both the query image and the positive sample image, but with the object shifted (e.g., left, right, up, or down) relative to the other image.

In some embodiments, the negative sample may comprise an image that is dissimilar to the query image. For example, in some embodiments, a dissimilar image may not contain any object that is prominent in the query image, or may contain only a small portion (e.g., <10%, 1%) of a prominent object in the query image. A similar image, in contrast, may contain most of an object (e.g., >50% or >75%) in the query image.

In some embodiments, points of interest may be extracted from the images in the input training data and may be converted to feature descriptors. These descriptors may be computed both for the training images, as shown in FIG. 25, and for the features extracted during operation of the frame embedding generator 308 of FIG. 21. In some embodiments, a deep sparse feature (DSF) process may be used to generate the descriptors (e.g., DSF descriptors), as described in US Patent Application No. 16/190,948. In some embodiments, the DSF descriptors are of dimension n × 32. The descriptors may then be passed through the model/MLP to create a 256-byte output. In some embodiments, the model/MLP may have the same structure as the MLP 312 such that, once the model parameters are set through training, the resulting trained MLP may be used as the MLP 312.

In some embodiments, the feature descriptors (e.g., the 256-byte output from the MLP model) may then be sent to a triplet margin loss module (which may be used only during the training phase of the MLP neural network, not during the use phase). In some embodiments, the triplet margin loss module may be configured to select parameters for the model so as to reduce the difference between the 256-byte output from the query image and the 256-byte output from the positive sample, and to increase the difference between the 256-byte output from the query image and the 256-byte output from the negative sample. In some embodiments, the training phase may comprise feeding a plurality of triplet input images into the learning process to determine the model parameters. This training process may continue, for example, until the differences for positive images are minimized and the differences for negative images are maximized, or until another suitable termination criterion is reached.
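The objective that the triplet margin loss module optimizes can be illustrated with a minimal sketch. The Euclidean distance, the margin value of 1.0, and the function names are assumptions made for illustration; the description does not specify them.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_margin_loss(query, positive, negative, margin=1.0):
    """Zero once the positive is closer to the query than the negative
    by at least `margin` (an assumed hyperparameter); positive otherwise."""
    return max(0.0, euclidean(query, positive) - euclidean(query, negative) + margin)

# Well-separated triplet: distance to positive is 1, to negative is 5.
good = triplet_margin_loss([0, 0], [0, 1], [3, 4])
# Swapped roles: the "positive" is farther away than the "negative".
bad = triplet_margin_loss([0, 0], [3, 4], [0, 1])
```

During training, model parameters would be adjusted to drive this loss toward zero over many triplets, which pulls query and positive descriptors together and pushes query and negative descriptors apart.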

Referring back to FIG. 21, the frame embedding generator 308 may include a pooling layer, here illustrated as the maximum (max) pool unit 314. The max pool unit 314 may analyze each column to determine the maximum number in the respective column. The max pool unit 314 may combine the maximum values of the columns of the output matrix of the MLP 312 into a global feature string 316 of, for example, 256 numbers. It should be appreciated that images processed in XR systems may, desirably, have high-resolution frames, potentially with millions of pixels. The global feature string 316 is a relatively small set of numbers that takes up relatively little memory and is easily searchable compared to an image (e.g., with a resolution higher than 1 megabyte). It is thus possible to search for images without analyzing each original frame from the camera, and it is also cheaper to store 256 bytes instead of complete frames.
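The data flow from per-point descriptors to a single global feature string can be sketched as follows. This is a simplified illustration, not the disclosed network: it assumes a single fully connected layer with a ReLU activation in place of a trained multi-layer perceptron, random weights, and 100 points of interest with 32-number input descriptors. All function names are hypothetical.

```python
import random

def mlp_layer(descriptors, weights, biases):
    """One fully connected layer with ReLU: (n x d_in) -> (n x d_out).
    `weights` holds one d_in-long weight vector per output number."""
    return [
        [max(0.0, sum(x * w for x, w in zip(row, w_vec)) + b)
         for w_vec, b in zip(weights, biases)]
        for row in descriptors
    ]

def max_pool_columns(matrix):
    """Column-wise maximum: (n x d) matrix -> one list of d numbers."""
    return [max(column) for column in zip(*matrix)]

def frame_descriptor(descriptors, weights, biases):
    """Per-point descriptors -> single frame-level descriptor."""
    return max_pool_columns(mlp_layer(descriptors, weights, biases))

rng = random.Random(0)
n_points, d_in, d_out = 100, 32, 256
descriptors = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(n_points)]
weights = [[rng.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]
biases = [0.0] * d_out

global_feature = frame_descriptor(descriptors, weights, biases)  # 256 numbers
```

Because the pooling takes a maximum over each column, the result does not depend on the order in which points of interest are listed, and its length stays at 256 regardless of how many points the image yields.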

FIG. 22 is a flow chart illustrating a method 2200 of computing an image descriptor, according to some embodiments. The method 2200 may start from receiving (Act 2202) a plurality of images captured by an XR device worn by a user. In some embodiments, the method 2200 may include determining (Act 2204) one or more keyframes from the plurality of images. In some embodiments, Act 2204 may be skipped and/or may occur after Act 2210 instead.

The method 2200 may include identifying (Act 2206) one or more points of interest in the plurality of images with an artificial neural network, and computing (Act 2208) feature descriptors for individual points of interest with the artificial neural network. The method may include computing (Act 2210), for each image, a frame descriptor to represent the image based, at least in part, on the feature descriptors computed with the artificial neural network for the points of interest identified in the image.

FIG. 23 is a flow chart illustrating a method 2300 of localization using image descriptors, according to some embodiments. In this example, a new image frame, depicting the current location of the XR device, may be compared to image frames stored in connection with points in a map (such as a persistent pose or a PCF, as described above). The method 2300 may start from receiving (Act 2302) a new image captured by an XR device worn by a user. The method 2300 may include identifying (Act 2304) one or more nearest keyframes in a database comprising keyframes used to generate one or more maps. In some embodiments, a nearest keyframe may be identified based on coarse spatial information and/or previously determined spatial information. For example, coarse spatial information may indicate that the XR device is in a geographic region represented by a 50 m × 50 m area of a map. Image matching may be performed only for points within that area. As another example, based on tracking, the XR system may know that an XR device was previously proximate a first persistent pose in the map and was moving in the direction of a second persistent pose in the map. That second persistent pose may be considered the nearest persistent pose, and the keyframe stored with it may be regarded as the nearest keyframe. Alternatively or additionally, other metadata, such as GPS data or WiFi fingerprints, may be used to select a nearest keyframe or a set of nearest keyframes.

Regardless of how the nearest keyframes are selected, frame descriptors may be used to determine whether the new image matches any of the frames selected as being associated with a nearby persistent pose. The determination may be made by comparing the frame descriptor of the new image with the frame descriptors of the nearest keyframes, or of a subset of keyframes in the database selected in any other suitable way, and selecting the keyframes whose frame descriptors are within a predetermined distance of the frame descriptor of the new image. In some embodiments, the distance between two frame descriptors may be computed by obtaining the differences between the two strings of numbers that represent the two frame descriptors. In embodiments in which the strings are processed as strings of multiple quantities, the difference may be computed as a vector difference.
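A minimal sketch of this descriptor comparison follows. It assumes that the distance is the Euclidean norm of the vector difference and that candidate keyframes arrive as a dictionary keyed by a hypothetical keyframe identifier; short two-number descriptors stand in for 256-number frame descriptors.

```python
import math

def descriptor_distance(a, b):
    """Norm of the vector difference between two frame descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def matching_keyframes(new_descriptor, candidates, max_distance):
    """Return (keyframe_id, distance) pairs for candidates whose descriptor
    lies within `max_distance` of the new image's descriptor, nearest first."""
    hits = []
    for keyframe_id, descriptor in candidates.items():
        d = descriptor_distance(new_descriptor, descriptor)
        if d <= max_distance:
            hits.append((keyframe_id, d))
    return sorted(hits, key=lambda hit: hit[1])

candidates = {"kf_a": [0.0, 0.0], "kf_b": [3.0, 4.0], "kf_c": [1.0, 0.0]}
matches = matching_keyframes([0.0, 0.0], candidates, max_distance=2.0)
```

Only the keyframes that survive this inexpensive comparison would then be passed to the costlier feature matching of Act 2306.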

Once a matching image frame is identified, the orientation of the XR device relative to that image frame may be determined. The method 2300 may include performing (Act 2306) feature matching against 3D features in the maps that correspond to the identified nearest keyframes, and computing (Act 2308) the pose of the device worn by the user based on the feature matching results. In this way, the computationally intensive matching of feature points in two images may be performed for as few as one image that has already been determined to be a likely match for the new image.

FIG. 24 is a flow chart illustrating a method 2400 of training a neural network, according to some embodiments. The method 2400 may start from generating (Act 2402) a dataset comprising a plurality of image sets. Each of the plurality of image sets may include a query image, a positive sample image, and a negative sample image. In some embodiments, the plurality of image sets may include synthetic recording pairs configured to, for example, teach the neural network basic information such as shapes. In some embodiments, the plurality of image sets may include real recording pairs, which may be recorded from the physical world.

In some embodiments, inliers may be computed by fitting a fundamental matrix between two images. In some embodiments, sparse overlap may be computed as the intersection over union (IoU) of the points of interest seen in both images. In some embodiments, a positive sample may include at least twenty points of interest, serving as inliers, that are the same as in the query image. A negative sample may include fewer than ten inlier points. A negative sample may have fewer than half of its sparse points overlapping with the sparse points of the query image.
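The sample-selection thresholds above can be sketched as follows. The sketch takes the inlier count as an input rather than fitting a fundamental matrix, and it identifies points of interest by hypothetical integer IDs assumed to be shared across images. Requiring both negative conditions at once, and leaving other pairs unused, are assumptions made for illustration.

```python
def sparse_overlap_iou(points_a, points_b):
    """Intersection over union of the points of interest seen in two images.
    Points are identified by hashable IDs (an assumption of this sketch)."""
    a, b = set(points_a), set(points_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def label_sample(inlier_count, overlap_fraction):
    """Classify a candidate image relative to a query image.
    inlier_count: inliers from a fundamental-matrix fit (computed elsewhere).
    overlap_fraction: share of the candidate's sparse points that overlap
    with the query image's points."""
    if inlier_count >= 20:
        return "positive"
    if inlier_count < 10 and overlap_fraction < 0.5:
        return "negative"
    return "unused"

iou = sparse_overlap_iou([1, 2, 3], [2, 3, 4])  # 2 shared points out of 4 total
```

A pair such as one with 15 inliers falls between the two thresholds and would simply not contribute a training triplet under this reading.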

The method 2400 may include computing (Act 2404), for each image set, a loss by comparing the query image with the positive sample image and the negative sample image. The method 2400 may include modifying (Act 2406) the artificial neural network based on the computed loss such that the distance between the frame descriptor generated by the artificial neural network for the query image and the frame descriptor for the positive sample image is less than the distance between the frame descriptor for the query image and the frame descriptor for the negative sample image.

It should be appreciated that although methods and apparatus configured to generate global descriptors for individual images are described above, the methods and apparatus may be configured to generate descriptors for individual maps. For example, a map may include a plurality of keyframes, each of which may have a frame descriptor as described above. A max pool unit may analyze the frame descriptors of the map's keyframes and combine the frame descriptors into a unique map descriptor for the map.

Further, it should be appreciated that other architectures may be used for the processing described above. For example, separate neural networks are described for generating DSF descriptors and frame descriptors. Such an approach is computationally efficient. However, in some embodiments, the frame descriptors may be generated from selected feature points without first generating DSF descriptors.

Map ranking and merging

Described herein are methods and apparatus for ranking and merging a plurality of environment maps in an X reality (XR) system. Map merging may enable maps representing overlapping portions of the physical world to be combined to represent a larger area. Ranking maps may enable efficient performance of techniques as described herein, including map merging, that involve selecting a map from a set of maps based on similarity. In some embodiments, for example, a set of canonical maps, formatted in a way that they may be accessed by any of a number of XR devices, may be maintained by the system. These canonical maps may be formed by merging selected tracking maps from those devices with other tracking maps or with previously stored canonical maps. The canonical maps may be ranked, for example, for use in selecting one or more canonical maps to merge with a new tracking map and/or in selecting one or more canonical maps from the set for use within a device.

To provide realistic XR experiences to users, the XR system must know the user's physical surroundings in order to correctly correlate the locations of virtual objects with real objects. Information about a user's physical surroundings may be obtained from an environment map for the user's location.

The inventors have recognized and appreciated that an XR system could provide an enhanced XR experience to multiple users sharing the same world, comprising real and/or virtual content, by enabling efficient sharing of environment maps of the real/physical world collected by multiple users, whether those users are present in the world at the same or different times. However, there are significant challenges in providing such a system. Such a system may store multiple maps generated by multiple users, and/or the system may store multiple maps generated at different times. For operations that might be performed with a previously generated map, such as localization as described above, substantial processing may be required to identify a relevant environment map of the same world (e.g., the same real-world location) from all of the environment maps collected in the XR system. In some embodiments, there may be only a small number of environment maps that a device could access, for example, for localization. In some embodiments, there may be a large number of environment maps that a device could access. The inventors have recognized and appreciated techniques to quickly and accurately rank the relevance of environment maps out of all possible environment maps, such as the universe of all canonical maps 120 in FIG. 28. A high-ranking map may then be selected for further processing, such as to render virtual objects on a user display so that they interact realistically with the physical world around the user, or to merge map data collected by that user with stored maps to create larger or more accurate maps.

In some embodiments, a stored map that is relevant to a task for a user at a location in the physical world may be identified by filtering stored maps based on multiple criteria. Those criteria may indicate comparisons of a tracking map, generated by the wearable device of the user at the location, to candidate environment maps stored in a database. The comparisons may be performed based on metadata associated with the maps, such as a Wi-Fi fingerprint detected by the device generating the map and/or a set of BSSIDs to which the device was connected while forming the map. The comparisons may also be performed based on compressed or uncompressed content of the map. Comparisons based on a compressed representation may be performed, for example, by comparing vectors computed from map content. Comparisons based on uncompressed maps may be performed, for example, by localizing the tracking map within the stored map, or vice versa. Multiple comparisons may be performed in an order based on the computation time needed to reduce the number of candidate maps under consideration, with comparisons involving less computation being performed earlier in the order than comparisons requiring more computation.
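The cost-ordered filtering can be sketched as a cascade in which each comparison is paired with a relative cost and the cheapest comparison runs first. The map records, the cost values, the distance threshold, and the two predicates below are hypothetical and chosen only to illustrate the ordering.

```python
def filter_candidates(candidates, filters):
    """Apply (estimated_cost, predicate) filters in ascending cost order,
    so expensive comparisons only see maps that survived cheaper ones."""
    remaining = list(candidates)
    for _cost, keep in sorted(filters, key=lambda f: f[0]):
        remaining = [c for c in remaining if keep(c)]
    return remaining

# Hypothetical tracking map and stored candidate maps.
tracking = {"bssids": {"aa:01", "aa:02"}, "vector": [0.0, 1.0]}
stored = [
    {"id": "map1", "bssids": {"aa:01"}, "vector": [0.1, 0.9]},
    {"id": "map2", "bssids": {"bb:07"}, "vector": [0.0, 1.0]},
    {"id": "map3", "bssids": {"aa:02"}, "vector": [5.0, 5.0]},
]

def shares_bssid(m):
    """Cheap metadata comparison: any access point in common."""
    return bool(m["bssids"] & tracking["bssids"])

def vector_close(m):
    """Costlier comparison of vectors computed from map content."""
    return sum((x - y) ** 2 for x, y in zip(m["vector"], tracking["vector"])) < 1.0

# Filters may be listed in any order; the cascade sorts them by cost.
selected = filter_candidates(stored, [(10, vector_close), (1, shares_bssid)])
```

In this example the BSSID check removes map2 before any vector comparison is made, and the vector comparison then removes map3, leaving map1.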

FIG. 26 depicts an AR system 800 configured to rank and merge one or more environment maps, according to some embodiments. The AR system may include a passable world model 802 of an AR device. Information to populate the passable world model 802 may come from sensors on the AR device, which may include computer-executable instructions, stored in a processor 804 (e.g., the local data processing module 570 in FIG. 4), that may perform some or all of the processing to convert sensor data into a map. Such a map may be a tracking map, as it can be built as sensor data is collected while the AR device operates in a region. Along with that tracking map, area attributes may be supplied so as to indicate the area that the tracking map represents. These area attributes may be a geographic location identifier, such as coordinates presented as latitude and longitude or an ID used by the AR system to represent a location. Alternatively or additionally, the area attributes may be measured characteristics that have a high likelihood of being unique for that area. The area attributes, for example, may be derived from parameters of wireless networks detected in the area. In some embodiments, the area attribute may be associated with a unique address of an access point that the AR system is near and/or connected to. For example, the area attribute may be associated with a MAC address or basic service set identifier (BSSID) of a 5G base station/router, a Wi-Fi router, and the like.

In the example of FIG. 26, the tracking maps may be merged with other maps of the environment. A map rank portion 806 receives tracking maps from the device PW 802 and communicates with a map database 808 to select and rank environment maps from the map database 808. Higher-ranked, selected maps are sent to a map merge portion 810.

The map merge portion 810 may perform merge processing on the maps sent from the map rank portion 806. Merge processing may entail merging the tracking map with some or all of the ranked maps and transmitting the new, merged maps to a passable world model 812. The map merge portion may merge maps by identifying maps that depict overlapping portions of the physical world. Those overlapping portions may be aligned such that information in both maps may be aggregated into a final map. Canonical maps may be merged with other canonical maps and/or tracking maps.

The aggregation may entail extending one map with information from another map. Alternatively or additionally, aggregation may entail adjusting the representation of the physical world in one map based on information in another map. A later map, for example, may reveal that objects giving rise to feature points have moved, such that the map may be updated based on the later information. Alternatively, two maps may characterize the same region with different feature points, and aggregating may entail selecting a set of feature points from the two maps to better represent that region. Regardless of the specific processing that occurs in the merging process, in some embodiments, PCFs from all maps that are merged may be retained, such that applications positioning content with respect to them may continue to do so. In some embodiments, merging of maps may result in redundant persistent poses, and some of the persistent poses may be deleted. When a PCF is associated with a persistent pose that is to be deleted, merging maps may entail modifying the PCF so that it is associated with a persistent pose remaining in the map after merging.

In some embodiments, as maps are extended and/or updated, they may be refined. Refinement may entail computation to reduce internal inconsistency between feature points that likely represent the same object in the physical world. Inconsistency may result from inaccuracies in the poses associated with keyframes supplying feature points that represent the same objects in the physical world. Such inconsistency may result, for example, from an XR device computing poses relative to a tracking map, which in turn is built based on estimating poses, such that errors in pose estimation accumulate, creating "drift" in pose accuracy over time. The map may be refined by performing a bundle adjustment or other operation to reduce inconsistencies among the feature points from multiple keyframes.

Upon refinement, the location of a persistent point relative to the origin of a map may change. Accordingly, the transformation associated with that persistent point, such as a persistent pose or a PCF, may change. In some embodiments, the XR system, in connection with map refinement (whether performed as part of a merge operation or for other reasons), may recompute the transformations associated with any persistent points that have changed. These transformations may be pushed from a component computing the transformations to a component using them, such that any use of a transformation may be based on the updated location of the persistent point.

The passable world model 812 may be a cloud model, which may be shared by multiple AR devices. The passable world model 812 may store the environment maps in the map database 808 or otherwise have access to it. In some embodiments, when a previously computed environment map is updated, the prior version of that map may be deleted so as to remove out-of-date maps from the database. In some embodiments, when a previously computed environment map is updated, the prior version of that map may be archived, enabling retrieval or viewing of prior versions of an environment. In some embodiments, permissions may be set such that only AR systems having certain read/write access may trigger the deletion or archiving of prior versions of maps.

These environment maps, created from tracking maps supplied by one or more AR devices/systems, may be accessed by AR devices in the AR system. The map rank portion 806 may also be used in supplying environment maps to an AR device. The AR device may send a message requesting an environment map for its current location, and the map rank portion 806 may be used to select and rank environment maps relevant to the requesting device.

In some embodiments, the AR system 800 may include a downsample portion 814 configured to receive the merged maps from the cloud PW 812. The merged maps received from the cloud PW 812 may be in a storage format for the cloud, which may include high-resolution information, such as a large number of PCFs per square meter, or multiple image frames or a large set of feature points associated with a PCF. The downsample portion 814 may be configured to downsample the cloud-format maps to a format suitable for storage on AR devices. The device-format maps may contain less data, such as fewer PCFs or less data stored for each PCF, to accommodate the limited local computing power and storage space of AR devices.

FIG. 27 is a simplified block diagram illustrating a plurality of canonical maps 120 that may be stored in a remote storage medium, for example, a cloud. Each canonical map 120 may include a plurality of canonical map identifiers indicating the canonical map's location within a physical space, such as somewhere on the planet Earth. These canonical map identifiers may include one or more of the following identifiers: an area identifier represented by a range of longitudes and latitudes, a frame descriptor (e.g., the global feature string 316 in FIG. 21), a Wi-Fi fingerprint, a feature descriptor (e.g., the feature descriptors 310 in FIG. 21), and device identities indicating one or more devices that contributed to the map.

In the illustrated example, the canonical maps 120 are disposed geographically in a two-dimensional pattern, as they may exist on the surface of the Earth. The canonical maps 120 may be uniquely identifiable by corresponding longitudes and latitudes, because any canonical maps that have overlapping longitudes and latitudes may be merged into a new canonical map.

FIG. 28 is a schematic diagram illustrating a method of selecting canonical maps, which may be used for localizing a new tracking map to one or more canonical maps, according to some embodiments. The method may start with accessing (Act 120) a universe of canonical maps 120, which may be stored, as an example, in a database in a passable world (e.g., the passable world module 538). The universe of canonical maps may include canonical maps from all previously visited locations. An XR system may filter the universe of all canonical maps down to a small subset or just a single map. It should be appreciated that, in some embodiments, it may not be possible to send all the canonical maps to a viewing device due to bandwidth restrictions. Selecting a subset of maps that are likely candidates for matching the tracking map, and sending only that subset to the device, may reduce the bandwidth and latency associated with accessing a remote database of maps.

The method may include filtering (Act 300) the universe of canonical maps based on areas with predetermined size and shape. In the example illustrated in FIG. 27, each square may represent an area. Each square may cover 50 m × 50 m. Each square may have six neighboring areas. In some embodiments, Act 300 may select at least one matching canonical map 120 covering a longitude and latitude that include the longitude and latitude of the position identifier received from an XR device, as long as at least one map exists at that longitude and latitude. In some embodiments, Act 300 may select at least one neighboring canonical map covering a longitude and latitude adjacent to those of the matching canonical map. In some embodiments, Act 300 may select a plurality of matching canonical maps and a plurality of neighboring canonical maps. Act 300 may, for example, reduce the number of canonical maps to approximately one tenth, for example, from thousands to hundreds, to form a first filtered selection. Alternatively or additionally, criteria other than latitude and longitude may be used to identify neighboring maps. An XR device, for example, may previously have been localized with a canonical map in the set, such as part of the same session. A cloud service may retain information about the XR device, including the maps to which it was previously localized. In this example, the maps selected in Act 300 may include those covering an area adjacent to the map to which the XR device was localized.
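
By way of a non-limiting illustration, the area-based filtering of Act 300 might be sketched as follows. The sketch assumes a hypothetical local planar frame measured in meters rather than longitude and latitude, and it treats the eight surrounding grid cells as neighbors, which is an assumption of this sketch and not the neighborhood described above. The names `cell_of`, `filter_by_area`, and `maps_by_cell` are hypothetical.

```python
import math

CELL_SIZE_M = 50.0  # each area covers 50 m x 50 m, per the example above


def cell_of(x_m, y_m):
    """Map a planar position in meters (assumed frame) to a grid cell index."""
    return (math.floor(x_m / CELL_SIZE_M), math.floor(y_m / CELL_SIZE_M))


def filter_by_area(maps_by_cell, device_xy, include_neighbors=True):
    """Select maps in the device's cell and, optionally, in adjacent cells."""
    cx, cy = cell_of(*device_xy)
    selected = list(maps_by_cell.get((cx, cy), []))   # matching maps first
    if include_neighbors:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if (dx, dy) != (0, 0):
                    selected.extend(maps_by_cell.get((cx + dx, cy + dy), []))
    return selected


# Toy index of canonical maps keyed by grid cell (illustrative names only).
maps_by_cell = {(0, 0): ["cm_a"], (1, 0): ["cm_b"], (5, 5): ["cm_far"]}
print(filter_by_area(maps_by_cell, (10.0, 20.0)))
```

Indexing the maps by cell lets the first stage run without opening any map content, which is consistent with using it as the coarsest filter.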

The method may include filtering (Act 302) the first filtered selection of canonical maps based on Wi-Fi fingerprints. Act 302 may determine a latitude and longitude based on a Wi-Fi fingerprint received as part of the position identifier from an XR device. Act 302 may compare the latitude and longitude from the Wi-Fi fingerprint with the latitude and longitude of the canonical maps 120 to determine one or more canonical maps that form a second filtered selection. Act 302 may reduce the number of canonical maps to approximately one tenth, for example, from hundreds to tens of canonical maps (e.g., 50) that form the second selection. For example, the first filtered selection may include 130 canonical maps, and the second filtered selection may include 50 of the 130 canonical maps and exclude the remaining 80.
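
By way of a non-limiting illustration, the Wi-Fi fingerprint filtering of Act 302 might be sketched as follows. The `bssid_locations` table is a hypothetical stand-in for a database relating access points to locations, the averaging rule is an assumption of this sketch, and the map records with `lat_min`, `lat_max`, `lon_min`, and `lon_max` keys are illustrative only.

```python
def locate_from_fingerprint(fingerprint, bssid_locations):
    """Average the known (lat, lon) of the BSSIDs seen; None if none are known."""
    known = [bssid_locations[b] for b in fingerprint if b in bssid_locations]
    if not known:
        return None
    lat = sum(p[0] for p in known) / len(known)
    lon = sum(p[1] for p in known) / len(known)
    return lat, lon


def filter_by_wifi(maps, fingerprint, bssid_locations):
    """Keep maps whose latitude/longitude range contains the estimated position."""
    pos = locate_from_fingerprint(fingerprint, bssid_locations)
    if pos is None:
        return list(maps)  # no estimate available: leave the selection unchanged
    lat, lon = pos
    return [m for m in maps
            if m["lat_min"] <= lat <= m["lat_max"]
            and m["lon_min"] <= lon <= m["lon_max"]]


# Toy data (illustrative values only).
bssid_locations = {"aa": (10.0, 20.0), "bb": (12.0, 22.0)}
first_selection = [
    {"name": "cm_1", "lat_min": 10.5, "lat_max": 11.5, "lon_min": 20.5, "lon_max": 21.5},
    {"name": "cm_2", "lat_min": 30.0, "lat_max": 31.0, "lon_min": 20.5, "lon_max": 21.5},
]
second_selection = filter_by_wifi(first_selection, ["aa", "bb", "zz"], bssid_locations)
print([m["name"] for m in second_selection])
```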

The method may include filtering (Act 304) the second filtered selection of canonical maps based on keyframes. Act 304 may compare data representing an image captured by an XR device with data representing the canonical maps 120. In some embodiments, the data representing the image and/or maps may include feature descriptors (e.g., the DSF descriptors in FIG. 25) and/or global feature strings (e.g., 316 in FIG. 21). Act 304 may provide a third filtered selection of canonical maps. In some embodiments, the output of Act 304 may be, for example, only five of the 50 canonical maps identified following the second filtered selection. The map transmitter 122 then transmits one or more canonical maps to the viewing device based on the third filtered selection. Act 304 may reduce the number of canonical maps to approximately one tenth, for example, from tens to a single-digit number of canonical maps (e.g., 5) that form the third selection. In some embodiments, an XR device may receive the canonical maps in the third filtered selection and attempt to localize into the received canonical maps.

For example, Act 304 may filter the canonical maps 120 based on the global feature strings 316 of the canonical maps 120 and a global feature string 316 that is based on an image captured by the viewing device (e.g., an image that may be part of the local tracking map for a user). Each one of the canonical maps 120 in FIG. 27 therefore has one or more global feature strings 316 associated with it. In some embodiments, the global feature strings 316 may be acquired when an XR device submits images or feature details to the cloud, and the cloud processes the images or feature details to generate global feature strings 316 for the canonical maps 120.

In some embodiments, the cloud may receive feature details of a live/new/current image captured by a viewing device, and the cloud may generate a global feature string 316 for the live image. The cloud may then filter the canonical maps 120 based on the live global feature string 316. In some embodiments, the global feature string may be generated on the local viewing device. In some embodiments, the global feature string may be generated remotely, for example, on the cloud. In some embodiments, the cloud may transmit the filtered canonical maps to an XR device together with the global feature strings 316 associated with the filtered canonical maps. In some embodiments, when the viewing device localizes its tracking map to a canonical map, it may do so by matching the global feature strings 316 of the local tracking map with the global feature strings of the canonical map.

It should be appreciated that operation of an XR device need not perform all of the acts (300, 302, 304). For example, if the universe of canonical maps is relatively small (e.g., 500 maps), an XR device attempting to localize may filter the universe of canonical maps based on Wi-Fi fingerprints (e.g., Act 302) and keyframes (e.g., Act 304), but omit filtering based on areas (e.g., Act 300). Moreover, maps need not be compared in their entirety. In some embodiments, for example, a comparison of two maps may result in identifying common persistent points, such as persistent poses or PCFs, that appear in both the new map and a map selected from the universe of maps. In that case, descriptors may be associated with the persistent points, and those descriptors may be compared.
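
By way of a non-limiting illustration, chaining the filtering acts while omitting the area stage for a small universe might be sketched as follows. The `run_filters` function, the stage names, and the `small_universe` parameter are hypothetical, and the toy predicates over integer map identifiers merely stand in for Acts 300, 302, and 304.

```python
def run_filters(maps, stages, small_universe=500):
    """Apply (name, function) filter stages in order.

    Assumption of this sketch: the area stage is skipped when the starting
    universe holds no more than `small_universe` maps.
    """
    skip_area = len(maps) <= small_universe
    for name, stage in stages:
        if name == "area" and skip_area:
            continue
        maps = stage(maps)
    return maps


# Toy stages over integer map ids (illustrative predicates only).
stages = [
    ("area", lambda ms: [m for m in ms if m % 2 == 0]),
    ("wifi", lambda ms: [m for m in ms if m < 5]),
    ("keyframe", lambda ms: ms[:2]),
]
print(run_filters(list(range(10)), stages))
```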

FIG. 29 is a flow chart illustrating a method 900 of selecting one or more ranked environment maps, according to some embodiments. In the illustrated embodiment, the ranking is performed for a user's AR device that is creating a tracking map. Accordingly, the tracking map is available for use in ranking environment maps. In embodiments in which the tracking map is not available, some or all of the portions of the selection and ranking of environment maps that do not expressly rely on the tracking map may be used.

Method 900 may start at Act 902, where a set of maps (which may be formatted as canonical maps) from a database of environment maps in the neighborhood of the location where the tracking map was formed may be accessed and then filtered for ranking. Additionally, at Act 902, at least one area attribute for the area in which the user's AR device is operating is determined. In scenarios in which the user's AR device is constructing a tracking map, the area attributes may correspond to the area over which the tracking map is created. As a specific example, the area attributes may be computed based on signals received from access points to computer networks while the AR device was computing the tracking map.

FIG. 30 depicts an exemplary map ranking portion 806 of the AR system 800, according to some embodiments. The map ranking portion 806 may execute in a cloud computing environment, as it may include portions executing on AR devices and portions executing on a remote computing system such as a cloud. The map ranking portion 806 may be configured to perform at least a portion of method 900.

FIG. 31A depicts an example of area attributes AA1-AA8 of a tracking map (TM) 1102 and environment maps CM1-CM4 in a database, according to some embodiments. As illustrated, an environment map may be associated with multiple area attributes. The area attributes AA1-AA8 may include parameters of wireless networks detected by the AR device computing the tracking map 1102, for example, basic service set identifiers (BSSIDs) of networks to which the AR device is connected and/or the strengths of signals received from access points to the wireless networks, for example, through the network tower 1104. The parameters of the wireless networks may comply with protocols including Wi-Fi and 5G NR. In the example illustrated in FIG. 32, the area attributes are a fingerprint of the area in which the user's AR device collected sensor data to form the tracking map.

FIG. 31B depicts an example of the determined geographic location 1106 of the tracking map 1102, according to some embodiments. In the illustrated example, the determined geographic location 1106 includes a centroid point 1110 and an area 1108 circling around the centroid point. It should be appreciated that the determination of a geographic location in the present application is not limited to the illustrated format. A determined geographic location may have any suitable format including, for example, different area shapes. In this example, the geographic location is determined from the area attributes using a database relating area attributes to geographic locations. Such databases are commercially available, for example, databases that relate Wi-Fi fingerprints to locations expressed as latitude and longitude, and may be used for this operation.

In the embodiment of FIG. 29, the map database containing environment maps may also include location data for those maps, including the latitude and longitude covered by the maps. Processing at Act 902 may entail selecting from that database a set of environment maps that cover the same latitude and longitude determined for the area attributes of the tracking map.

Act 904 is a first filtering of the set of environment maps accessed in Act 902. In Act 902, environment maps are retained in the set based on proximity to the geographic location of the tracking map. This filtering step may be performed by comparing the latitude and longitude associated with the tracking map and with the environment maps in the set.

FIG. 32 depicts an example of Act 904, according to some embodiments. Each area attribute may have a corresponding geographic location 1202. The set of environment maps may include the environment maps with at least one area attribute that has a geographic location overlapping the determined geographic location of the tracking map. In the illustrated example, the set of identified environment maps includes environment maps CM1, CM2, and CM4, each of which has at least one area attribute with a geographic location overlapping the determined geographic location of the tracking map 1102. The environment map CM3, associated with the area attribute AA6, is not included in the set because it is outside the determined geographic location of the tracking map.
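
By way of a non-limiting illustration, the overlap test of Act 904 might be sketched as follows. The sketch assumes that each geographic location is a circle given as (x, y, radius) in a hypothetical planar frame; the function names and the toy coordinates are illustrative and are not taken from the figures.

```python
import math


def circles_overlap(a, b):
    """True if two (x, y, radius) circles intersect or touch."""
    return math.hypot(a[0] - b[0], a[1] - b[1]) <= a[2] + b[2]


def filter_by_geolocation(tracking_location, env_maps):
    """Keep maps with at least one area attribute overlapping the tracking map."""
    return [name for name, attr_locations in env_maps.items()
            if any(circles_overlap(tracking_location, loc)
                   for loc in attr_locations)]


# Toy data: the tracking map's determined location and, per environment map,
# the locations of its area attributes (illustrative values only).
tracking_location = (0.0, 0.0, 10.0)
env_maps = {
    "CM1": [(5.0, 0.0, 3.0)],
    "CM2": [(100.0, 0.0, 3.0), (12.0, 0.0, 3.0)],
    "CM3": [(50.0, 50.0, 5.0)],
    "CM4": [(0.0, 8.0, 1.0)],
}
print(filter_by_geolocation(tracking_location, env_maps))
```

The toy values are chosen so that CM3 falls outside the tracking map's location, mirroring the outcome described for the illustrated example.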

Other filtering steps may also be performed on the set of environment maps to reduce and/or rank the number of environment maps in the set that is ultimately processed (such as for map merging or for providing passable world information to a user device). Method 900 may include filtering (Act 906) the set of environment maps based on the similarity of one or more identifiers of network access points associated with the tracking map and with the environment maps of the set. During the formation of a map, a device collecting sensor data to generate the map may be connected to a network through a network access point, such as through Wi-Fi or a similar wireless communication protocol. The access points may be identified by BSSID. The user device may connect to multiple different access points as it moves through an area collecting data to form a map. Likewise, when multiple devices supply information to form a map, the devices may have connected through different access points, so there may be multiple access points used in forming the map for that reason as well. Accordingly, there may be multiple access points associated with a map, and the set of access points may be an indication of the location of the map. The strength of signals from an access point, which may be reflected as an RSSI value, may provide further geographic information. In some embodiments, a list of BSSID and RSSI values may form the area attribute for a map.

In some embodiments, filtering the set of environment maps based on the similarity of the one or more identifiers of network access points may include retaining in the set of environment maps those environment maps with the highest Jaccard similarity to the at least one area attribute of the tracking map, based on the one or more identifiers of network access points. FIG. 33 depicts an example of Act 906, according to some embodiments. In the illustrated example, a network identifier associated with the area attribute AA7 may be determined as the identifier for the tracking map 1102. The set of environment maps after Act 906 includes the environment map CM2, which may have area attributes with a higher Jaccard similarity to AA7, and the environment map CM4, which also includes the area attribute AA7. The environment map CM1 is not included in the set because it has the lowest Jaccard similarity to AA7.
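
By way of a non-limiting illustration, ranking environment maps by the Jaccard similarity of their access point identifiers might be sketched as follows. The Jaccard similarity of two sets is the size of their intersection divided by the size of their union. The function names, the `keep` parameter, and the toy BSSID values are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two identifier collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def filter_by_bssid(tracking_bssids, env_maps, keep=2):
    """Retain the `keep` maps whose BSSID sets are most similar to the tracking map's."""
    ranked = sorted(env_maps,
                    key=lambda name: jaccard(tracking_bssids, env_maps[name]),
                    reverse=True)
    return ranked[:keep]


# Toy BSSID sets (illustrative values only).
tracking_bssids = {"ap1", "ap2", "ap3"}
env_maps = {
    "CM1": {"ap9"},
    "CM2": {"ap1", "ap2", "ap4"},
    "CM4": {"ap1", "ap2", "ap3"},
}
print(filter_by_bssid(tracking_bssids, env_maps))
```

RSSI values could be folded in as weights, but how signal strength would enter the comparison is not specified in the description, so the sketch compares identifiers only.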

Processing at Acts 902-906 may be performed based on metadata associated with the maps, without actually accessing the content of the maps stored in the map database. Other processing may involve accessing the content of the maps. Act 908 indicates accessing the environment maps remaining in the subset after filtering based on metadata. It should be appreciated that this act may be performed either earlier or later in the process, provided that subsequent operations can be performed with the accessed content.

Method 900 may include filtering (Act 910) the set of environment maps based on the similarity of metrics representing the content of the tracking map and of the environment maps of the set. The metrics representing the content of the tracking map and the environment maps may include a vector of values computed from the content of the maps. For example, the deep keyframe descriptor described above, computed for one or more keyframes used in forming a map, may provide a metric for comparison of maps or portions of maps. The metrics may be computed from the maps retrieved at Act 908, or may be pre-computed and stored as metadata associated with those maps. In some embodiments, filtering the set of environment maps based on the similarity of the metrics may include retaining in the set of environment maps those environment maps with the smallest vector distance between a vector of characteristics of the tracking map and a vector representing the environment map.
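
By way of a non-limiting illustration, retaining the environment maps with the smallest vector distance to the tracking map might be sketched as follows. The sketch assumes Euclidean distance and one fixed-length vector per map; neither the distance measure nor the vector layout is specified in the description, and the names `nearest_maps` and `keep` are hypothetical.

```python
import math


def nearest_maps(tracking_vec, map_vecs, keep=1):
    """Return the `keep` map names closest to the tracking map's vector."""
    ranked = sorted(map_vecs,
                    key=lambda name: math.dist(tracking_vec, map_vecs[name]))
    return ranked[:keep]


# Toy two-dimensional descriptors (illustrative values only).
tracking_vec = (0.0, 1.0)
map_vecs = {"CM2": (0.0, 0.9), "CM4": (1.0, 1.0)}
print(nearest_maps(tracking_vec, map_vecs))
```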

Method 900 may include further filtering (Act 912) the set of environment maps based on the degree of match between a portion of the tracking map and portions of the environment maps of the set. The degree of match may be determined as part of a localization process. As a non-limiting example, localization may be performed by identifying critical points in the tracking map and the environment map that are sufficiently similar that they could represent the same portion of the physical world. In some embodiments, the critical points may be features, feature descriptors, keyframes, key rigs, persistent poses, and/or PCFs. The set of critical points in the tracking map may then be aligned to produce the best fit with the set of critical points in the environment map. A mean square distance between the corresponding critical points may be computed and, if it is below a threshold for a particular region of the tracking map, used as an indication that the tracking map and the environment map represent the same region of the physical world.
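
By way of a non-limiting illustration, the mean square distance test might be sketched as follows. The sketch assumes that the two point sets have already been aligned and placed in one-to-one correspondence; the alignment step itself is omitted. The function names and the `threshold` parameter are hypothetical.

```python
def mean_square_distance(points_a, points_b):
    """Mean of squared Euclidean distances between corresponding points."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("point sets must be non-empty and of equal length")
    total = sum(sum((p - q) ** 2 for p, q in zip(a, b))
                for a, b in zip(points_a, points_b))
    return total / len(points_a)


def same_region(points_a, points_b, threshold):
    """Indicate a match when the mean square distance is below the threshold."""
    return mean_square_distance(points_a, points_b) < threshold


# Toy aligned point sets (illustrative values only).
tracking_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
environment_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
print(mean_square_distance(tracking_points, environment_points))
```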

In some embodiments, filtering the set of environment maps based on the degree of match between a portion of the tracking map and portions of the environment maps of the set may include computing a volume of the physical world represented by the tracking map that is also represented in an environment map of the set, and retaining in the set those environment maps with a larger computed volume than the environment maps filtered out of the set. FIG. 34 depicts an example of Act 912, according to some embodiments. In the illustrated example, the set of environment maps after Act 912 includes the environment map CM4, which has an area 1402 matched with an area of the tracking map 1102. The environment map CM1 is not included in the set because it has no area matched with an area of the tracking map 1102.

In some embodiments, the set of environment maps may be filtered in the order of Act 906, Act 910, and Act 912. In some embodiments, the set of environment maps may be filtered based on Act 906, Act 910, and Act 912, which may be performed in an order based on the processing required to perform the filtering, from lowest to highest. Method 900 may include loading (Act 914) the set of environment maps and data.

In the illustrated example, a user database stores area identities indicating the areas in which AR devices were used. The area identities may be area attributes, which may include parameters of wireless networks detected by the AR devices when in use. A map database may store multiple environment maps constructed from data supplied by the AR devices and associated metadata. The associated metadata may include area identities derived from the area identities of the AR devices that supplied the data from which the environment maps were constructed. An AR device may send a message to a PW module indicating that a new tracking map has been created or is being created. The PW module may compute area identifiers for the AR device and update the user database based on the received parameters and/or the computed area identifiers. The PW module may also determine the area identifiers associated with the AR device requesting the environment maps, identify sets of environment maps from the map database based on the area identifiers, filter the sets of environment maps, and transmit the filtered sets of environment maps to the AR device. In some embodiments, the PW module may filter the sets of environment maps based on one or more criteria including, for example, the geographic location of the tracking map, the similarity of one or more identifiers of network access points associated with the tracking map and with the environment maps of the set, the similarity of metrics representing the content of the tracking map and of the environment maps of the set, and the degree of match between a portion of the tracking map and portions of the environment maps of the set.

Having thus described several aspects of some embodiments, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. As one example, embodiments are described in connection with an augmented reality (AR) environment. It should be appreciated that some or all of the techniques described herein may be applied in an MR environment or, more generally, in other XR environments and in VR environments.

As another example, embodiments are described in connection with devices, such as wearable devices. It should be appreciated that some or all of the techniques described herein may be implemented via networks (such as a cloud), discrete applications, and/or any suitable combination of devices, networks, and discrete applications.

Further, FIG. 29 provides examples of criteria that may be used to filter candidate maps to yield a set of highly ranked maps. Other criteria may be used instead of or in addition to the described criteria. For example, if multiple candidate maps have similar values of a metric used for filtering out less desirable maps, characteristics of the candidate maps may be used to determine which maps are retained as candidate maps and which are filtered out. For example, larger or more dense candidate maps may be prioritized over smaller candidate maps. In some embodiments, FIGS. 27-28 may describe all or part of the systems and methods described in FIGS. 29-34.

FIGS. 35 and 36 are schematic diagrams illustrating an XR system configured to rank and merge a plurality of environment maps, according to some embodiments. In some embodiments, a passable world (PW) may determine when to trigger ranking and/or merging of the maps. In some embodiments, determining a map to be used may be based, at least in part, on the deep keyframes described above in relation to FIGS. 21-25.

FIG. 37 is a block diagram illustrating a method 3700 of creating environment maps of a physical world, according to some embodiments. The method 3700 may start from localizing (Act 3702) a tracking map captured by an XR device worn by a user to a group of canonical maps (e.g., canonical maps selected by the method of FIG. 28 and/or the method 900 of FIG. 29). Act 3702 may include localizing key rigs of the tracking map into the group of canonical maps. The localization result of each key rig may include the key rig's localized pose and a set of 2D-to-3D feature correspondences.

In some embodiments, the method 3700 may include splitting (Act 3704) the tracking map into connected components, which may enable maps to be merged robustly by merging connected pieces. Each connected component may include key rigs that are within a predetermined distance. The method 3700 may include merging (Act 3706) the connected components that are larger than a predetermined threshold into one or more canonical maps, and removing the merged connected components from the tracking map.

In some embodiments, the method 3700 may include merging (Act 3708) canonical maps of the group that are merged with the same connected components of the tracking map. In some embodiments, the method 3700 may include promoting (Act 3710) the remaining connected components of the tracking map, which have not been merged with any canonical map, to be a canonical map. In some embodiments, the method 3700 may include merging (Act 3712) persistent poses and/or PCFs of the tracking map and of the canonical maps that are merged with at least one connected component of the tracking map. In some embodiments, the method 3700 may include finalizing (Act 3714) the canonical maps by, for example, fusing map points and pruning redundant key rigs.
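As a rough illustration of Acts 3704 and 3706, the sketch below groups key rigs into connected components when they chain together within a distance threshold, then selects the components that exceed a size threshold for merging. The names `split_into_components`, `max_dist`, and `size_threshold`, and the use of key rig positions alone as the connectivity criterion, are assumptions made for this example, not details taken from the specification.

```python
import math
from collections import defaultdict


def split_into_components(keyrigs, max_dist):
    """Group key rig ids whose positions chain together within max_dist.

    keyrigs: dict mapping a key rig id to an (x, y, z) position.
    Returns a sorted list of sorted id lists, one per connected component.
    """
    parent = {k: k for k in keyrigs}

    def find(k):
        # Union-find lookup with path halving.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    ids = list(keyrigs)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if math.dist(keyrigs[a], keyrigs[b]) <= max_dist:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for k in ids:
        groups[find(k)].append(k)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical key rig positions in meters.
keyrigs = {"k1": (0, 0, 0), "k2": (1, 0, 0), "k3": (2, 0, 0), "k4": (10, 0, 0)}
components = split_into_components(keyrigs, max_dist=1.5)

size_threshold = 2  # hypothetical: merge only components with more key rigs
to_merge = [c for c in components if len(c) > size_threshold]
remaining = [c for c in components if len(c) <= size_threshold]
print(components)  # [['k1', 'k2', 'k3'], ['k4']]
```

Here `k1` and `k3` are farther apart than `max_dist` but still share a component because each is close to `k2`. The isolated `k4` stays in `remaining`, where a later step such as Act 3710 could handle it.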

FIGS. 38A and 38B illustrate an environment map 3800 created by updating a canonical map 700, which may be promoted from the tracking map 700 (FIG. 7), with a new tracking map, according to some embodiments. As illustrated and described with respect to FIG. 7, the canonical map 700 may provide a floor plan 706 of reconstructed physical objects in a corresponding physical world, represented by points 702. In some embodiments, a map point 702 may represent a feature of a physical object that may include multiple features. A new tracking map may be captured of the physical world and uploaded to a cloud to be merged with the map 700. The new tracking map may include map points 3802 and key rigs 3804, 3806. In the illustrated example, key rigs 3804 represent key rigs that are successfully localized to the canonical map by, for example, establishing a correspondence with a key rig 704 of the map 700 (as illustrated in FIG. 38B). Key rigs 3806, on the other hand, represent key rigs that have not been localized to the map 700. Key rigs 3806 may be promoted to a separate canonical map in some embodiments.

FIGS. 39A-39F are schematic diagrams illustrating an example of a cloud-based persistent coordinate system that provides a shared experience for users in the same physical space. FIG. 39A shows that a canonical map 4814, for example from a cloud, is received by the XR devices worn by the users 4802A and 4802B of FIGS. 20A-20C. The canonical map 4814 may have a canonical coordinate frame 4806C. The canonical map 4814 may have a PCF 4810C with a plurality of associated PPs (e.g., 4818A, 4818B in FIG. 39C).

FIG. 39B shows that the XR devices have established relationships between their respective world coordinate systems 4806A, 4806B and the canonical coordinate frame 4806C. This may be done, for example, by localizing to the canonical map 4814 on the respective devices. Localizing the tracking map to the canonical map may result, for each device, in a transformation between its local world coordinate system and the coordinate system of the canonical map.
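The role of the per-device transformation can be sketched with plain rigid transforms. In this minimal example, each device is assumed to hold a transform from its own world frame to the canonical frame. The helper names (`make_transform`, `apply`, `invert`), the restriction to rotation about the Z axis, and the numeric values are all illustrative assumptions.

```python
import math


def make_transform(yaw_deg, translation):
    """Rigid transform: rotate about the Z axis by yaw_deg, then translate."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    rotation = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    return rotation, list(translation)


def apply(transform, point):
    """Map a point through the transform: R * p + t."""
    rot, trans = transform
    return tuple(
        sum(rot[i][j] * point[j] for j in range(3)) + trans[i] for i in range(3)
    )


def invert(transform):
    """Inverse of a rigid transform: (R^T, -R^T * t)."""
    rot, trans = transform
    rot_t = [[rot[j][i] for j in range(3)] for i in range(3)]
    neg = [-sum(rot_t[i][j] * trans[j] for j in range(3)) for i in range(3)]
    return rot_t, neg


# Hypothetical localization results: canonical frame <- each device's world frame.
canonical_from_a = make_transform(90, (2, 0, 0))
canonical_from_b = make_transform(0, (-1, 3, 0))

# A point expressed in device A's world frame ...
point_in_a = (1.0, 0.0, 0.0)
# ... is carried into the shared canonical frame ...
point_in_canonical = apply(canonical_from_a, point_in_a)
# ... and from there into device B's world frame.
point_in_b = apply(invert(canonical_from_b), point_in_canonical)
```

With these values the point lands at approximately (2, 1, 0) in the canonical frame and (3, -2, 0) in device B's frame, so both devices can refer to the same physical location even though their local origins differ.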

FIG. 39C shows that, as a result of localization, a transformation (e.g., transformations 4816A, 4816B) can be computed between a local PCF (e.g., PCFs 4810A, 4810B) on the respective device and a respective persistent pose (e.g., PPs 4818A, 4818B) on the canonical map. With these transformations, each device may use its local PCFs, which can be detected locally on the device by processing images captured with sensors on the device, to determine where, relative to the local device, to display virtual content attached to the PPs 4818A, 4818B or to other persistent points of the canonical map. Such an approach may accurately position virtual content with respect to each user and may enable each user to have the same experience of the virtual content in the physical space.

FIG. 39D shows a persistent pose snapshot from the canonical map to the local tracking maps. As can be seen, the local tracking maps are connected to one another via the persistent poses. FIG. 39E shows that the PCF 4810A on the device worn by the user 4802A is accessible in the device worn by the user 4802B through the PP 4818A. FIG. 39F shows that the tracking maps 4804A, 4804B and the canonical map 4814 may be merged. In some embodiments, some PCFs may be removed as a result of merging. In the illustrated example, the merged map includes the PCF 4810C of the canonical map 4814 but not the PCFs 4810A, 4810B of the tracking maps 4804A, 4804B. The PPs previously associated with the PCFs 4810A, 4810B may be associated with the PCF 4810C after the maps are merged.

FIGS. 40 and 41 illustrate an example of the first XR device 12.1 of FIG. 9 generating a tracking map. FIG. 40 is a two-dimensional representation of a three-dimensional first local tracking map (Map 1), which may be generated by the first XR device of FIG. 9, according to some embodiments. FIG. 41 is a block diagram illustrating uploading Map 1 from the first XR device to the server of FIG. 9, according to some embodiments.

FIG. 40 illustrates Map 1 and virtual content (Content 123 and Content 456) on the first XR device 12.1. Map 1 has an origin (Origin 1). Map 1 includes a number of PCFs (PCF a to PCF d). From the perspective of the first XR device 12.1, PCF a, by way of example, is located at the origin of Map 1 and has X, Y, and Z coordinates of (0,0,0), and PCF b has X, Y, and Z coordinates of (-1,0,0). Content 123 is associated with PCF a. In the present example, Content 123 has an X, Y, and Z relationship relative to PCF a of (1,0,0). Content 456 has a relationship relative to PCF b. In the present example, Content 456 has an X, Y, and Z relationship of (1,0,0) relative to PCF b.
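The arithmetic of this example can be written out directly. The sketch below assumes, for simplicity, that each PCF's axes are aligned with the axes of Map 1, so that a content offset can simply be added to the PCF position; the dictionary layout and the function name `map_position` are hypothetical.

```python
# PCF positions in Map 1, relative to Origin 1 (values from the example above).
pcfs = {"PCF a": (0, 0, 0), "PCF b": (-1, 0, 0)}

# Each piece of content stores its PCF and its offset relative to that PCF.
content = {
    "Content 123": ("PCF a", (1, 0, 0)),
    "Content 456": ("PCF b", (1, 0, 0)),
}


def map_position(name):
    """Position of the content in Map 1 (assumes PCF axes match map axes)."""
    pcf_name, offset = content[name]
    return tuple(p + o for p, o in zip(pcfs[pcf_name], offset))


print(map_position("Content 123"))  # (1, 0, 0)
print(map_position("Content 456"))  # (0, 0, 0)
```

Although both pieces of content carry the same offset of (1,0,0), they resolve to different places in the map because they are attached to different PCFs.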

In FIG. 41, the first XR device 12.1 uploads Map 1 to the server 20. The server 20 now has a canonical map based on Map 1. The first XR device 12.1 has a canonical map that is empty at this stage. For purposes of discussion, and in some embodiments, the server 20 includes no maps other than Map 1. No maps are stored on the second XR device 12.2.

The first XR device 12.1 also transmits its Wi-Fi signature data to the server 20. The server 20 may use the Wi-Fi signature data to determine a rough location of the first XR device 12.1, based on intelligence gathered from other devices that have, in the past, connected to the server 20 or to other servers, together with the recorded GPS locations of such other devices. The first XR device 12.1 may now end the first session (see FIG. 8) and may disconnect from the server 20.

FIG. 42 is a schematic diagram illustrating the XR system of FIG. 16, according to some embodiments, showing that a second user 14.2 has initiated a second session using a second XR device of the XR system after the first user 14.1 has terminated the first session. FIG. 43A is a block diagram showing the initiation of the second session by the second user 14.2. The first user 14.1 is shown in phantom lines because the first session by the first user 14.1 has ended. The second XR device 12.2 begins to record objects. Various systems with varying degrees of granularity may be used by the server 20 to determine that the second session by the second XR device 12.2 is in the same vicinity as the first session by the first XR device 12.1. For example, Wi-Fi signature data, global positioning system (GPS) positioning data, GPS data based on Wi-Fi signature data, or any other data that indicates a location may be included in the first and second XR devices 12.1 and 12.2 to record their locations. Alternatively, the PCFs identified by the second XR device 12.2 may show a similarity to the PCFs of Map 1.

As shown in FIG. 43B, the second XR device boots up and begins to collect data, such as images 1110, from one or more cameras 44, 46. As shown in FIG. 14, in some embodiments, an XR device (e.g., the second XR device 12.2) may collect one or more images 1110 and perform image processing to extract one or more features or points of interest 1120. Each feature may be converted to a descriptor 1130. In some embodiments, the descriptors 1130 may be used to describe a key frame 1140, which may have the position and direction of the associated image attached. One or more key frames 1140 may correspond to a single persistent pose 1150, which may be automatically generated after a threshold distance from the previous persistent pose 1150, e.g., 3 meters. One or more persistent poses 1150 may correspond to a single PCF 1160, which may be automatically generated after a predetermined distance, e.g., every 5 meters. Over time, as the user continues to move around the user's environment and the XR device continues to collect more data, such as images 1110, additional PCFs (e.g., PCF 3 and PCF 4,5) may be created. An application or two 1180 may run on the XR device and provide virtual content 1170 to the XR device for presentation to the user. The virtual content may have an associated content coordinate frame, which may be placed relative to one or more PCFs. As shown in FIG. 43B, the second XR device 12.2 creates three PCFs. In some embodiments, the second XR device 12.2 may attempt to localize to one or more canonical maps stored on the server 20.
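The distance-triggered creation of persistent poses and PCFs described above can be sketched as follows. This is a simplification under stated assumptions: a first persistent pose and PCF are assumed to be created at the start of the path, the two spacing checks run independently of each other, and the function name `spawn_anchors` and its parameters are hypothetical. Only the 3 meter and 5 meter example distances come from the text.

```python
import math


def spawn_anchors(path, pp_spacing=3.0, pcf_spacing=5.0):
    """Return (persistent_poses, pcfs) created while moving along path.

    path: list of (x, y, z) device positions in meters.
    A new anchor is created once the device is at least the given
    spacing away from the most recently created anchor of that kind.
    """
    persistent_poses = [path[0]]  # assumption: one anchor at session start
    pcfs = [path[0]]
    for position in path[1:]:
        if math.dist(position, persistent_poses[-1]) >= pp_spacing:
            persistent_poses.append(position)
        if math.dist(position, pcfs[-1]) >= pcf_spacing:
            pcfs.append(position)
    return persistent_poses, pcfs


# Hypothetical walk: 10 meters along the X axis in 1 meter steps.
path = [(float(x), 0.0, 0.0) for x in range(11)]
poses, pcfs = spawn_anchors(path)
print([p[0] for p in poses])  # [0.0, 3.0, 6.0, 9.0]
print([p[0] for p in pcfs])   # [0.0, 5.0, 10.0]
```

A fuller model would also record which key frames belong to each persistent pose and which persistent poses belong to each PCF, as the text describes.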

In some embodiments, as shown in FIG. 43C, the second XR device 12.2 may download the canonical map 120 from the server 20. Map 1 on the second XR device 12.2 includes PCFs a to d and Origin 1. In some embodiments, the server 20 may have multiple canonical maps for various locations, may determine that the second XR device 12.2 is in the same vicinity as the first XR device 12.1 was during the first session, and may send the canonical map for that vicinity to the second XR device 12.2.

FIG. 44 shows the second XR device 12.2 beginning to identify PCFs for purposes of generating Map 2. The second XR device 12.2 has identified only a single PCF, namely PCF 1,2. The X, Y, and Z coordinates of PCF 1,2 for the second XR device 12.2 may be (1,1,1). Map 2 has its own origin (Origin 2), which may be based on the head pose of device 2 at device start-up for the current head pose session. In some embodiments, the second XR device 12.2 may immediately attempt to localize Map 2 to the canonical map. In some embodiments, Map 2 may not be able to localize to the canonical map (Map 1) because the system does not recognize any or enough overlap between the two maps (i.e., localization may fail). In some embodiments, the system may localize based on a PCF comparison between the local map and the canonical map. In some embodiments, the system may localize based on a persistent pose comparison between the local map and the canonical map. In some embodiments, the system may localize based on a key frame comparison between the local map and the canonical map.

FIG. 45 shows Map 2 after the second XR device 12.2 has identified further PCFs (PCF 1,2, PCF 3, PCF 4,5) of Map 2. The second XR device 12.2 again attempts to localize Map 2 to the canonical map. Because Map 2 has expanded to overlap with at least a portion of the canonical map, the localization attempt will succeed. In some embodiments, the overlap between the local tracking map (Map 2) and the canonical map may be represented by PCFs, persistent poses, key frames, or any other suitable intermediate or derivative construct.

Furthermore, the second XR device 12.2 has associated Content 123 and Content 456 with PCF 1,2 and PCF 3 of Map 2. Content 123 has X, Y, and Z coordinates relative to PCF 1,2 of (1,0,0). Similarly, the X, Y, and Z coordinates of Content 456 relative to PCF 3 in Map 2 are (1,0,0).

FIGS. 46A and 46B illustrate a successful localization of Map 2 to the canonical map. The overlapping area, volume, or section of the maps 1410 represents the portion in common with Map 1 and the canonical map. Because Map 2 created PCF 3 and PCF 4,5 before localizing, and the canonical map created PCFs a and c before Map 2 was created, different PCFs were created to represent the same volume in real space (e.g., in different maps).

As shown in FIG. 47, the second XR device 12.2 expands Map 2 to include PCFs a to d from the canonical map. The inclusion of PCFs a to d represents the localization of Map 2 to the canonical map. In some embodiments, the XR system may perform an optimization step to remove duplicate PCFs from overlapping areas, such as the PCFs in 1410, namely PCF 3 and PCF 4,5. After Map 2 localizes, the placement of virtual content, such as Content 456 and Content 123, will be relative to the closest updated PCFs in the updated Map 2. The virtual content appears in the same real-world location relative to the user, despite the changed PCF attachment for the content and despite the updated PCFs for Map 2.
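Re-attaching content to the closest updated PCF without moving it in the real world amounts to recomputing the stored offset. The sketch below illustrates this under simplifying assumptions: all positions are expressed in one common map frame, PCF axes are aligned with that frame, and the function name `reattach` and the numeric positions are hypothetical.

```python
import math


def reattach(content_pos, pcfs):
    """Pick the nearest PCF and return (pcf_name, offset_from_that_pcf).

    content_pos: content position in the map frame.
    pcfs: dict mapping PCF name to its position in the same frame.
    """
    name = min(pcfs, key=lambda n: math.dist(pcfs[n], content_pos))
    offset = tuple(c - p for c, p in zip(content_pos, pcfs[name]))
    return name, offset


# Before the update: content attached to a PCF at (4, 0, 0) with offset (1, 0, 0).
old_pcf_pos, old_offset = (4, 0, 0), (1, 0, 0)
content_pos = tuple(p + o for p, o in zip(old_pcf_pos, old_offset))

# After the update: the old PCF was removed as a duplicate (hypothetical layout).
updated_pcfs = {"PCF a": (0, 0, 0), "PCF c": (6, 0, 0)}
new_pcf, new_offset = reattach(content_pos, updated_pcfs)

# The attachment changed, but the resolved position did not.
new_pos = tuple(p + o for p, o in zip(updated_pcfs[new_pcf], new_offset))
print(new_pcf, new_offset, new_pos)  # PCF c (-1, 0, 0) (5, 0, 0)
```

The content is now stored relative to a different PCF with a different offset, yet it resolves to the same position, which matches the behavior described for the updated Map 2.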

As shown in FIG. 48, the second XR device 12.2 continues to expand Map 2 as further PCFs (PCFs e, f, g, and h) are identified by the second XR device 12.2, for example as the user walks around the real world. Note also that Map 1 has not expanded in FIGS. 47 and 48.

Referring to FIG. 49, the second XR device 12.2 uploads Map 2 to the server 20. The server 20 stores Map 2 together with the canonical map. In some embodiments, Map 2 may be uploaded to the server 20 when the session for the second XR device 12.2 ends.

The canonical map within the server 20 now includes PCF i, which is not included in Map 1 on the first XR device 12.1. The canonical map on the server 20 may have been expanded to include PCF i when a third XR device (not shown) uploaded a map that included PCF i to the server 20.

In FIG. 50, the server 20 merges Map 2 with the canonical map to form a new canonical map. The server 20 determines that PCFs a to d are common to the canonical map and Map 2. The server expands the canonical map to include PCFs e to h and PCF 1,2 from Map 2, forming the new canonical map. The canonical maps on the first and second XR devices 12.1 and 12.2 are based on Map 1 and are now outdated.
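The bookkeeping of this merge can be sketched with sets of PCF names. This is a deliberately simplified model: it treats PCFs as matching by name, ignores geometry entirely, and uses a version counter to express that device copies become outdated. The class name `CanonicalMapServer` and the version counter are assumptions for illustration only.

```python
class CanonicalMapServer:
    """Toy model of a server holding one canonical map as a set of PCF names."""

    def __init__(self, pcfs):
        self.pcfs = set(pcfs)
        self.version = 1

    def merge(self, uploaded_pcfs):
        """Merge an uploaded map; return (common PCFs, newly added PCFs)."""
        uploaded = set(uploaded_pcfs)
        common = self.pcfs & uploaded
        if not common:
            # Without shared PCFs the uploaded map cannot be related to ours.
            raise ValueError("no overlap with the canonical map")
        added = uploaded - self.pcfs
        if added:
            self.pcfs |= added
            self.version += 1  # device copies of the old map are now outdated
        return common, added


server = CanonicalMapServer(["PCF a", "PCF b", "PCF c", "PCF d", "PCF i"])
device_version = server.version  # version held by the XR devices

map_2 = ["PCF a", "PCF b", "PCF c", "PCF d",
         "PCF e", "PCF f", "PCF g", "PCF h", "PCF 1,2"]
common, added = server.merge(map_2)

device_is_outdated = device_version < server.version
print(sorted(common))      # ['PCF a', 'PCF b', 'PCF c', 'PCF d']
print(device_is_outdated)  # True
```

In this model a device would request the new canonical map whenever its stored version is lower than the server's.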

In FIG. 51, the server 20 transmits the new canonical map to the first and second XR devices 12.1 and 12.2. In some embodiments, this may occur when the first XR device 12.1 and the second XR device 12.2 attempt to localize during a different, new, or subsequent session. The first and second XR devices 12.1 and 12.2 proceed as described above to localize their respective local maps (Map 1 and Map 2, respectively) to the new canonical map.

As shown in FIG. 52, the head coordinate frame 96, or "head pose", is related to the PCFs in Map 2. In some embodiments, the origin of the map, Origin 2, is based on the head pose of the second XR device 12.2 at the start of the session. As PCFs are created during the session, the PCFs are placed relative to the world coordinate frame, Origin 2. The PCFs of Map 2 serve as persistent coordinate frames relative to a canonical coordinate frame, where the world coordinate frame may be a previous session's world coordinate frame (e.g., Origin 1 of Map 1 in FIG. 40). The transformation from the world coordinate frame to the head coordinate frame 96 has been previously discussed with reference to FIG. 9. The head coordinate frame 96 shown in FIG. 52 has only two orthogonal axes, which are in a particular coordinate position relative to the PCFs of Map 2 and at particular angles relative to Map 2. It should be understood, however, that the head coordinate frame 96 is in a three-dimensional location relative to the PCFs of Map 2 and has three orthogonal axes within three-dimensional space.

In FIG. 53, the head coordinate frame 96 has moved relative to the PCFs of Map 2. The head coordinate frame 96 has moved because the second user 14.2 has moved their head. The user can move their head in six degrees of freedom (6dof). The head coordinate frame 96 can thus move in 6dof, namely in three dimensions from its previous location in FIG. 52 and about three orthogonal axes relative to the PCFs of Map 2. The head coordinate frame 96 is adjusted when the real object detection camera 44 and the inertial measurement unit 48 in FIG. 9 respectively detect real objects and motion of the head unit 22. More information regarding head pose tracking is disclosed in U.S. Patent Application No. 16/221,065, entitled "Enhanced Pose Determination for Display Device", which is hereby incorporated by reference in its entirety.

FIG. 54 shows that sound may be associated with one or more PCFs. A user may, for example, wear headphones or earphones with stereoscopic sound. The location of sound through headphones can be simulated using conventional techniques. The location of sound may be held in a stationary position, so that when the user rotates their head to the left, the location of the sound rotates to the right and the user perceives the sound as coming from the same location in the real world. In the present example, the locations of sound are represented by Sound 123 and Sound 456. For purposes of discussion, FIG. 54 is similar to FIG. 48 in its analysis. When the first and second users 14.1 and 14.2 are located in the same room at the same or different times, they perceive Sound 123 and Sound 456 as coming from the same locations within the real world.
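The compensation described above, where the simulated sound source shifts right as the head turns left, can be sketched in two dimensions. The sign convention (positive angles mean "to the left", counterclockwise when viewed from above) and the function name `bearing_in_head_frame` are assumptions for this illustration; actual spatial audio rendering involves considerably more than a bearing angle.

```python
import math


def bearing_in_head_frame(sound_pos, head_pos, head_yaw_deg):
    """Direction of a world-fixed sound relative to where the head faces.

    Positions are (x, y) in the map frame; angles are in degrees, with
    positive meaning "to the left". The result is wrapped to [-180, 180).
    """
    dx = sound_pos[0] - head_pos[0]
    dy = sound_pos[1] - head_pos[1]
    world_bearing = math.degrees(math.atan2(dy, dx))
    relative = world_bearing - head_yaw_deg
    return (relative + 180.0) % 360.0 - 180.0


sound = (1.0, 0.0)  # hypothetical sound location tied to a PCF
head = (0.0, 0.0)

straight_ahead = bearing_in_head_frame(sound, head, head_yaw_deg=0.0)
after_left_turn = bearing_in_head_frame(sound, head, head_yaw_deg=30.0)
```

With the head facing the sound, the bearing is 0 degrees. After the head turns 30 degrees to the left, the bearing becomes -30 degrees, meaning the sound must be rendered 30 degrees to the right for it to appear fixed in the room.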

FIGS. 55 and 56 illustrate a further implementation of the technology described above. The first user 14.1 initiated a first session as described with reference to FIG. 8. As shown in FIG. 55, the first user 14.1 has terminated the first session, as indicated by the phantom lines. At the end of the first session, the first XR device 12.1 uploaded Map 1 to the server 20. The first user 14.1 has now initiated a second session at a later time than the first session. The first XR device 12.1 does not download Map 1 from the server 20 because Map 1 is already stored on the first XR device 12.1. If Map 1 is lost, the first XR device 12.1 downloads Map 1 from the server 20. The first XR device 12.1 then proceeds to build PCFs for Map 2, localize to Map 1, and further develop a canonical map as described above. Map 2 of the first XR device 12.1 is then used for relating local content, a head coordinate frame, local sound, and so on, as described above.

Referring to FIGS. 57 and 58, it may also be possible for more than one user to interact with the server in the same session. In the present example, the first user 14.1 and the second user 14.2 are joined by a third user 14.3 with a third XR device 12.3. Each XR device 12.1, 12.2, and 12.3 begins to generate its own map, namely Map 1, Map 2, and Map 3, respectively. As the XR devices 12.1, 12.2, and 12.3 continue to develop Maps 1, 2, and 3, the maps are incrementally uploaded to the server 20. The server 20 merges Maps 1, 2, and 3 to form a canonical map. The canonical map is then transmitted from the server 20 to each one of the XR devices 12.1, 12.2, and 12.3.

FIG. 59 illustrates aspects of a viewing method for recovering and/or resetting head pose, according to some embodiments. In the illustrated example, at Act 1400, the viewing device is powered on. At Act 1410, in response to being powered on, a new session is initiated. In some embodiments, a new session may include establishing head pose. One or more capture devices on a head-mounted frame secured to the head of the user capture surfaces of the environment by first capturing images of the environment and then determining the surfaces from the images. In some embodiments, the surface data may also be combined with data from a gravitational sensor to establish head pose. Other suitable methods of establishing head pose may be used.

At Act 1420, a processor of the viewing device enters a routine for tracking head pose. The capture devices continue to capture surfaces of the environment as the user moves their head, and continue to determine the orientation of the head-mounted frame relative to the surfaces.

At Act 1430, the processor determines whether head pose has been lost. Head pose may become lost due to "edge" cases that may result in low feature acquisition, such as too many reflective surfaces, low light, blank walls, or being outdoors, or due to dynamic cases, such as a crowd that moves and forms part of the map. The routine at 1430 allows a certain amount of time, for example 10 seconds, to pass so that there is enough time to determine whether head pose has been lost. If head pose has not been lost, the processor returns to 1420 and again enters tracking of head pose.

If head pose has been lost at Act 1430, the processor enters a routine at 1440 to recover head pose. If head pose was lost due to low light, a message such as the following is displayed to the user through a display of the viewing device:

The system is detecting a low light condition. Please move to an area where there is more light.

The system will continue to monitor whether sufficient light is available and whether head pose can be recovered. The system may alternatively determine that low texture of surfaces is causing head pose to be lost, in which case the user is given the following prompt in the display as a suggestion to improve capturing of surfaces:

The system cannot detect enough surfaces with fine textures. Please move to an area where the surfaces are less rough in texture and more refined in texture.

At Act 1450, the processor enters a routine to determine whether head pose recovery has failed. If head pose recovery has not failed (i.e., head pose recovery has succeeded), the processor returns to Act 1420 by again entering tracking of head pose. If head pose recovery has failed, the processor returns to Act 1410 to establish a new session. As part of the new session, all cached data is invalidated, whereafter head pose is established anew. Any suitable method of head tracking may be used in combination with the process described in FIG. 59. U.S. Patent Application No. 16/221,065 describes head tracking and is hereby incorporated by reference in its entirety.
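The loop through Acts 1410 to 1450 can be read as a small state machine. The following is a minimal sketch only, assuming three states and two boolean inputs; the names `State`, `step`, `pose_lost`, `recovery_failed`, and `recovery_prompt` are hypothetical and do not come from the disclosure.

```python
from enum import Enum, auto


class State(Enum):
    NEW_SESSION = auto()   # Act 1410: start a session and establish head pose
    TRACKING = auto()      # Act 1420: track head pose
    RECOVERING = auto()    # Act 1440: try to recover head pose


def step(state, pose_lost=False, recovery_failed=False):
    """Return the next state for one pass through the FIG. 59 loop.

    pose_lost and recovery_failed are hypothetical inputs standing in for
    the determinations made at Acts 1430 and 1450.
    """
    if state is State.NEW_SESSION:
        # A new session invalidates cached data, establishes head pose anew,
        # and then hands over to tracking.
        return State.TRACKING
    if state is State.TRACKING:
        # Act 1430: stay in tracking unless head pose has been lost.
        return State.RECOVERING if pose_lost else State.TRACKING
    if state is State.RECOVERING:
        # Act 1450: a failed recovery restarts the session; success resumes tracking.
        return State.NEW_SESSION if recovery_failed else State.TRACKING
    raise ValueError(f"unknown state: {state}")


def recovery_prompt(cause):
    """Pick the user prompt for a loss cause ('low_light' or 'low_texture')."""
    prompts = {
        "low_light": "The system is detecting a low light condition. "
                     "Please move to an area where there is more light.",
        "low_texture": "The system cannot detect enough surfaces with fine "
                       "textures. Please move to an area where the surfaces "
                       "are less rough in texture and more refined in texture.",
    }
    return prompts.get(cause, "")
```

The 10-second wait at Act 1430 is left out of the sketch; a caller could apply it before setting `pose_lost`.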

FIG. 60 shows a diagrammatic representation of a machine in the exemplary form of a computer system 1900 within which a set of instructions, for causing the machine to perform any one or more of the methodologies discussed herein, may be executed, according to some embodiments. In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. Further, while only a single machine is illustrated, the term "machine" shall also be taken to include any collection of machines that individually or jointly execute a set (or multiple sets) of instructions to perform any one or more of the methodologies discussed herein.

The exemplary computer system 1900 includes a processor 1902 (e.g., a central processing unit (CPU), a graphics processing unit (GPU), or both), a main memory 1904 (e.g., read-only memory (ROM), flash memory, dynamic random access memory (DRAM) such as synchronous DRAM (SDRAM) or Rambus DRAM (RDRAM), etc.), and a static memory 1906 (e.g., flash memory, static random access memory (SRAM), etc.), which communicate with each other via a bus 1908.

The computer system 1900 may further include a disk drive unit 1916 and a network interface device 1920.

The disk drive unit 1916 includes a machine-readable medium 1922 on which is stored one or more sets of instructions 1924 (e.g., software) embodying any one or more of the methodologies or functions described herein. The software may also reside, completely or at least partially, within the main memory 1904 and/or within the processor 1902 during execution thereof by the computer system 1900, the main memory 1904 and the processor 1902 also constituting machine-readable media.

The software may further be transmitted or received over a network 18 via the network interface device 1920.

The computer system 1900 includes a driver chip 1950 that is used to drive projectors to generate light. The driver chip 1950 includes its own data store 1960 and its own processor 1962.

While the machine-readable medium 1922 is shown in an exemplary embodiment to be a single medium, the term "machine-readable medium" should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The term "machine-readable medium" shall also be taken to include any medium that is capable of storing, encoding, or carrying a set of instructions for execution by the machine and that causes the machine to perform any one or more of the methodologies of the present invention. The term "machine-readable medium" shall accordingly be taken to include, but not be limited to, solid-state memories, optical and magnetic media, and carrier wave signals.

Having thus described several aspects of some embodiments, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art.

As one example, embodiments are described in connection with an augmented reality (AR) environment. It should be appreciated that some or all of the techniques described herein may be applied in an MR environment or, more generally, in other XR environments and in VR environments.

Further, FIG. 29 provides examples of criteria that may be used to filter candidate maps to yield a set of high-ranking maps. Other criteria may be used instead of or in addition to the described criteria. For example, if multiple candidate maps have similar values of a metric used for filtering out less desirable maps, characteristics of the candidate maps may be used to determine which maps are retained as candidate maps and which are filtered out. For example, larger or more dense candidate maps may be prioritized over smaller candidate maps.
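One way to picture the tie-breaking described above is a sort in which maps with similar metric values fall back to size and density. This is a rough sketch under stated assumptions: the dictionary keys `score`, `area`, and `density`, the function name `rank_candidates`, and the bucketing of scores by a tolerance are all hypothetical choices, not criteria taken from FIG. 29.

```python
def rank_candidates(maps, similarity_tolerance=0.05):
    """Order candidate maps by score, breaking near-ties by area, then density.

    Each map is a dict with hypothetical keys 'score', 'area', and 'density'.
    Scores are grouped into buckets of width similarity_tolerance so that
    maps in the same bucket are treated as having similar metric values.
    """
    def sort_key(candidate):
        bucket = round(candidate["score"] / similarity_tolerance)
        return (bucket, candidate["area"], candidate["density"])

    # Highest bucket first; within a bucket, larger and denser maps first.
    return sorted(maps, key=sort_key, reverse=True)
```

Bucketing is a simplification: two scores that straddle a bucket boundary are not treated as similar even when they are close, so a production ranking would likely compare scores pairwise instead.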

Such alterations, modifications, and improvements are intended to be part of this disclosure and to be within the spirit and scope of the disclosure. Further, though advantages of the present disclosure are indicated, it should be appreciated that not every embodiment of the disclosure will include every described advantage. In some instances, an embodiment may not implement any of the features described herein as advantageous. Accordingly, the foregoing description and drawings are by way of example only.

Some embodiments relate to a portable electronic system including a sensor configured to capture information about a three-dimensional (3D) environment and output images, each image comprising a plurality of pixels, and at least one processor configured to execute computer-executable instructions to process the images output by the sensor. The computer-executable instructions comprise instructions for: receiving a plurality of images captured by the sensor; for at least a subset of the plurality of images, identifying, for each image of the subset, one or more features in the plurality of pixels, each feature corresponding to one or more pixels; computing a feature descriptor for each feature of the one or more features; and, for each image of the subset, computing a frame descriptor to represent the image based, at least in part, on the computed feature descriptors in the image.

In some embodiments, the sensor comprises at least one million pixel circuits. The frame descriptor for each of the plurality of images comprises 512 or fewer numbers.

In some embodiments, the computer-executable instructions further comprise instructions for constructing a map of at least a portion of the 3D environment, and for associating the feature descriptors for respective frames with portions of the map generated at least in part from the respective frames.

In some embodiments, the computer-executable instructions comprise instructions for selecting, as the subset of the plurality of images, one or more key frames from the plurality of images based, at least in part, on the locations of the images with respect to the 3D environment and the plurality of pixels of the plurality of images.

In some embodiments, the computer-executable instructions comprise instructions for identifying, for a key frame of the one or more key frames, one or more frames associated with a map of the 3D environment, the one or more frames having frame descriptors less than a threshold distance from the frame descriptor for the key frame.

In some embodiments, the computer-executable instructions for computing the frame descriptor comprise an artificial neural network.

In some embodiments, the artificial neural network is trained on similar and dissimilar images and comprises a multi-layer perceptron unit configured to receive as input a plurality of values representing features in an image and to provide as output weighted combinations of those values, and a max pool unit configured to select a subset of the outputs of the multi-layer perceptron unit as the frame descriptor.
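The multi-layer perceptron followed by max pooling can be illustrated with a small pure-Python sketch. It assumes fully connected layers with a ReLU activation and an element-wise maximum over the per-feature outputs; the activation, the layer shapes, and the names `mlp` and `frame_descriptor` are assumptions, and real weights would come from training.

```python
def mlp(vector, layers):
    """Apply fully connected layers with ReLU.

    layers is a list of (weights, biases) pairs, where weights is a list of
    rows and each row has one weight per input value.
    """
    for weights, biases in layers:
        vector = [
            max(0.0, sum(w * x for w, x in zip(row, vector)) + b)
            for row, b in zip(weights, biases)
        ]
    return vector


def frame_descriptor(feature_descriptors, layers):
    """Max pool the per-feature MLP outputs element-wise into one descriptor."""
    outputs = [mlp(feature, layers) for feature in feature_descriptors]
    # zip(*outputs) groups the i-th output of every feature together.
    return [max(column) for column in zip(*outputs)]
```

Because the maximum is taken per output position, the frame descriptor has a fixed length regardless of how many features the image yields.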

Some embodiments relate to a method of operating a computing system to generate a map of at least a portion of a three-dimensional (3D) environment based on sensor data collected by a device worn by a user. The method includes: receiving a plurality of images captured by the device worn by the user; determining one or more key frames from the plurality of images; identifying, with a first artificial neural network, one or more interest points in the one or more key frames; computing, with the first artificial neural network, a feature descriptor for individual interest points; and computing, for each of the one or more key frames, with a second artificial neural network, a frame descriptor to represent the key frame based, at least in part, on the computed feature descriptors for the identified interest points in the key frame.

In some embodiments, the first and second artificial neural networks are sub-networks of an artificial neural network.

In some embodiments, the frame descriptors are unique to the individual key frames.

In some embodiments, each of the one or more key frames has a resolution higher than 1 megabyte. The frame descriptor for each of the one or more key frames is a string of fewer than 512 numbers.

In some embodiments, each feature descriptor is a string of 32 bytes.

In some embodiments, the frame descriptor is generated by max pooling the feature descriptors.

In some embodiments, the method includes receiving a new image captured by the device worn by the user, and identifying one or more nearest key frames in a database comprising the key frames used to generate the map, the one or more nearest key frames having frame descriptors within a predetermined distance of the frame descriptor for the new image.
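The nearest key frame lookup can be sketched as a linear scan over stored frame descriptors. This is an illustration only: the Euclidean metric, the dictionary layout, and the name `nearest_key_frames` are assumptions, since the text does not specify a distance measure or a database structure.

```python
import math


def nearest_key_frames(query, database, max_distance):
    """Return (distance, key_frame_id) pairs within max_distance, closest first.

    database maps a hypothetical key frame id to its frame descriptor, and
    query is the frame descriptor computed for the new image.
    """
    hits = []
    for key_frame_id, descriptor in database.items():
        distance = math.dist(query, descriptor)  # Euclidean distance (assumed)
        if distance <= max_distance:
            hits.append((distance, key_frame_id))
    return sorted(hits)
```

A linear scan is adequate for a sketch; a large database would more plausibly use an approximate nearest-neighbor index.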

In some embodiments, the method includes performing feature matching against 3D map points of the map that correspond to the identified one or more nearest key frames, and computing a pose of the device worn by the user based on the feature matching results.

In some embodiments, determining one or more key frames from the plurality of images includes comparing pixels of a first image with pixels of a second image taken immediately after the first image, and identifying the second image as a key frame when the differences between the pixels of the first image and the pixels of the second image are above or below a threshold value.
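As a minimal sketch of the pixel comparison, the following treats each image as a flat list of grayscale values and marks an image as a key frame when its mean absolute difference from the preceding image exceeds a threshold. The difference measure, the "above" direction of the comparison, and the names `mean_abs_difference` and `select_key_frames` are assumptions chosen for illustration.

```python
def mean_abs_difference(first, second):
    """Mean absolute difference between two equally sized grayscale images."""
    if len(first) != len(second):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(first, second)) / len(first)


def select_key_frames(images, threshold):
    """Return indices of images that differ enough from the preceding image."""
    key_frames = []
    for index in range(1, len(images)):
        difference = mean_abs_difference(images[index - 1], images[index])
        if difference > threshold:
            key_frames.append(index)
    return key_frames
```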

In some embodiments, the method includes training the second artificial neural network by: generating a dataset comprising a plurality of image sets, each of the plurality of image sets including a query image, a positive sample image, and a negative sample image; computing, for each image set of the plurality of image sets in the dataset, a loss by comparing the query image with the positive sample image and the negative sample image; and modifying the second artificial neural network based on the computed loss such that the distance between the frame descriptor generated by the second artificial neural network for the query image and the frame descriptor for the positive sample image is less than the distance between the frame descriptor for the query image and the frame descriptor for the negative sample image.
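Training on query, positive, and negative images is commonly expressed as a triplet loss, which is zero once the positive descriptor is closer to the query than the negative descriptor by some margin. The sketch below shows that standard formulation as an assumption; the text does not name the loss function, and the margin value and the name `triplet_loss` are illustrative.

```python
import math


def triplet_loss(query, positive, negative, margin=1.0):
    """Hinge-style triplet loss on Euclidean distances between descriptors.

    The loss falls to zero when the query-to-positive distance is smaller
    than the query-to-negative distance by at least the margin.
    """
    distance_positive = math.dist(query, positive)
    distance_negative = math.dist(query, negative)
    return max(0.0, distance_positive - distance_negative + margin)
```

Minimizing this loss over the dataset pushes descriptors of similar images together and descriptors of dissimilar images apart.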

Some embodiments relate to a computing environment for a cross reality system. The computing environment includes a database storing a plurality of maps. Each map comprises information representing regions of a 3D environment. The information representing each region comprises a frame descriptor representing an image of the region. The computing environment also includes non-transitory computer storage media storing computer-executable instructions that, when executed by at least one processor in the computing environment: process an image captured by a portable device by identifying a plurality of features in the image; compute a feature descriptor for each of the plurality of features; compute a frame descriptor to represent the image based, at least in part, on the computed feature descriptors for the one or more identified interest points in the image; and select a map in the database based on a comparison between the computed frame descriptor and the frame descriptors stored in the database of maps.

In some embodiments, the frame descriptors are unique to the frames stored in the database.

In some embodiments, the image has a resolution higher than 1 megabyte. The frame descriptor computed to represent the image is a string of fewer than 512 numbers.

In some embodiments, the computer-executable instructions comprise an artificial neural network trained by: processing a dataset comprising a plurality of image sets, each of the plurality of image sets including a query image, a positive sample image, and a negative sample image; computing a loss for each image set of the plurality of image sets in the dataset by comparing the query image with the positive sample image and the negative sample image; and modifying the artificial neural network based on the computed loss such that the distance between the frame descriptor generated by the artificial neural network for the query image and the frame descriptor for the positive sample image is less than the distance between the frame descriptor for the query image and the frame descriptor for the negative sample image.

In some embodiments, modifying the artificial neural network comprises modifying a copy of the artificial neural network on a portable device in the computing environment.

In some embodiments, the computing environment comprises a cloud platform and a plurality of portable devices in communication with the cloud platform. The cloud platform comprises the database and the computer-executable instructions for selecting the map. The computer-executable instructions for processing an image captured by a portable device are stored on the portable device.

Some embodiments relate to an XR system including a first XR device that includes: a first processor; a first computer-readable medium connected to the first processor; a first origin coordinate frame stored on the first computer-readable medium; a first destination coordinate frame stored on the computer-readable medium; a first data channel to receive data representing local content; a first coordinate frame transformer executable by the first processor to transform a positioning of the local content from the first origin coordinate frame to the first destination coordinate frame; and a first display system adapted to display the local content to a first user after the positioning of the local content has been transformed from the first origin coordinate frame to the first destination coordinate frame.

Some embodiments relate to a viewing method including: storing a first origin coordinate frame; storing a first destination coordinate frame; receiving data representing local content; transforming a positioning of the local content from the first origin coordinate frame to the first destination coordinate frame; and displaying the local content to a first user after the positioning of the local content has been transformed from the first origin coordinate frame to the first destination coordinate frame.
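Transforming a position from an origin coordinate frame to a destination coordinate frame can be sketched by expressing both frames in a shared parent frame. The example below is a simplification that assumes each frame is described only by a yaw angle about the vertical axis and an origin point; the frame representation and the names `make_frame`, `to_parent`, `from_parent`, and `transform` are hypothetical.

```python
import math


def make_frame(yaw_radians, origin):
    """Describe a frame by its yaw about z and its origin in a shared parent."""
    return {"yaw": yaw_radians, "origin": origin}


def to_parent(frame, point):
    """Express a point given in the frame's coordinates in parent coordinates."""
    c, s = math.cos(frame["yaw"]), math.sin(frame["yaw"])
    x, y, z = point
    ox, oy, oz = frame["origin"]
    return (c * x - s * y + ox, s * x + c * y + oy, z + oz)


def from_parent(frame, point):
    """Express a point given in parent coordinates in the frame's coordinates."""
    c, s = math.cos(frame["yaw"]), math.sin(frame["yaw"])
    ox, oy, oz = frame["origin"]
    x, y, z = point[0] - ox, point[1] - oy, point[2] - oz
    # Inverse rotation is the transpose of the forward rotation.
    return (c * x + s * y, -s * x + c * y, z)


def transform(point, origin_frame, destination_frame):
    """Move a point from the origin frame to the destination frame."""
    return from_parent(destination_frame, to_parent(origin_frame, point))
```

A full implementation would carry complete 3D rotations, for example as 4x4 homogeneous matrices, but the composition of "out of one frame, into the other" stays the same.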

Some embodiments relate to an XR system including: a map storing routine that stores a first map, which is a canonical map having a plurality of persistent coordinate frames (PCFs), each PCF of the first map having a set of coordinates; a real object detection device positioned to detect locations of real objects; a PCF identification system connected to the real object detection device to detect, based on the locations of the real objects, PCFs of a second map, each PCF of the second map having a set of coordinates; and a localization module connected to the canonical map and the second map and executable to localize the second map to the canonical map by matching a first PCF of the second map to a first PCF of the canonical map and matching a second PCF of the second map to a second PCF of the canonical map.

In some embodiments, the real object detection device is a real object detection camera.

In some embodiments, the XR system further comprises a canonical map incorporator connected to the canonical map and the second map and executable to incorporate a third PCF of the canonical map into the second map.

In some embodiments, the XR system further comprises: a head unit comprising a head-mountable frame, the real object detection device being mounted to the head-mountable frame; a data channel to receive image data of local content; a local content position system connected to the data channel and executable to relate the local content to one PCF of the canonical map; and a display system connected to the local content position system to display the local content.

In some embodiments, the XR system further comprises a local-to-world coordinate transformer that transforms a local coordinate frame of the local content to a world coordinate frame of the second map.

In some embodiments, the XR system further comprises: a first world frame determining routine to calculate a first world coordinate frame based on the PCFs of the second map; a first world frame storing instruction to store the world coordinate frame; a head frame determining routine to calculate a head coordinate frame that changes upon movement of the head-mountable frame; a head frame storing instruction to store the first head coordinate frame; and a world-to-head coordinate transformer that transforms the world coordinate frame to the head coordinate frame.

In some embodiments, the head coordinate frame changes relative to the world coordinate frame when the head-mountable frame moves.

In some embodiments, the XR system further comprises at least one sound element that is related to at least one PCF of the second map.

In some embodiments, the first and second maps are created by the XR device.

In some embodiments, the XR system further comprises first and second XR devices. Each XR device includes: a head unit comprising a head-mountable frame, the real object detection device being mounted to the head-mountable frame; a data channel to receive image data of local content; a local content position system connected to the data channel and executable to relate the local content to one PCF of the canonical map; and a display system connected to the local content position system to display the local content.

In some embodiments, the first XR device creates PCFs for the first map, the second XR device creates PCFs for the second map, and the localization module forms part of the second XR device.

In some embodiments, the first and second maps are created in first and second sessions, respectively.

In some embodiments, the XR system further comprises a server, and a map download system that forms part of the XR device and downloads the first map from the server over a network.

In some embodiments, the localization module repeatedly attempts to localize the second map to the canonical map.

In some embodiments, the XR system further comprises a map publisher that uploads the second map to the server over the network.

Some embodiments relate to a viewing method including: storing a first map, which is a canonical map having a plurality of PCFs, each PCF of the canonical map having a set of coordinates; detecting locations of real objects; detecting, based on the locations of the real objects, PCFs of a second map, each PCF of the second map having a set of coordinates; and localizing the second map to the canonical map by matching a first PCF of the second map to a first PCF of the first map and matching a second PCF of the second map to a second PCF of the canonical map.
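Matching two PCFs in each map is enough, in a planar simplification, to fix the rotation and translation that place the second map in the canonical map. The sketch below assumes 2D PCF positions and uses complex numbers to represent the rigid transform; the reduction to 2D and the names `localize` and `to_canonical` are assumptions, and actual PCFs carry full 3D coordinates.

```python
def localize(second_pcfs, canonical_pcfs):
    """Estimate the rotation and translation from second map to canonical map.

    Each argument is a pair of matched 2D PCF positions as (x, y) tuples,
    with second_pcfs[i] corresponding to canonical_pcfs[i].
    """
    p1, p2 = (complex(*p) for p in second_pcfs)
    c1, c2 = (complex(*c) for c in canonical_pcfs)
    if p1 == p2 or c1 == c2:
        raise ValueError("the two PCFs in each map must be distinct")
    rotation = (c2 - c1) / (p2 - p1)
    rotation /= abs(rotation)            # keep the rotation, discard any scale
    translation = c1 - rotation * p1     # makes the first PCF pair coincide
    return rotation, translation


def to_canonical(point, rotation, translation):
    """Map a 2D point of the second map into canonical map coordinates."""
    mapped = rotation * complex(*point) + translation
    return (mapped.real, mapped.imag)
```

Once the transform is known, any content placed relative to the second map can be expressed in canonical coordinates with `to_canonical`.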

Some embodiments relate to an XR system comprising a server that may have: a processor; a computer-readable medium connected to the processor; a plurality of canonical maps on the computer-readable medium; a respective canonical map identifier on the computer-readable medium associated with each respective canonical map, the canonical map identifiers differing from one another to uniquely identify the canonical maps; a position detector on the computer-readable medium and executable by the processor to receive and store a position identifier from an XR device; a first filter on the computer-readable medium and executable by the processor to compare the position identifier with the canonical map identifiers to determine one or more canonical maps that form a first filtered selection; and a map transmitter on the computer-readable medium and executable by the processor to transmit one or more of the canonical maps to the XR device based on the first filtered selection.

In some embodiments, each canonical map identifier includes a longitude and a latitude, and the position identifier includes a longitude and a latitude.

In some embodiments, the first filter is a neighboring areas filter that selects at least one matching canonical map covering a longitude and latitude that include the longitude and latitude of the position identifier, and at least one neighboring map covering a longitude and latitude adjacent to the first matching canonical map.
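A neighboring areas filter can be pictured as a grid lookup: the matching map is the one whose grid cell contains the device position, and neighboring maps are those in adjacent cells. The grid tiling, the cell size, and the names `cell_of` and `neighboring_area_filter` in this sketch are assumptions; the text does not say how map coverage is represented.

```python
import math


def cell_of(latitude, longitude, cell_degrees=0.01):
    """Return the (row, column) grid cell that contains a latitude/longitude."""
    return (math.floor(latitude / cell_degrees),
            math.floor(longitude / cell_degrees))


def neighboring_area_filter(position, maps, cell_degrees=0.01):
    """Return ids of maps in the position's cell or in an adjacent cell.

    position is a (latitude, longitude) pair and maps is a dict from a
    hypothetical map id to the (latitude, longitude) of its identifier.
    """
    row, column = cell_of(position[0], position[1], cell_degrees)
    selected = []
    for map_id, (latitude, longitude) in maps.items():
        map_row, map_column = cell_of(latitude, longitude, cell_degrees)
        if abs(map_row - row) <= 1 and abs(map_column - column) <= 1:
            selected.append(map_id)
    return selected
```

The sketch ignores longitude wrap-around at the antimeridian and the narrowing of cells toward the poles.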

In some embodiments, the position identifier includes a WiFi fingerprint. The XR system further includes a second filter that is a WiFi fingerprint filter on the computer-readable medium and executable by the processor to determine a latitude and longitude based on the WiFi fingerprint, compare the latitude and longitude from the WiFi fingerprint filter with the latitudes and longitudes of the canonical maps, and determine one or more canonical maps that form a second filtered selection within the first filtered selection, wherein the map transmitter transmits one or more canonical maps based on the second selection and does not transmit canonical maps based on the first selection that are outside the second selection.

In some embodiments, the first filter is a WiFi fingerprint filter on the computer-readable medium and executable by the processor to determine a latitude and longitude based on the WiFi fingerprint, compare the latitude and longitude from the WiFi fingerprint filter with the latitudes and longitudes of the canonical maps, and determine one or more canonical maps that form the first filtered selection.

In some embodiments, the XR system further includes: a multilayer perceptron unit on the computer-readable medium and executable by the processor to receive a plurality of features of an image and convert each feature to a respective string of numbers; a max pool unit on the computer-readable medium and executable by the processor to combine a maximum value of each string of numbers into a global feature string representing the image, wherein each canonical map has at least one of the global feature strings, and the position identifier received from the XR device includes features of an image captured by the XR device that are processed by the multilayer perceptron unit and the max pool unit to determine a global feature string of the image; and a key frame filter that compares the global feature string of the image with the global feature strings of the canonical maps to determine one or more canonical maps that form a third filtered selection within the second filtered selection, wherein the map transmitter transmits one or more canonical maps based on the third selection and does not transmit canonical maps based on the second selection that are outside the third selection.

In some embodiments, the XR system further includes: a multilayer perceptron unit on the computer-readable medium and executable by the processor to receive a plurality of features of an image and convert each feature to a respective string of numbers; and a max pool unit on the computer-readable medium and executable by the processor to combine a maximum value of each string of numbers into a global feature string representing the image, wherein each canonical map has at least one of the global feature strings, and the position identifier received from the XR device includes features of an image captured by the XR device that are processed by the multilayer perceptron unit and the max pool unit to determine a global feature string of the image. The first filter is a key frame filter that compares the global feature string of the image with the global feature strings of the canonical maps to determine one or more canonical maps.

In some embodiments, the XR system includes an XR device that includes: a head unit comprising a head-mountable frame, wherein a real object detection device is mounted to the head-mountable frame; a data channel to receive image data of local content; a local content position system connected to the data channel and executable to relate the local content to one PCF of the canonical map; and a display system connected to the local content position system to display the local content.

In some embodiments, the XR device includes: a map storing routine to store a first map, being a canonical map, having a plurality of PCFs, each PCF of the first map having a set of coordinates; a real object detection device positioned to detect locations of real objects; a PCF identification system connected to the real object detection device to detect, based on the locations of the real objects, PCFs of a second map, each PCF of the second map having a set of coordinates; and a localization module connected to the canonical map and the second map and executable to localize the second map to the canonical map by matching a first PCF of the second map to a first PCF of the canonical map and matching a second PCF of the second map to a second PCF of the canonical map.
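To make the two-PCF matching concrete, the sketch below recovers a transform from the second map into the canonical map using exactly two matched PCFs. It is a simplified illustration under stated assumptions: coordinates are reduced to 2D, both maps share the same scale, and the matches are exact. The function names `localize_2d` and `apply_transform` are hypothetical and do not come from the source.

```python
import math


def localize_2d(second_a, second_b, canon_a, canon_b):
    """Return (theta, tx, ty) mapping second-map coordinates into the
    canonical map, given two matched PCFs (a and b) in each map."""
    ang_second = math.atan2(second_b[1] - second_a[1], second_b[0] - second_a[0])
    ang_canon = math.atan2(canon_b[1] - canon_a[1], canon_b[0] - canon_a[0])
    theta = ang_canon - ang_second            # rotation aligning the a->b directions
    c, s = math.cos(theta), math.sin(theta)
    # Translation chosen so that the rotated first PCF lands on its match.
    tx = canon_a[0] - (c * second_a[0] - s * second_a[1])
    ty = canon_a[1] - (s * second_a[0] + c * second_a[1])
    return theta, tx, ty


def apply_transform(transform, point):
    """Rotate then translate a second-map point into canonical coordinates."""
    theta, tx, ty = transform
    c, s = math.cos(theta), math.sin(theta)
    return (c * point[0] - s * point[1] + tx, s * point[0] + c * point[1] + ty)


# Second map PCFs at (0, 0) and (1, 0) match canonical PCFs at (2, 3) and (2, 4).
transform = localize_2d((0.0, 0.0), (1.0, 0.0), (2.0, 3.0), (2.0, 4.0))
mapped = apply_transform(transform, (1.0, 0.0))
print(mapped)
```

Once the transform is known, any other point of the second map can be expressed in canonical coordinates, which is what localizing one map to the other amounts to in this simplified setting.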

In some embodiments, the XR system includes a canonical map incorporator connected to the canonical map and the second map and executable to incorporate a third PCF of the canonical map into the second map.

Some embodiments relate to a viewing method including: storing a plurality of canonical maps on a computer-readable medium, each canonical map having a respective canonical map identifier associated with the respective canonical map, the canonical map identifiers differing from one another to uniquely identify the canonical maps; receiving and storing, with a processor connected to the computer-readable medium, a position identifier from an XR device; comparing, with the processor, the position identifier with the canonical map identifiers to determine one or more canonical maps that form a first filtered selection; and transmitting, with the processor, a plurality of the canonical maps to the XR device based on the first filtered selection.

Some embodiments relate to an XR system including: a processor; a computer-readable medium connected to the processor; a multilayer perceptron unit on the computer-readable medium and executable by the processor to receive a plurality of features of an image and convert each feature to a respective string of numbers; and a max pool unit on the computer-readable medium and executable by the processor to combine a maximum value of each string of numbers into a global feature string representing the image.

In some embodiments, the XR system includes: a plurality of canonical maps on the computer-readable medium, each canonical map having at least one of the global feature strings associated therewith; a position detector on the computer-readable medium and executable by the processor to receive, from an XR device, features of an image captured by the XR device that are processed by the multilayer perceptron unit and the max pool unit to determine a global feature string of the image; a key frame filter that compares the global feature string of the image with the global feature strings of the canonical maps to determine one or more canonical maps that form part of a filtered selection; and a map transmitter on the computer-readable medium and executable by the processor to transmit one or more of the canonical maps to the XR device based on the filtered selection.

In some embodiments, the XR system includes: a head unit comprising a head-mountable frame, wherein a real object detection device is mounted to the head-mountable frame; a data channel to receive image data of local content; a local content position system connected to the data channel and executable to relate the local content to one PCF of the canonical map; and a display system connected to the local content position system to display the local content.

In some embodiments, the XR system includes an XR device that includes: a head unit comprising a head-mountable frame, wherein a real object detection device is mounted to the head-mountable frame; a data channel to receive image data of local content; a local content position system connected to the data channel and executable to relate the local content to one PCF of the canonical map; and a display system connected to the local content position system to display the local content, wherein the matching is executed by matching the global feature strings of the second map to the global feature strings of the canonical map.

Some embodiments relate to a viewing method including: receiving, with a processor, a plurality of features of an image; converting, with the processor, each feature to a respective string of numbers; and combining, with the processor, a maximum value of each string of numbers into a global feature string representing the image.
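The three steps above (receive features, convert each to a string of numbers, combine maxima into a global feature string) can be sketched in plain Python. This is an illustration under assumptions, not the patented network: the single ReLU layer and its weights are made up for the example, and the max pool is read here as an element-wise maximum taken across all per-feature strings.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def dense(x, weights, bias):
    """One fully connected layer: each weight row produces one output."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def mlp(feature, layers):
    """Convert one feature into a string of numbers (hypothetical weights)."""
    out = feature
    for weights, bias in layers:
        out = relu(dense(out, weights, bias))
    return out


def global_feature_string(features, layers):
    """Max-pool position by position over every feature's string."""
    strings = [mlp(f, layers) for f in features]
    return [max(column) for column in zip(*strings)]


# One layer mapping 2-element features to 3-element strings (illustrative only).
layers = [([[1, 0], [0, 1], [1, 1]], [0, 0, 0])]
features = [[1, 2], [3, -1]]
descriptor = global_feature_string(features, layers)
print(descriptor)
```

Because the maximum is taken across features, the descriptor has a fixed length regardless of how many features the image contributes, which is what allows it to be compared against stored global feature strings.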

Some embodiments relate to a method of operating a computing system to identify one or more environment maps stored in a database to merge with a tracking map computed based on sensor data collected by a device worn by a user, wherein the device received signals of access points to computer networks while computing the tracking map, the method including: determining at least one area attribute of the tracking map based on characteristics of communications with the access points; determining a geographic location of the tracking map based on the at least one area attribute; identifying a set of environment maps stored in the database corresponding to the determined geographic location; filtering the set of environment maps based on similarity of one or more identifiers of network access points associated with the tracking map and the environment maps of the set; filtering the set of environment maps based on similarity of metrics representing contents of the tracking map and the environment maps of the set; and filtering the set of environment maps based on a degree of match between a portion of the tracking map and portions of the environment maps of the set.

In some embodiments, filtering the set of environment maps based on similarity of the one or more identifiers of the network access points includes retaining, in the set of environment maps, the environment maps with the highest Jaccard similarity with the at least one area attribute of the tracking map based on the one or more identifiers of network access points.
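As an illustration of the Jaccard-based retention step, the sketch below ranks environment maps by the Jaccard similarity (size of intersection divided by size of union) between their access point identifiers and those of the tracking map. The function names, the dictionary layout, and the `keep` cutoff are assumptions introduced for the example.

```python
def jaccard(a, b):
    """Jaccard similarity of two identifier collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retain_most_similar(tracking_ids, env_maps, keep=1):
    """Keep the ids of the `keep` environment maps with the highest similarity.

    env_maps maps a hypothetical map id to the access point identifiers
    recorded for that map.
    """
    ranked = sorted(env_maps.items(),
                    key=lambda item: jaccard(tracking_ids, item[1]),
                    reverse=True)
    return [map_id for map_id, _ in ranked[:keep]]


tracking_ids = {"aa", "bb", "cc"}
env_maps = {
    "m1": {"aa", "bb"},        # similarity 2/3
    "m2": {"cc", "dd", "ee"},  # similarity 1/5
    "m3": {"zz"},              # similarity 0
}
kept = retain_most_similar(tracking_ids, env_maps, keep=2)
print(kept)
```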

In some embodiments, filtering the set of environment maps based on similarity of the metrics representing contents of the tracking map and the environment maps of the set includes retaining, in the set of environment maps, the environment maps with the smallest vector distance between a vector of characteristics of the tracking map and vectors representing the environment maps in the set.

In some embodiments, the metrics representing contents of the tracking map and the environment maps include a vector of values computed from the contents of the maps.
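The minimum-vector-distance retention can be sketched in a few lines. The source does not say which distance is used, so Euclidean distance is an assumption here, as are the function names and the `keep` cutoff.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retain_nearest(tracking_vector, map_vectors, keep=1):
    """Return ids of the `keep` maps whose vectors lie closest to the tracking map's."""
    ranked = sorted(map_vectors,
                    key=lambda map_id: euclidean(tracking_vector, map_vectors[map_id]))
    return ranked[:keep]


map_vectors = {"a": [3.0, 4.0], "b": [1.0, 0.0], "c": [10.0, 10.0]}
nearest = retain_nearest([0.0, 0.0], map_vectors, keep=2)
print(nearest)
```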

In some embodiments, filtering the set of environment maps based on the degree of match between a portion of the tracking map and portions of the environment maps of the set includes: computing a volume of the physical world represented by the tracking map that is also represented in an environment map of the set; and retaining, in the set of environment maps, the environment maps with a larger computed volume than the environment maps filtered out of the set.

In some embodiments, the set of environment maps is filtered first based on the similarity of the one or more identifiers, subsequently based on the similarity of the metrics representing contents, and subsequently based on the degree of match between a portion of the tracking map and portions of the environment maps.

In some embodiments, the filtering of the set of environment maps based on the similarity of the one or more identifiers, the similarity of the metrics representing contents, and the degree of match between a portion of the tracking map and portions of the environment maps is performed in an order based on the processing required to perform the filtering.
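Ordering the filters by required processing means the cheap checks prune the candidate set before the expensive ones run. The sketch below is one possible arrangement; the cost numbers, thresholds, and per-map scores are invented purely for illustration.

```python
def run_filters_cheapest_first(candidates, filters):
    """Apply (cost, name, fn) filters in ascending cost order.

    Returns the surviving candidates and the order in which filters ran.
    """
    order = []
    for _cost, name, fn in sorted(filters, key=lambda f: f[0]):
        candidates = fn(candidates)
        order.append(name)
        if not candidates:
            break  # nothing left for the more expensive stages
    return candidates, order


# Hypothetical precomputed scores for three candidate environment maps.
candidates = [
    {"id": "m1", "id_similarity": 0.8, "metric_distance": 0.2, "overlap": 0.9},
    {"id": "m2", "id_similarity": 0.6, "metric_distance": 0.9, "overlap": 0.9},
    {"id": "m3", "id_similarity": 0.1, "metric_distance": 0.1, "overlap": 0.9},
]
filters = [
    (100, "degree_of_match", lambda ms: [m for m in ms if m["overlap"] > 0.5]),
    (1, "identifier_similarity", lambda ms: [m for m in ms if m["id_similarity"] > 0.3]),
    (10, "content_metric", lambda ms: [m for m in ms if m["metric_distance"] < 0.5]),
]
survivors, order = run_filters_cheapest_first(candidates, filters)
print([m["id"] for m in survivors], order)
```

In this arrangement the most expensive stage, the degree-of-match check, only ever sees the maps that passed the two cheaper stages.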

In some embodiments, an environment map is selected based on the filtering of the set of environment maps based on the similarity of the one or more identifiers, the similarity of the metrics representing contents, and the degree of match between a portion of the tracking map and portions of the environment maps, and information is loaded onto the user device from the selected environment map.

In some embodiments, an environment map is selected based on the filtering of the set of environment maps based on the similarity of the one or more identifiers, the similarity of the metrics representing contents, and the degree of match between a portion of the tracking map and portions of the environment maps, and the tracking map is merged with the selected environment map.

Some embodiments relate to a cloud computing environment for an augmented reality system configured for communication with a plurality of user devices comprising sensors, the cloud computing environment including: a user database storing area identities indicating areas in which the plurality of user devices were used, the area identities comprising parameters of wireless networks detected by the user devices when in use; a map database storing a plurality of environment maps constructed from data supplied by the plurality of user devices and associated metadata, the associated metadata comprising area identities derived from the area identities of the user devices that supplied the data from which the maps were constructed, the area identities comprising parameters of wireless networks detected by the user devices that supplied the data from which the maps were constructed; and a non-transitory computer storage medium storing computer-executable instructions that, when executed by at least one processor in the cloud computing environment: receive messages from the plurality of user devices comprising parameters of wireless networks detected by the user devices; compute area identifiers for the user devices; update the user database based on the received parameters and/or the computed area identifiers; receive requests for environment maps from the plurality of user devices; determine area identifiers associated with the user devices requesting the environment maps; identify sets of environment maps from the map database based, at least in part, on the area identifiers; filter the sets of environment maps; and transmit the filtered sets of environment maps to the user devices, wherein filtering a set of environment maps is based on similarity of the parameters of the wireless networks detected by the user device from which the request for environment maps originated and the parameters of wireless networks in the map database for the environment maps in the set.

In some embodiments, the computer-executable instructions are further configured to, when executed by the at least one processor in the cloud computing environment, receive a tracking map from a user device requesting environment maps, and filtering the set of environment maps is further based on similarity of metrics representing contents of the tracking map and the environment maps of the set.

In some embodiments, the computer-executable instructions are further configured to, when executed by the at least one processor in the cloud computing environment, receive a tracking map from a user device requesting environment maps, and filtering the set of environment maps is further based on a degree of match between a portion of the tracking map and portions of the environment maps of the set.

In some embodiments, the parameters of the wireless networks include basic service set identifiers (BSSIDs) of the networks to which the user devices are connected.

In some embodiments, filtering the set of environment maps based on similarity of the parameters of the wireless networks includes computing the similarity of a plurality of BSSIDs stored in the user database associated with the user device requesting environment maps to the BSSIDs stored in the map database associated with the environment maps of the set.

In some embodiments, the area identifiers indicate geographic locations by longitude and latitude.

In some embodiments, determining the area identifiers includes accessing the area identifiers from the user database.

In some embodiments, determining the area identifiers includes receiving the area identifiers in the messages received from the plurality of user devices.

In some embodiments, the parameters of the wireless networks comply with protocols including Wi-Fi and 5G NR.

In some embodiments, the computer-executable instructions are further configured to, when executed by the at least one processor in the cloud computing environment, receive a tracking map from a user device, and filtering the set of environment maps is further based on a degree of match between a portion of the tracking map and portions of the environment maps of the set.

In some embodiments, the computer-executable instructions are further configured to, when executed by the at least one processor in the cloud computing environment: receive a tracking map from a user device and determine an area identifier associated with the tracking map based on the user device supplying the tracking map; select a second set of environment maps from the map database based, at least in part, on the area identifier associated with the tracking map; and update the map database based on the received tracking map, wherein the updating includes merging the received tracking map with one or more environment maps in the second set of environment maps.

In some embodiments, the computer-executable instructions are further configured to, when executed by the at least one processor in the cloud computing environment, filter the second set of environment maps based on a degree of match between a portion of the received tracking map and portions of the environment maps of the second set, and merging the tracking map with one or more environment maps in the second set of environment maps includes merging the tracking map with one or more environment maps in the filtered second set of environment maps.

Some embodiments relate to an XR system including: a real object detection device that detects a plurality of surfaces of real-world objects; a PCF identification system connected to the real object detection device to generate a map based on the real-world objects; a persistent coordinate frame (PCF) generation system that generates a first PCF based on the map and associates the first PCF with the map; first and second storage mediums on first and second XR devices, respectively; and at least first and second processors of the first and second XR devices that store the first PCF in the first and second storage mediums of the first and second XR devices, respectively.

In some embodiments, the XR system includes: a key frame generator, executable by the at least one processor, to transform a plurality of camera images into a plurality of respective key frames; a persistent pose calculator, executable by the at least one processor, to generate a persistent pose by averaging the plurality of key frames; a tracking map and persistent pose transformer, executable by the at least one processor, to transform a tracking map into the persistent pose to determine the persistent pose at an origin relative to the tracking map; a persistent pose and PCF transformer, executable by the at least one processor, to transform the persistent pose into the first PCF to determine the first PCF relative to the persistent pose; a PCF and image data transformer, executable by the at least one processor, to transform the first PCF into image data; and a display device that displays the image data relative to the first PCF to the user.

In some embodiments, the detection device is a detection device of the first XR device connected to the first XR device processor.

In some embodiments, the map is a first map on the first XR device, and the processor that generates the first map is the first XR device processor of the first XR device.

In some embodiments, the processor that generates the first PCF is the first XR device processor of the first XR device.

In some embodiments, the processor that associates the first PCF with the first map is the first XR device processor of the first XR device.

In some embodiments, the XR system includes: an application executable by the first XR device processor; and a first PCF tracker, executable by the first XR device processor, that includes an on prompt to switch the first PCF tracker on from the application, wherein the first PCF tracker generates the first PCF only if the first PCF tracker is switched on.

In some embodiments, the first PCF tracker has an off prompt to switch the first PCF tracker off from the application, and the first PCF tracker terminates generation of the first PCF when the first PCF tracker is switched off.
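The on and off prompts amount to a simple gate on PCF generation. The sketch below illustrates that behavior only; the class name, method names, and the dictionary used to stand in for a PCF are hypothetical and not taken from the source.

```python
class PCFTracker:
    """Hypothetical tracker that generates PCFs only while switched on."""

    def __init__(self):
        self.enabled = False
        self.pcfs = []

    def on_prompt(self):
        """Called by the application to switch the tracker on."""
        self.enabled = True

    def off_prompt(self):
        """Called by the application to switch the tracker off."""
        self.enabled = False

    def generate_pcf(self, map_id):
        """Create and record a PCF for map_id, or return None while off."""
        if not self.enabled:
            return None
        pcf = {"pcf_id": len(self.pcfs), "map_id": map_id}
        self.pcfs.append(pcf)
        return pcf


tracker = PCFTracker()
before_on = tracker.generate_pcf("map-1")   # tracker is off: nothing generated
tracker.on_prompt()
while_on = tracker.generate_pcf("map-1")    # tracker is on: a PCF is generated
tracker.off_prompt()
after_off = tracker.generate_pcf("map-1")   # generation has stopped again
print(before_on, while_on, after_off)
```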

In some embodiments, the XR system includes: a map publisher, executable by the first XR device processor, to transmit the first PCF to a server; a map storing routine, executable by a server processor of the server, to store the first PCF on a storage device of the server; transmitting, with the server processor of the server, the first PCF to the second XR device; and a map download system, executable by a second XR device processor of the second XR device, to download the first PCF from the server.

In some embodiments, the XR system includes: an application executable by the second XR device processor; and a second PCF tracker, executable by the second XR device processor, that includes an on prompt to switch the second PCF tracker on from the application, wherein the second PCF tracker generates a second PCF only if the second PCF tracker is switched on.

いくつかの実施形態では、第２のＰＣＦ追跡器は、第２のＰＣＦ追跡器をアプリケーションからオフに切り替えるためのオフプロンプトを有し、第２のＰＣＦ追跡器は、第２のＰＣＦ追跡器がオフに切り替えられると、第２のＰＣＦ生成を終了する。 In some embodiments, the second PCF tracker has an off-prompt for switching the second PCF tracker off from the application, and the second PCF tracker is a second PCF tracker. When switched off, the second PCF generation ends.

いくつかの実施形態では、ＸＲシステムは、第２のＸＲデバイスプロセッサによって実行可能であって、第２のＰＣＦをサーバに伝送する、マップ発行器を備える。 In some embodiments, the XR system comprises a map issuer that is executable by a second XR device processor and transmits a second PCF to the server.

いくつかの実施形態では、ＸＲシステムは、第１のＸＲデバイスプロセッサによって実行可能であって、持続的姿勢をサーバからダウンロードする、持続的姿勢入手器と、第１のＸＲデバイスプロセッサによって実行可能であって、持続的姿勢に基づいて、ＰＣＦを第１のＸＲデバイスの第１の記憶デバイスから読み出す、ＰＣＦ確認器と、第１のＸＲデバイスプロセッサによって実行可能であって、第１の記憶デバイスから読み出されたＰＣＦに基づいて、座標フレームを計算する、座標フレーム計算機とを備える。 In some embodiments, the XR system is runnable by a first XR device processor, a persistent posture acquirer that downloads the persistent posture from the server, and a first XR device processor. There is a PCF confirmer that reads the PCF from the first storage device of the first XR device based on a sustained attitude, and can be executed by the first XR device processor from the first storage device. A coordinate frame processor that calculates a coordinate frame based on the read PCF is provided.

Some embodiments relate to a viewing method that includes detecting, with at least one detection device, a plurality of surfaces of real-world objects; generating, with at least one processor, a map based on the real-world objects; generating, with at least one processor, a first PCF based on the map; associating, with at least one processor, the first PCF with the map; and storing, with at least first and second processors of first and second XR devices, the first PCF in first and second storage media of the first and second XR devices, respectively.

In some embodiments, the viewing method includes converting, with at least one processor, a plurality of camera images into a plurality of respective key frames; generating, with at least one processor, a persistent pose by averaging the plurality of key frames; transforming, with at least one processor, a tracking map to the persistent pose to determine the persistent pose at an origin relative to the tracking map; transforming, by at least one processor, the persistent pose to the first PCF to determine the first PCF relative to the persistent pose; transforming, with at least one processor, the first PCF to image data; and displaying, with a display device, the image data to the user relative to the first PCF.

In some embodiments, the viewing method includes executing an application with the first XR device processor; and switching, with the first XR device processor, a first PCF tracker on with an on-prompt from the application, wherein the first PCF tracker generates the first PCF only when the first PCF tracker is switched on.

In some embodiments, the viewing method includes a step in which the first XR device processor may switch the first PCF tracker off with an off-prompt from the application, wherein the first PCF tracker terminates first PCF generation when the first PCF tracker is switched off.

In some embodiments, the viewing method includes transmitting, with the first XR device processor, the first PCF to a server; storing, with a server processor of the server, the first PCF on a storage device of the server; transmitting, with the server processor of the server, the first PCF to a second XR device; and receiving, with a second XR device processor of the second XR device, the first PCF from the server.
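The publish, store, and download sequence described above could look roughly like the following in-memory sketch. The classes `MapServer` and `XrDevice`, their method names, and the tuple used to represent a PCF are assumptions made for illustration; a real system would use a network transport and a richer PCF representation.

```python
from typing import Dict, Tuple

# Hypothetical PCF representation: (identifier, position in the map).
Pcf = Tuple[str, Tuple[float, float, float]]


class MapServer:
    """Stand-in for the server that stores and forwards PCFs."""

    def __init__(self) -> None:
        self._stored: Dict[str, Pcf] = {}

    def store(self, pcf: Pcf) -> None:
        # Map storing routine: keep the PCF on the server's storage.
        self._stored[pcf[0]] = pcf

    def send(self, pcf_id: str) -> Pcf:
        # Transmit a previously stored PCF to a requesting device.
        return self._stored[pcf_id]


class XrDevice:
    """Stand-in for an XR device with local PCF storage."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local_pcfs: Dict[str, Pcf] = {}

    def publish(self, server: MapServer, pcf: Pcf) -> None:
        # Map publisher: transmit the PCF to the server.
        server.store(pcf)

    def download(self, server: MapServer, pcf_id: str) -> Pcf:
        # Map download system: receive the PCF and keep a local copy.
        pcf = server.send(pcf_id)
        self.local_pcfs[pcf_id] = pcf
        return pcf


server = MapServer()
first_device = XrDevice("first")
second_device = XrDevice("second")
first_pcf: Pcf = ("pcf-1", (0.5, 0.0, 2.0))
first_device.publish(server, first_pcf)
received = second_device.download(server, "pcf-1")
```

After these calls the second device holds the same PCF that the first device published, which is what allows both devices to position content relative to a shared frame.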

In some embodiments, the viewing method includes executing an application with the second XR device processor; and switching, with the second XR device processor, a second PCF tracker on with an on-prompt from the application, wherein the second PCF tracker generates a second PCF only when the second PCF tracker is switched on.

In some embodiments, the viewing method includes switching, with the first XR device processor, the second PCF tracker off with an off-prompt from the application, wherein the second PCF tracker terminates second PCF generation when the second PCF tracker is switched off.

In some embodiments, the viewing method includes uploading, with the second XR device processor, the second PCF to the server.

In some embodiments, the viewing method includes determining, with the first XR device processor, persistent poses from the server; retrieving, with the first XR device processor, PCFs from a first storage device of the first XR device based on the persistent poses; and calculating, with the first XR device processor, a coordinate frame based on the PCFs retrieved from the first storage device.

Some embodiments relate to an XR system comprising a first XR device that may include a first XR device processor; a first XR device storage device connected to the first XR device processor; and a set of instructions on the first XR device processor, including a download system, executable by the first XR device processor, that downloads persistent poses from a server; a PCF retriever, executable by the first XR device processor, that retrieves PCFs from a first storage device of the first XR device based on the persistent poses; and a coordinate frame calculator, executable by the first XR device processor, that calculates a coordinate frame based on the PCFs retrieved from the first storage device.

Some embodiments relate to a viewing method that includes downloading, with a first XR device processor of a first XR device, persistent poses from a server; retrieving, with the first XR device processor, PCFs from a first storage device of the first XR device based on the persistent poses; and calculating, with the first XR device processor, a coordinate frame based on the PCFs retrieved from the first storage device.
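One way to picture the three steps above (download persistent poses, retrieve matching PCFs from local storage, calculate a coordinate frame) is the sketch below. It assumes, purely for illustration, that each persistent pose is a 4x4 homogeneous transform in map coordinates, that each locally stored PCF is a transform relative to its persistent pose, and that the coordinate frame is their product. The function names and the dictionary layout are hypothetical.

```python
from typing import Dict, Iterable, List

Matrix = List[List[float]]


def translation(x: float, y: float, z: float) -> Matrix:
    """Build a 4x4 homogeneous transform that only translates."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]


def compose(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 transforms (apply b, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def download_persistent_poses(server: Dict[str, Matrix]) -> Dict[str, Matrix]:
    # Stand-in for a network download: copy poses keyed by identifier.
    return dict(server)


def retrieve_pcfs(local_store: Dict[str, Matrix],
                  pose_ids: Iterable[str]) -> Dict[str, Matrix]:
    # Keep only the locally stored PCFs that match a downloaded pose.
    return {pid: local_store[pid] for pid in pose_ids if pid in local_store}


def calculate_coordinate_frames(poses: Dict[str, Matrix],
                                pcfs: Dict[str, Matrix]) -> Dict[str, Matrix]:
    # Assumed rule: frame in map coordinates = pose transform x PCF transform.
    return {pid: compose(poses[pid], pcfs[pid]) for pid in pcfs}


server_poses = {"pose-a": translation(1.0, 2.0, 0.0)}
device_storage = {"pose-a": translation(0.0, 0.0, 1.0),
                  "pose-b": translation(5.0, 0.0, 0.0)}

poses = download_persistent_poses(server_poses)
pcfs = retrieve_pcfs(device_storage, poses.keys())
frames = calculate_coordinate_frames(poses, pcfs)
```

Here only `pose-a` is known to both the server and the device, so only that PCF yields a coordinate frame; `pose-b` is ignored because no matching persistent pose was downloaded.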

Some embodiments relate to an XR device including a server that may include a server processor; a server storage device connected to the server processor; a map storing routine, executable with the server processor of the server, that stores a first PCF associated with a map on the server storage device of the server; and a map transmitter, executable with the server processor, that transmits the map and the first PCF to a first XR device.

Some embodiments relate to a viewing method that includes storing, with a server processor of a server, a first PCF associated with a map on a server storage device of the server; and transmitting, with the server processor of the server, the map and the first PCF to a first XR device.

Some embodiments relate to a viewing method that includes entering, by a processor of an XR device, tracking of head pose by capturing an environment with a capture device on a head-mounted frame secured to the head of a user and determining an orientation of the head-mounted frame; determining, by the processor, whether head pose is lost due to an inability to determine the orientation of the head-mounted frame; and, if head pose is lost, entering, by the processor, a pose recovery mode to establish the head pose by determining an orientation of the head-mounted frame.

In some embodiments, if the head pose is not lost, the processor enters tracking of head pose.

In some embodiments, pose recovery includes displaying, by the processor, a message to the user with a suggestion for improving the capture of the environment.

In some embodiments, the suggestion is at least one of increasing light and refining texture.

In some embodiments, the viewing method includes determining, by the processor, whether recovery has failed; and, if recovery has failed, starting, by the processor, a new session that includes establishing head pose.

In some embodiments, the viewing method includes displaying, by the processor, a message to the user that a new session will be started.

In some embodiments, the viewing method includes entering, by the processor, tracking of head pose if the head pose is not lost.
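The head pose tracking, pose recovery, and new-session behavior described above can be read as a small state machine. The sketch below is one possible reading; the state names, the `step` function, the message strings, and the `recovery_failed` flag are hypothetical, and how recovery failure is actually detected is not specified here.

```python
from enum import Enum, auto
from typing import Optional, Tuple


class HeadPoseState(Enum):
    TRACKING = auto()
    POSE_RECOVERY = auto()
    NEW_SESSION = auto()


# Illustrative wording only; the text says the suggestion may concern
# increasing light and/or refining texture.
SUGGESTION = "Head pose lost. Try adding light or texture to the scene."
NEW_SESSION_NOTICE = "Recovery failed. A new session will be started."


def step(state: HeadPoseState,
         orientation_found: bool,
         recovery_failed: bool = False) -> Tuple[HeadPoseState, Optional[str]]:
    """Return the next state and an optional message to show the user."""
    if orientation_found:
        # Head pose is established (or was never lost): track normally.
        return HeadPoseState.TRACKING, None
    if state is HeadPoseState.TRACKING:
        # Orientation cannot be determined: head pose is lost.
        return HeadPoseState.POSE_RECOVERY, SUGGESTION
    if state is HeadPoseState.POSE_RECOVERY and recovery_failed:
        # Recovery did not succeed: fall back to a new session.
        return HeadPoseState.NEW_SESSION, NEW_SESSION_NOTICE
    # Otherwise keep trying in the current state.
    message = SUGGESTION if state is HeadPoseState.POSE_RECOVERY else None
    return state, message
```

A caller would invoke `step` once per captured frame, passing whether the orientation of the head-mounted frame could be determined for that frame, and display any returned message to the user.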

Some embodiments relate to a method of operating a computing system to render a virtual object in a scene comprising one or more physical objects. The method includes capturing a plurality of images about the scene from one or more sensors of a first device worn by a user; computing one or more persistent poses based, at least in part, on the plurality of images; and generating a persistent coordinate frame based, at least in part, on the computed one or more persistent poses, such that information of the plurality of images may be accessed at different times, via the persistent coordinate frame, by one or more applications running on the first device and/or a second device.

In some embodiments, computing the one or more persistent poses based, at least in part, on the plurality of images includes extracting one or more features from each of the plurality of images; generating a descriptor for each of the one or more features; generating a key frame for each of the plurality of images based, at least in part, on the descriptors; and generating the one or more persistent poses based, at least in part, on the one or more key frames.

In some embodiments, generating the persistent coordinate frame based, at least in part, on the computed one or more persistent poses includes generating the persistent coordinate frame when the first device travels a predetermined distance from the location of a previous persistent coordinate frame.

In some embodiments, the predetermined distance is between 2 and 20 meters and is based on both the consumption of computational resources of the device and the placement error of the virtual object.

In some embodiments, the method includes generating an initial persistent pose when the first device is powered on; and generating a first persistent pose at the current location of the first device when the first device reaches the perimeter of a circle with the initial persistent pose as its center and a radius equal to a threshold distance.

In some embodiments, the circle is a first circle. The method further includes generating a second persistent pose at the current location of the first device when the device reaches the perimeter of a second circle with the first persistent pose as its center and a radius equal to twice the threshold distance.

In some embodiments, the first persistent pose is not generated when the first device finds an existing persistent pose within the threshold distance from the initial persistent pose.

In some embodiments, the first device attaches to the first persistent pose one or more of a plurality of key frames that are within a predetermined distance of the first persistent pose.

In some embodiments, the first persistent pose is not generated unless an application running on the first device requests persistent poses.
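The distance-based generation of persistent poses described above can be approximated with the sketch below. It is a deliberate simplification and not the claimed method: poses are reduced to 2D positions, a new pose is created whenever the device is at least a threshold distance from every existing pose (which folds the "existing pose nearby" check and the circle test into one rule), and the second circle with twice the radius is not modeled. The class name, its fields, and the 3-meter threshold are hypothetical; the threshold was merely chosen inside the 2 to 20 meter range mentioned above.

```python
import math
from typing import List, Optional, Tuple

Position = Tuple[float, float]


class PersistentPoseGenerator:
    """Hypothetical generator that spaces persistent poses by distance."""

    def __init__(self, threshold_m: float, start: Position,
                 app_requests_poses: bool = True) -> None:
        self.threshold_m = threshold_m
        self.app_requests_poses = app_requests_poses
        # Initial persistent pose, created when the device powers on.
        self.poses: List[Position] = [start]

    def update(self, position: Position) -> Optional[Position]:
        """Return a newly created pose position, or None if none is created."""
        if not self.app_requests_poses:
            # No application has asked for persistent poses.
            return None
        if any(math.dist(position, p) < self.threshold_m for p in self.poses):
            # An existing pose is still within the threshold distance.
            return None
        self.poses.append(position)
        return position


generator = PersistentPoseGenerator(threshold_m=3.0, start=(0.0, 0.0))
```

Feeding successive device positions into `update` yields a new pose only once the device has moved a full threshold distance away from all known poses, which keeps the number of poses (and the associated computation) bounded in a small space.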

Some embodiments relate to an electronic system portable by a user. The electronic system includes one or more sensors configured to capture images about one or more physical objects in a scene; an application configured to execute computer-executable instructions to render virtual content in the scene; and at least one processor configured to execute computer-executable instructions to provide image data about the virtual content to the application, wherein the computer-executable instructions comprise instructions for generating a persistent coordinate frame based, at least in part, on the captured images.

The above-described embodiments of the present disclosure can be implemented in any of numerous ways. For example, the embodiments may be implemented using hardware, software, or a combination thereof. When implemented in software, the software code can be executed on any suitable processor or collection of processors, whether provided in a single computer or distributed among multiple computers. Such processors may be implemented as integrated circuits, with one or more processors in an integrated circuit component, including commercially available integrated circuit components known in the art by names such as CPU chips, GPU chips, microprocessors, microcontrollers, or co-processors. In some embodiments, a processor may be implemented in custom circuitry, such as an ASIC, or in semi-custom circuitry resulting from configuring a programmable logic device. As a further alternative, a processor may be a portion of a larger circuit or semiconductor device, whether commercially available, semi-custom, or custom. As a specific example, some commercially available microprocessors have multiple cores such that one or a subset of those cores may constitute a processor. However, a processor may be implemented using circuitry in any suitable format.

Further, it should be appreciated that a computer may be embodied in any of a number of forms, such as a rack-mounted computer, a desktop computer, a laptop computer, or a tablet computer. Additionally, a computer may be embedded in a device not generally regarded as a computer but with suitable processing capabilities, including a personal digital assistant (PDA), a smartphone, or any other suitable portable or fixed electronic device.

Also, a computer may have one or more input and output devices. These devices can be used, among other things, to present a user interface. Examples of output devices that can be used to provide a user interface include printers or display screens for visual presentation of output, and speakers or other sound-generating devices for audible presentation of output. Examples of input devices that can be used for a user interface include keyboards and pointing devices, such as mice, touch pads, and digitizing tablets. As another example, a computer may receive input information through speech recognition or in another audible format. In the embodiment illustrated, the input/output devices are illustrated as physically separate from the computing device. In some embodiments, however, the input and/or output devices may be physically integrated into the same unit as the processor or other elements of the computing device. For example, a keyboard might be implemented as a soft keyboard on a touch screen. In some embodiments, the input/output devices may be entirely disconnected from the computing device and functionally integrated through a wireless connection.

Such computers may be interconnected by one or more networks in any suitable form, including as a local area network or a wide area network, such as an enterprise network or the Internet. Such networks may be based on any suitable technology, may operate according to any suitable protocol, and may include wireless networks, wired networks, or fiber optic networks.

Also, the various methods or processes outlined herein may be coded as software that is executable on one or more processors that employ any one of a variety of operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages and/or programming or scripting tools, and may also be compiled as executable machine language code or intermediate code that is executed on a framework or virtual machine.

In this respect, the disclosure may be embodied as a computer-readable storage medium (or multiple computer-readable media) (e.g., a computer memory, one or more floppy® disks, compact discs (CD), optical discs, digital video disks (DVD), magnetic tapes, flash memories, circuit configurations in field programmable gate arrays or other semiconductor devices, or other tangible computer storage medium) encoded with one or more programs that, when executed on one or more computers or other processors, perform methods that implement the various embodiments of the disclosure discussed above. As is apparent from the foregoing examples, a computer-readable storage medium may retain information for a sufficient time to provide computer-executable instructions in a non-transitory form. Such a computer-readable storage medium or media can be transportable, such that the program or programs stored thereon can be loaded onto one or more different computers or other processors to implement various aspects of the present disclosure as discussed above. As used herein, the term "computer-readable storage medium" encompasses only a computer-readable medium that can be considered to be a manufacture (i.e., an article of manufacture) or a machine. In some embodiments, the disclosure may be embodied as a computer-readable medium other than a computer-readable storage medium, such as a propagating signal.

The terms "program" or "software" are used herein in a generic sense to refer to any type of computer code or set of computer-executable instructions that can be employed to program a computer or other processor to implement various aspects of the present disclosure as discussed above. Additionally, it should be appreciated that, according to one aspect of this embodiment, one or more computer programs that, when executed, perform methods of the present disclosure need not reside on a single computer or processor, but may be distributed in a modular fashion among a number of different computers or processors to implement various aspects of the present disclosure.

Computer-executable instructions may be in many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.

Also, data structures may be stored in computer-readable media in any suitable form. For simplicity of illustration, data structures may be shown to have fields that are related through location in the data structure. Such relationships may likewise be achieved by assigning storage for the fields with locations in a computer-readable medium that convey the relationship between the fields. However, any suitable mechanism may be used to establish a relationship between information in fields of a data structure, including through the use of pointers, tags, or other mechanisms that establish relationships between data elements.

Various aspects of the present disclosure may be used alone, in combination, or in a variety of arrangements not specifically discussed in the embodiments described in the foregoing, and the disclosure is therefore not limited in its application to the details and arrangement of components set forth in the foregoing description or illustrated in the drawings. For example, aspects described in one embodiment may be combined in any manner with aspects described in other embodiments.

Also, the disclosure may be embodied as a method, of which an example has been provided. The acts performed as part of the method may be ordered in any suitable way. Accordingly, embodiments may be constructed in which acts are performed in an order different than illustrated, which may include performing some acts simultaneously, even though shown as sequential acts in illustrative embodiments.

Use of ordinal terms such as "first," "second," "third," etc., in the claims to modify a claim element does not by itself connote any priority, precedence, or order of one claim element over another, or the temporal order in which acts of a method are performed; such terms are used merely as labels to distinguish one claim element having a certain name from another element having the same name (but for use of the ordinal term).

Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," "having," "containing," "involving," and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items.

Claims

An electronic system comprising:
one or more sensors configured to capture information about a three-dimensional (3D) environment, the captured information comprising a plurality of images; and
at least one processor configured to execute computer-executable instructions to generate a map of at least a portion of the 3D environment based on the plurality of images, wherein the computer-executable instructions further comprise instructions for:
identifying a plurality of features in the plurality of images;
selecting a plurality of key frames from among the plurality of images based, at least in part, on the plurality of features of the selected key frames;
generating one or more coordinate frames based, at least in part, on the identified features of the selected key frames; and
storing, in association with the map of the 3D environment, the one or more coordinate frames as one or more persistent coordinate frames.

The electronic system of claim 1, wherein:
the one or more sensors comprise a plurality of pixel circuits arranged in a two-dimensional array such that each image of the plurality of images comprises a plurality of pixels; and
each feature corresponds to a plurality of pixels.

The electronic system of claim 1, wherein identifying a plurality of features in the plurality of images comprises selecting, as the identified features, a number of groups of pixels that is less than a predetermined maximum, based on a measure of similarity to groups of pixels depicting portions of persistent objects.

The electronic system of claim 1, wherein storing the one or more coordinate frames comprises storing, for each of the one or more coordinate frames:
a descriptor representing at least a subset of the features in the selected key frame from which the coordinate frame was generated.

The electronic system of claim 1, wherein storing the one or more coordinate frames comprises storing, for each of the one or more coordinate frames:
at least a subset of the features in the selected key frame from which the coordinate frame was generated.

The electronic system of claim 1, wherein storing the one or more coordinate frames comprises storing, for each of the one or more coordinate frames:
a transformation between a coordinate frame of the map of the 3D environment and the persistent coordinate frame; and
geographic information indicating a location, within the 3D environment, of the selected key frame from which the coordinate frame was generated.

The electronic system of claim 6, wherein the geographic information comprises a WiFi fingerprint at the location.
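For illustration, the items that the preceding claims recite as stored with each persistent coordinate frame (a descriptor, a subset of features, a transformation to the map frame, and geographic information such as a WiFi fingerprint) could be held in a record like the one below. The field names, the 4x4 row-major matrix layout, and the JSON encoding are assumptions made for this sketch.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class StoredPCF:
    pcf_id: str
    descriptor: list        # descriptor summarising the keyframe's features
    features: list          # subset of keyframe features, e.g. [x, y] pixel positions
    map_from_pcf: list      # assumed 4x4 row-major transform, PCF frame to map frame
    wifi_fingerprint: dict  # assumed form: access point identifier -> signal strength (dBm)


def to_json(pcf):
    """Serialise a record so it can be stored alongside the map."""
    return json.dumps(asdict(pcf), sort_keys=True)


def from_json(text):
    """Rebuild a record from its stored form."""
    return StoredPCF(**json.loads(text))


IDENTITY = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

example = StoredPCF(
    pcf_id="pcf-1",
    descriptor=[0.25, 0.5, 0.75],
    features=[[12, 40], [98, 17]],
    map_from_pcf=IDENTITY,
    wifi_fingerprint={"ap-1": -48, "ap-2": -71},
)
```

Keeping the fingerprint in the same record as the transformation lets a lookup by location return everything needed to reuse the frame.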

The electronic system of claim 1, wherein the computer-executable instructions comprise instructions for computing feature descriptors for individual features using an artificial neural network.

The electronic system of claim 8, wherein:
the artificial neural network is a first artificial neural network; and
the computer-executable instructions comprise instructions for implementing a second artificial neural network configured to compute a frame descriptor representing a keyframe based, at least in part, on the feature descriptors computed for the identified features in the keyframe.

The electronic system of claim 1, wherein the computer-executable instructions further comprise:
an application programming interface configured to provide, to an application executing on the portable electronic system, information characterizing a persistent coordinate frame of the one or more persistent coordinate frames; and
instructions for:
refining the map of the 3D environment based on a second plurality of images;
adjusting one or more of the persistent coordinate frames based, at least in part, on the second plurality of images; and
providing, through the application programming interface, notification of the adjusted persistent coordinate frames.

The electronic system of claim 10, wherein adjusting the one or more persistent coordinate frames comprises adjusting the translation and rotation of the one or more persistent coordinate frames relative to the origin of the map of the 3D environment.
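One way an application programming interface could expose persistent coordinate frames and report adjustments is a simple subscribe-and-notify service, sketched below. The class `PCFService`, its method names, and the reduction of rotation to a single yaw angle in degrees are assumptions made for this sketch; the claims do not specify an interface.

```python
class PCFService:
    """Hypothetical interface: applications read PCF poses and are
    notified when a pose is adjusted relative to the map origin."""

    def __init__(self):
        self._poses = {}      # pcf_id -> ((x, y, z), yaw_deg)
        self._listeners = []  # callbacks taking (pcf_id, new_pose)

    def subscribe(self, callback):
        self._listeners.append(callback)

    def add(self, pcf_id, translation, yaw_deg):
        self._poses[pcf_id] = (tuple(translation), float(yaw_deg))

    def get(self, pcf_id):
        return self._poses[pcf_id]

    def adjust(self, pcf_id, d_translation, d_yaw_deg):
        """Apply a translation and rotation correction, then notify."""
        (x, y, z), yaw = self._poses[pcf_id]
        dx, dy, dz = d_translation
        new_pose = ((x + dx, y + dy, z + dz), (yaw + d_yaw_deg) % 360.0)
        self._poses[pcf_id] = new_pose
        for callback in self._listeners:
            callback(pcf_id, new_pose)
        return new_pose
```

An application that placed content relative to a frame would typically re-read the pose inside its callback and reposition that content.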

The electronic system of claim 11, wherein:
the electronic system comprises a wearable device, and the one or more sensors are mounted on the wearable device;
the map is a tracking map computed on the wearable device; and
the origin of the map is determined based on the location at which the device is powered on.

The electronic system of claim 1, wherein:
the electronic system comprises a wearable device, and the one or more sensors are mounted on the wearable device; and
the computer-executable instructions further comprise instructions for:
tracking motion of the portable device; and
controlling the timing of execution of the instructions for generating one or more coordinate frames and/or the instructions for storing one or more persistent coordinate frames, based on the tracked motion indicating motion of the wearable device exceeding a threshold distance, wherein the threshold distance is 2 to 20 meters.
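The distance-based timing described above can be illustrated with a small trigger that fires once the device has moved farther than a threshold from the position of the last trigger. The class name `PCFTrigger`, the default of 5 meters, and the choice to fire on the very first position are assumptions made for this sketch; only the 2 to 20 meter range comes from the claim.

```python
import math


class PCFTrigger:
    """Signals when a new coordinate frame should be generated or stored."""

    def __init__(self, threshold_m=5.0):
        # The claim recites a threshold distance of 2 to 20 meters.
        if not 2.0 <= threshold_m <= 20.0:
            raise ValueError("threshold must be between 2 and 20 meters")
        self.threshold_m = threshold_m
        self._last = None  # position at which the trigger last fired

    def update(self, position):
        """Return True if the tracked motion exceeds the threshold."""
        if self._last is None or math.dist(position, self._last) > self.threshold_m:
            self._last = tuple(position)
            return True
        return False
```

Measuring from the last trigger position, rather than accumulating path length, means that pacing back and forth within a small area does not create new frames.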

A method of operating an electronic system to render virtual content in a 3D environment, the electronic system comprising a portable device, the method comprising, with one or more processors:
maintaining, on the portable device, a coordinate frame local to the portable device based on the output of one or more sensors on the portable device;
obtaining a stored coordinate frame from stored spatial information about the 3D environment;
computing a transformation between the coordinate frame local to the portable device and the obtained stored coordinate frame;
receiving a specification of a virtual object having a coordinate frame local to the virtual object, and a location of the virtual object with respect to the selected stored coordinate frame; and
rendering the virtual object on a display of the portable device at a location determined, at least in part, based on the computed transformation and the received location of the virtual object.
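The coordinate arithmetic behind the method above can be sketched with rigid transforms. The sketch assumes that both the device-local frame and the stored coordinate frame have known poses in a common map frame (for example, after localization), and it restricts rotation to yaw about the z axis to stay short. All function names are hypothetical.

```python
import math


def make_transform(yaw_deg, translation):
    """4x4 rigid transform: rotation about z by yaw_deg, then translation."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    tx, ty, tz = translation
    return [
        [c, -s, 0.0, tx],
        [s, c, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(a, b):
    """Product of two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(t):
    """Inverse of a rigid transform: rotation R^T, translation -R^T t."""
    rot_t = [[t[j][i] for j in range(3)] for i in range(3)]
    trans = [-sum(rot_t[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rot_t[i] + [trans[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def apply(t, point):
    """Apply a 4x4 transform to a 3D point."""
    x, y, z = point
    return tuple(t[i][0] * x + t[i][1] * y + t[i][2] * z + t[i][3]
                 for i in range(3))


def object_in_local_frame(map_from_local, map_from_pcf, object_in_pcf):
    """Compute the local-from-PCF transformation and use it to express
    the virtual object's location in the device-local frame."""
    local_from_pcf = matmul(invert_rigid(map_from_local), map_from_pcf)
    return apply(local_from_pcf, object_in_pcf)
```

Because the object's location is given relative to the stored frame, two devices with different local frames resolve it to the same physical point once each has computed its own transformation.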

The method of claim 14, wherein obtaining the stored coordinate frame comprises obtaining the coordinate frame through an application programming interface (API).

The method of claim 14, wherein:
the portable device is a first portable device comprising a first processor of the one or more processors;
the system further comprises a second portable device comprising a second processor of the one or more processors; and
the processor on each of the first and second portable devices:
obtains the same stored coordinate frame;
computes a transformation between a coordinate frame local to the respective device and the obtained same stored coordinate frame;
receives the specification of the virtual object; and
renders the virtual object on a respective display.

The method of claim 16, wherein each of the first and second portable devices comprises:
a camera configured to output a plurality of camera images;
a keyframe generator configured to convert the plurality of camera images into a plurality of keyframes;
a persistent pose calculator configured to generate a persistent pose by averaging the plurality of keyframes;
a tracking map and persistent pose transformer configured to transform a tracking map to the persistent pose so as to determine the persistent pose relative to an origin of the tracking map;
a persistent pose and persistent coordinate frame (PCF) transformer configured to transform the persistent pose to a PCF; and
a map publisher configured to transmit spatial information, including the PCF, to a server.
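As a rough illustration of the chain recited above (keyframes, averaged pose, pose relative to the tracking map origin, PCF), the sketch below averages keyframe positions only. Treating a keyframe as a bare (x, y, z) position, ignoring orientation, and the dictionary layout of the resulting PCF are all simplifying assumptions made for this sketch.

```python
def average_keyframes(keyframe_positions):
    """Pose calculator stand-in: mean of the keyframe positions."""
    if not keyframe_positions:
        raise ValueError("at least one keyframe is required")
    n = len(keyframe_positions)
    return tuple(sum(p[i] for p in keyframe_positions) / n for i in range(3))


def relative_to_origin(pose, map_origin):
    """Express a pose relative to the origin of the tracking map."""
    return tuple(pose[i] - map_origin[i] for i in range(3))


def pose_to_pcf(pcf_id, pose_in_map):
    """Wrap the pose as a PCF record ready to be sent to a server."""
    return {"pcf_id": pcf_id, "position": list(pose_in_map)}


def build_pcf(pcf_id, keyframe_positions, map_origin=(0.0, 0.0, 0.0)):
    pose = average_keyframes(keyframe_positions)
    return pose_to_pcf(pcf_id, relative_to_origin(pose, map_origin))
```

A fuller treatment would also average orientations, which cannot be done component by component in the same way as positions.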

The method of claim 16, further comprising executing an application to generate the specification of the virtual object and the location of the virtual object with respect to the selected stored coordinate frame.

The method of claim 16, wherein:
maintaining, on the portable device, a coordinate frame local to the portable device comprises, for each of the first and second portable devices:
capturing a plurality of images of the 3D environment from one or more sensors on the portable device;
computing one or more persistent poses based, at least in part, on the plurality of images; and
generating spatial information about the 3D environment based, at least in part, on the computed one or more persistent poses;
the method further comprises, for each of the first and second portable devices, transmitting the generated spatial information to a remote server; and
obtaining the stored coordinate frame comprises receiving the stored coordinate frame from the remote server.

The method of claim 19, wherein computing the one or more persistent poses based, at least in part, on the plurality of images comprises:
extracting one or more features from each of the plurality of images;
generating a descriptor for each of the one or more features;
generating a keyframe for each of the plurality of images based, at least in part, on the descriptors; and
generating the one or more persistent poses based, at least in part, on the one or more keyframes.

The method of claim 20, wherein generating the one or more persistent poses comprises selectively generating a persistent pose based on the portable device having traveled a predetermined distance from the location of another persistent pose.

The method of claim 16, wherein each of the first and second portable devices comprises a download system configured to download the stored coordinate frame from a server.

An electronic system for maintaining persistent spatial information about a 3D environment for rendering virtual content on each of a plurality of portable devices, the electronic system comprising a networked computing device comprising:
at least one processor;
at least one storage device connected to the processor;
a map storing routine, executable with the at least one processor, to receive a plurality of maps from portable devices of the plurality of portable devices and to store map information on the at least one storage device, wherein each of the received maps comprises at least one coordinate frame; and
a map transmitter, executable with the at least one processor, to:
receive location information from a portable device of the plurality of portable devices;
select one or more maps from among the stored maps; and
transmit, to the portable device of the plurality of portable devices, information from the selected one or more maps, wherein the transmitted information comprises a coordinate frame of a map of the selected one or more maps.
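The map storing routine and map transmitter recited above might be approximated, for illustration, by the in-memory server below. Representing location information as a set of observed WiFi access point identifiers, and ranking maps by how many identifiers they share with the device, are assumptions made for this sketch; the class and method names are hypothetical.

```python
class MapServer:
    def __init__(self):
        self._maps = []  # stands in for the storage device

    def store_map(self, map_id, access_points, coordinate_frames):
        """Map storing routine: each received map needs a coordinate frame."""
        if not coordinate_frames:
            raise ValueError("a map must carry at least one coordinate frame")
        self._maps.append({
            "map_id": map_id,
            "access_points": set(access_points),
            "coordinate_frames": list(coordinate_frames),
        })

    def select_maps(self, observed_access_points, limit=1):
        """Pick the stored maps that best match the reported location."""
        observed = set(observed_access_points)
        scored = [(len(m["access_points"] & observed), m) for m in self._maps]
        matches = [pair for pair in scored if pair[0] > 0]
        matches.sort(key=lambda pair: -pair[0])
        return [m for _, m in matches[:limit]]

    def transmit(self, observed_access_points, limit=1):
        """Map transmitter: return coordinate frames of the selected maps."""
        return [
            {"map_id": m["map_id"], "coordinate_frames": m["coordinate_frames"]}
            for m in self.select_maps(observed_access_points, limit)
        ]
```

A device that receives no match would fall back to building its own map before it can share coordinate frames with others.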

The electronic system of claim 23, wherein the coordinate frame comprises a computer data structure comprising a coordinate frame that includes information characterizing a plurality of features of objects in the 3D environment.

The electronic system of claim 23, wherein the information characterizing the plurality of features comprises a descriptor characterizing a region of the 3D environment.

The electronic system of claim 23, wherein each coordinate frame of the at least one coordinate frame comprises a persistent point characterized by features detected in sensor data representing the 3D environment.

The electronic system of claim 26, wherein each coordinate frame of the at least one coordinate frame comprises a persistent pose.

The electronic system of claim 26, wherein each coordinate frame of the at least one coordinate frame comprises a persistent coordinate frame.