JP2022522504A

JP2022522504A - Image depth map processing

Info

Publication number: JP2022522504A
Application number: JP2021551897A
Authority: JP
Inventors: クリスティアーンファレカムプ; ヘーストバルトロメウスウィルヘルムスダミアヌスファン
Original assignee: Koninklijke Philips NV
Current assignee: Koninklijke Philips NV
Priority date: 2019-03-05
Filing date: 2020-03-03
Publication date: 2022-04-19
Also published as: CN113795863A; EP3935602A1; CA3131980A1; US20220148207A1; KR20210134956A; TW202101374A; EP3706070A1; WO2020178289A1

Abstract

A method of processing depth maps comprises receiving (301) images and corresponding depth maps. Depth values of a first depth map of the corresponding depth maps are updated (303) based on depth values of at least a second depth map of the corresponding depth maps. The updating is based on a weighted combination of candidate depth values determined from other maps. A weight for a candidate depth value from the second depth map is determined based on the similarity between a pixel value in the first image corresponding to the depth being updated and a pixel value in a third image at a position determined by projecting the position of the depth value being updated to the third image using the candidate depth value. More consistent depth maps may be generated in this way.

Description

The invention relates to processing of depth maps for images and in particular, but not exclusively, to processing of depth maps supporting view synthesis for a virtual reality application.

In recent years, the variety and range of image and video applications have increased substantially, with new services and ways of utilizing and consuming video being continuously developed and introduced.

For example, one service of growing popularity is the provision of image sequences in such a way that the viewer is able to actively interact with the system to change parameters of the rendering. A very appealing feature in many applications is the ability to change the effective viewing position and viewing direction of the viewer, for example allowing the viewer to move and "look around" in the scene being presented.

Such a feature can specifically allow a virtual reality experience to be provided to a user. This may allow the user to, for example, move (relatively) freely in a virtual environment and dynamically change their position and where they are looking. Typically, such virtual reality applications are based on a three-dimensional model of the scene, with the model being dynamically evaluated to provide the specific requested view. This approach is well known from, for example, game applications such as first-person shooters for computers and consoles.

It is also desirable, in particular for virtual reality applications, that the image being presented is a three-dimensional image. Indeed, in order to optimize the immersion of the viewer, it is typically preferred for the user to experience the presented scene as a three-dimensional scene. A virtual reality experience should preferably allow a user to select their own position, camera viewpoint, and moment in time relative to a virtual world.

Many virtual reality applications are based on a predetermined model of the scene, and typically on an artificial model of a virtual world. It is often desirable for a virtual reality experience to be provided based on real-world capture.

In many systems, such as specifically those based on a real-world scene, an image representation of the scene is provided, where the image representation includes images and depth for one or more capture points/viewpoints in the scene. An image plus depth representation provides a very efficient characterization of, in particular, a real-world scene: the characterization is not only relatively easy to generate by capturing the real-world scene, but is also highly suitable for a renderer synthesizing views for viewpoints other than those captured. For example, a renderer may be arranged to dynamically generate views that match a current local viewer pose. A viewer pose may be dynamically determined, and views may be dynamically generated to match this viewer pose based on the images and, for example, the depth maps provided.

In many practical systems, a calibrated multi-view camera rig may be used to allow playback for a user that takes different perspectives relative to the captured scene. Applications include selecting an individual viewpoint during a sports match, or playing back a captured 3D scene on an augmented or virtual reality headset.

"You Yang ET AL., Cross-View Multi-Lateral Filter for Compressed MultiView Depth Video", IEEE TRANSACTIONS ON IMAGE PROCESSING, vol. 28, no. 1, 1 January 2019 (2019-01-01), pages 302-315, XP055614403, US ISSN: 1057-7149, DOI: 10.1109/TIP.2018.2867740, discloses a cross-view multi-lateral filtering scheme to improve the quality of compressed depth maps/videos within the framework of asymmetric multi-view video with depth compression. Through this scheme, a distorted depth map is enhanced via non-local candidates selected from the current and neighboring viewpoints of different time slots. Specifically, these candidates are clustered into a macro super pixel denoting the physical and semantic cross-relationships of the cross-view, spatial and temporal priors.

"WOLFF KATJA ET AL., Point Cloud Noise and Outlier Removal for ImageBased 3D Reconstruction", 2016 FOURTH INTERNATIONAL CONFERENCE ON 3D VISION (3DV), IEEE, 25 October 2016 (2016-10-25), pages 118-127, XP033027617, DOI: 10.1109/3DV.2016.20, discloses an algorithm that uses the input images and corresponding depth maps to remove pixels which are geometrically or photometrically inconsistent with the colored surface implied by the input. This allows standard surface reconstruction methods to perform less smoothing and thus achieve higher quality.

In order to provide a smooth transition between the discrete captured viewpoints, and some extrapolation beyond the captured viewpoints, depth maps are often provided and used to predict/synthesize the view from these other viewpoints.

Depth maps are typically generated using (multi-view) stereo matching between the capturing cameras, or more directly by using depth sensors (structured light or time-of-flight based). However, such depth maps obtained from a depth sensor or a disparity estimation process inherently have errors and inaccuracies that may result in errors in the synthesized views. This degrades the experience of the viewer.

Hence, an improved approach for generating and processing depth maps would be advantageous. In particular, a system and/or approach that allows improved operation, increased flexibility, an improved virtual reality experience, reduced complexity, facilitated implementation, improved depth maps, increased synthesized image quality, improved rendering, an improved user experience and/or improved performance and/or operation would be advantageous.

Accordingly, the invention seeks to preferably mitigate, alleviate or eliminate one or more of the above-mentioned disadvantages singly or in any combination.

According to an aspect of the invention, there is provided a method of processing depth maps, the method comprising: receiving a plurality of images and corresponding depth maps representing a scene from different view poses; and updating depth values of a first depth map of the corresponding depth maps based on depth values of at least a second depth map of the corresponding depth maps, the first depth map being for a first image and the second depth map being for a second image. The updating comprises: determining a first candidate depth value for a first depth pixel of the first depth map at a first depth map position in the first depth map, the first candidate depth value being determined in response to at least one second depth value of a second depth pixel of the second depth map at a second depth map position in the second depth map; and determining a first depth value for the first depth pixel by a weighted combination of a plurality of candidate depth values for the first depth map position, the weighted combination including the first candidate depth value weighted by a first weight. Determining the first depth value comprises: determining a first image position in the first image for the first depth map position; determining a third image position in a third image of the plurality of images, the third image position corresponding to a projection of the first image position to the third image based on the first candidate depth value; determining a first match error indication indicative of a difference between an image pixel value in the third image for the third image position and an image pixel value in the first image for the first image position; and determining the first weight in response to the first match error indication.
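
The weighted combination described in this aspect can be illustrated with a minimal sketch. The source does not specify how a match error is mapped to a weight, so the exponential mapping, the `sigma` parameter, the mean-absolute-difference error and all function names below are illustrative assumptions. The third-image pixels are taken as already looked up at the positions obtained by projecting the first image position with each candidate depth value.

```python
import math


def match_error(pixel_a, pixel_b):
    """Mean absolute difference between two RGB pixel values (assumed error metric)."""
    return sum(abs(a - b) for a, b in zip(pixel_a, pixel_b)) / len(pixel_a)


def weight_from_error(error, sigma=10.0):
    """Map a match error to a weight in (0, 1]; a smaller error gives a larger weight.
    The exponential form and sigma are assumptions, not taken from the source."""
    return math.exp(-error / sigma)


def combine_candidates(candidates, weights):
    """Weighted combination (normalised weighted average) of candidate depth values."""
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights are zero")
    return sum(w * d for w, d in zip(weights, candidates)) / total


# Pixel value in the first image at the position being updated.
first_pixel = (120, 80, 60)
# Candidate depth values, and for each one the third-image pixel found at the
# position reached by projecting the first image position with that candidate.
candidates = [2.0, 2.1, 5.0]
third_pixels = [(121, 79, 61), (118, 82, 60), (30, 200, 10)]

weights = [weight_from_error(match_error(first_pixel, p)) for p in third_pixels]
updated_depth = combine_candidates(candidates, weights)
```

In this example the third candidate lands on a very different pixel in the third image, so its weight is close to zero and the updated depth stays near the two consistent candidates.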

The approach may provide improved depth maps in many embodiments, and may in particular provide a set of depth maps with increased consistency. The approach may allow improved view consistency when images are synthesized based on the images and the updated depth maps.

The inventors have had the insight that inconsistencies between depth maps are often more perceivable than errors or noise that are consistent between depth maps, and that the specific approach may provide more consistent updated depth maps. The method may be used as a depth refinement algorithm that improves the quality of the depth maps for a set of multi-view images of a scene.

The approach may facilitate implementation in many embodiments, and may be implemented with relatively low complexity and resource requirements.

A position in an image may directly correspond to a position in the corresponding depth map, and vice versa. There may be a one-to-one correspondence between a position in an image and a position in the corresponding depth map. In many embodiments, the pixel positions may be the same in the image and the corresponding depth map, and the corresponding depth map may comprise one pixel for each pixel in the image.

In some embodiments, the weights may be binary (e.g. one or zero), and the weighted combination may be a selection.
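
As a minimal sketch of the binary case, the weighted combination reduces to picking one candidate. The source only states that weights may be binary; choosing the candidate with the lowest match error, and the function name, are assumptions made for illustration.

```python
def select_best_candidate(candidates, errors):
    """Assign weight 1 to the candidate with the lowest match error and 0 to all
    others, so that the weighted combination becomes a selection."""
    best = min(range(len(candidates)), key=lambda i: errors[i])
    weights = [1 if i == best else 0 for i in range(len(candidates))]
    depth = sum(w * d for w, d in zip(weights, candidates))
    return depth, weights


selected_depth, binary_weights = select_best_candidate([2.0, 2.1, 5.0], [1.0, 0.5, 80.0])
```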

It will be appreciated that the term projection may often refer to the projection of three-dimensional spatial coordinates in the scene to two-dimensional image coordinates (u, v) in an image or depth map. However, projection may also refer to a mapping between two-dimensional image coordinates (u, v) of a scene point from one image or depth map to another, i.e. from one set of image coordinates (u₁, v₁) for one pose to another set of image coordinates (u₂, v₂) for another pose. Such a projection between image coordinates for images corresponding to different view poses/viewpoints is typically performed by considering the corresponding spatial scene point, and specifically by considering the depth of the scene point.
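
A mapping from (u₁, v₁) to (u₂, v₂) via the scene point can be sketched as follows. This assumes an ideal pinhole camera with the same focal length and principal point for both views, a depth value that is the z-coordinate in the first camera frame, and a known rigid transform between the two camera frames; none of these specifics come from the source, and the function names are hypothetical.

```python
def unproject(u, v, depth, focal, cx, cy):
    """Lift image coordinates (u, v) with z-depth `depth` to a 3D point in the camera frame."""
    return ((u - cx) * depth / focal, (v - cy) * depth / focal, depth)


def rigid_transform(point, rotation, translation):
    """Map a 3D point from the first camera frame to the second: x2 = R * x1 + t."""
    return tuple(
        sum(rotation[row][col] * point[col] for col in range(3)) + translation[row]
        for row in range(3)
    )


def project(point, focal, cx, cy):
    """Project a 3D point in a camera frame to image coordinates (u, v)."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def reproject(u1, v1, depth, focal, cx, cy, rotation, translation):
    """Map (u1, v1) in the first view to (u2, v2) in the second view via the scene point."""
    scene_point = unproject(u1, v1, depth, focal, cx, cy)
    return project(rigid_transform(scene_point, rotation, translation), focal, cx, cy)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Second camera displaced 0.1 units along +x, so points shift by -0.1 in its frame.
u2, v2 = reproject(320.0, 240.0, 2.0, 1000.0, 320.0, 240.0, IDENTITY, (-0.1, 0.0, 0.0))
```

With these example values the pixel moves 50 columns to the left; halving the depth would double the shift, which is why the result of the projection depends on the depth value used.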

In some embodiments, determining the first depth value comprises: projecting the first depth map position to a third depth map position in a third depth map of the corresponding depth maps, the third depth map being for a third image and the projection being based on the first candidate depth value; determining a first match error indication indicative of a difference between an image pixel value in the third image for the third depth map position and an image pixel value in the first image for the first depth map position; and determining the first weight in response to the first match error indication.

In accordance with an optional feature of the invention, determining the first candidate depth value comprises determining the second depth map position relative to the first depth map position by a projection between a first view pose of the first image and a second view pose of the second image, the projection being based on at least one of the second depth value and a first original depth value of the first depth map.

This may provide particularly advantageous performance in many embodiments, and may in particular allow improved depth maps with improved consistency in many scenarios.

The projection may be from the second depth map position to the first depth map position, and thus from the second view pose to the first view pose, based on the first original depth value.

The projection may be from the second depth map position to the first depth map position, and thus from the second view pose to the first view pose, based on the second depth value.

The original depth value may be a non-updated depth value of the first depth map.

The original depth value may be a depth value of the first depth map as received by the receiver.

In some embodiments, determining the first depth value comprises: projecting the first depth map position to a third depth map position in a third depth map of the corresponding depth maps, the third depth map being for a third image and the projection being based on the first candidate depth value; determining a first match error indication indicative of a difference between an image pixel value in the third image for the third depth map position and an image pixel value in the first image for the first depth map position; and determining the first weight in response to the first match error indication.

In accordance with an optional feature of the invention, the weighted combination includes candidate depth values determined from a region of the second depth map determined in response to the first depth map position.

This may provide increased depth map consistency in many embodiments. The first candidate depth value may be derived from one or more depth values of the region.

In accordance with an optional feature of the invention, the region of the second depth map is determined as a region around the second depth map position, and the second depth map position is determined as the depth map position in the second depth map that is equal to the first depth map position in the first depth map.

This may allow an efficient determination, with low complexity and low resource usage, of suitable depth values to consider.

In accordance with an optional feature of the invention, the region of the second depth map is determined as a region around a position in the second depth map that is determined by a projection from the first depth map position based on the original depth value in the first depth map at the first depth map position.
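
Both region options differ only in how the centre position in the second depth map is chosen: either the same position as in the first depth map, or the position reached by projection. The sketch below gathers candidate depth values from a region around a given centre. The square window shape, the `radius` parameter and the function name are assumptions; the source does not specify the shape or size of the region.

```python
def region_candidates(depth_map, centre_row, centre_col, radius):
    """Collect the depth values in a square window around (centre_row, centre_col),
    clipped to the bounds of the depth map (a list of rows)."""
    rows, cols = len(depth_map), len(depth_map[0])
    values = []
    for r in range(max(0, centre_row - radius), min(rows, centre_row + radius + 1)):
        for c in range(max(0, centre_col - radius), min(cols, centre_col + radius + 1)):
            values.append(depth_map[r][c])
    return values


second_depth_map = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
]
# Centre taken as the same position as in the first depth map (here the top-left pixel).
corner_region = region_candidates(second_depth_map, 0, 0, 1)
```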

This may provide increased depth map consistency in many embodiments. The original depth value may be a depth value of the first depth map as received by the receiver.

In accordance with an optional feature of the invention, the method further comprises determining a second match error indication indicative of a difference between an image pixel value in the second image for the second depth map position and the image pixel value in the first image for the first depth map position, and the first weight is determined further in response to the second match error indication.

This may provide improved depth maps in many embodiments.

In accordance with an optional feature of the invention, the method further comprises determining additional match error indications indicative of differences between image pixel values in other images, for depth map positions corresponding to the first depth map position, and the image pixel value in the first image for the first depth map position, and the first weight is determined further in response to the additional match error indications.
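
One possible way to let several match error indications influence a single weight is sketched below. The source does not say how the indications are combined, so averaging the errors before an exponential mapping, the `sigma` parameter and the function name are all assumptions.

```python
import math


def weight_from_errors(errors, sigma=10.0):
    """Combine the match errors measured against several other images into one weight.
    Averaging followed by an exponential mapping is an illustrative assumption."""
    mean_error = sum(errors) / len(errors)
    return math.exp(-mean_error / sigma)


# Errors against the third image and the second image for the same candidate depth value.
consistent_weight = weight_from_errors([1.0, 2.0])
inconsistent_weight = weight_from_errors([1.0, 50.0])
```

A candidate that matches well in one image but poorly in another is down-weighted relative to a candidate that matches well in all of them.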

In accordance with an optional feature of the invention, the weighted combination includes depth values of the first depth map in a region around the first depth map position.

In accordance with an optional feature of the invention, the first weight is dependent on a confidence value of the first candidate depth value.

This may provide improved depth maps in many scenarios.

In accordance with an optional feature of the invention, only depth values of the first depth map for which the confidence value is below a threshold are updated.

This may provide improved depth maps in many scenarios, and may in particular reduce the risk that accurate depth values are updated to less accurate depth values.

In accordance with an optional feature of the invention, the method further comprises selecting a set of depth values of the second depth map for inclusion in the weighted combination, subject to a requirement that a depth value of the set must have a confidence value at or above a threshold.
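
The two confidence-based options (updating only low-confidence depth values, and admitting only sufficiently confident candidates) can be sketched as simple filters. The function names and the flat-list representation of depth maps are assumptions for illustration; how the refined values are computed is outside this sketch.

```python
def select_confident_candidates(depths, confidences, threshold):
    """Keep only candidate depth values whose confidence is at or above the threshold."""
    return [d for d, c in zip(depths, confidences) if c >= threshold]


def update_low_confidence(depths, confidences, refined, threshold):
    """Replace a depth value by its refined value only where its confidence is below
    the threshold; confident depth values are left unchanged."""
    return [r if c < threshold else d for d, c, r in zip(depths, confidences, refined)]


kept = select_confident_candidates([2.0, 2.1, 5.0], [0.9, 0.5, 0.1], 0.5)
updated = update_low_confidence([1.0, 2.0, 3.0], [0.9, 0.2, 0.5], [1.5, 2.5, 3.5], 0.5)
```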

In accordance with an optional feature of the invention, the method further comprises: projecting a given depth map position for a given depth value in a given depth map to corresponding positions in a plurality of the corresponding depth maps; determining a variation measure for a set of depth values comprising the given depth value and the depth values at the corresponding positions in the plurality of the corresponding depth maps; and determining a confidence value for the given depth map position in response to the variation measure.
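
A minimal sketch of a variation-based confidence follows. It assumes the depth values found at the corresponding positions have already been gathered and expressed so that they are directly comparable; using the population standard deviation as the variation measure, the mapping to a confidence in (0, 1], the `scale` parameter and the function name are assumptions not taken from the source.

```python
import statistics


def confidence_from_variation(depth_values, scale=1.0):
    """Return a confidence in (0, 1] that decreases as the depth values disagree.
    `depth_values` holds the given depth value plus the depth values found at the
    corresponding positions in the other depth maps."""
    variation = statistics.pstdev(depth_values)
    return 1.0 / (1.0 + variation / scale)


agreeing = confidence_from_variation([2.0, 2.0, 2.0])
disagreeing = confidence_from_variation([2.0, 3.0, 1.0])
```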

This may provide a particularly advantageous determination of confidence values, which may lead to improved depth maps.

In accordance with an optional feature of the invention, the method further comprises: projecting a given depth map position for a given depth value in a given depth map to a corresponding position in another depth map, the projection being based on the given depth value; projecting the corresponding position in the other depth map to a test position in the given depth map, the projection being based on the depth value at the corresponding position in the other depth map; and determining a confidence value for the given depth map position in response to a distance between the given depth map position and the test position.
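
The round-trip check can be sketched in one dimension. This assumes a rectified horizontal camera pair in which a pixel at column u with depth z maps to column u - focal * baseline / z in the other view, which is a simplification not stated in the source. The exponential mapping from distance to confidence, the `scale` parameter and the function name are likewise assumptions.

```python
import math


def round_trip_confidence(u, depth_a, depth_row_b, focal, baseline, scale=1.0):
    """Project column u to the other depth map using depth_a, project back using the
    depth found there, and derive a confidence from the distance to the start."""
    u_b = u - focal * baseline / depth_a                    # forward projection
    col = min(max(int(round(u_b)), 0), len(depth_row_b) - 1)
    depth_b = depth_row_b[col]                              # depth at the corresponding position
    u_test = u_b + focal * baseline / depth_b               # backward projection (test position)
    distance = abs(u - u_test)
    return math.exp(-distance / scale), distance


other_row = [2.0] * 100  # one row of the other depth map, constant depth 2.0
good_conf, good_dist = round_trip_confidence(60.0, 2.0, other_row, 100.0, 0.1)
bad_conf, bad_dist = round_trip_confidence(60.0, 1.0, other_row, 100.0, 0.1)
```

When the two depth maps agree, the test position coincides with the starting position and the confidence is maximal; a depth value that disagrees with the other map returns to a different position and receives a lower confidence.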

According to an aspect of the invention, there is provided an apparatus for processing depth maps, the apparatus comprising: a receiver for receiving a plurality of images and corresponding depth maps representing a scene from different view poses; and an updater for performing the step of updating depth values of a first depth map of the corresponding depth maps based on depth values of at least a second depth map of the corresponding depth maps, the first depth map being for a first image and the second depth map being for a second image. The updating comprises: determining a first candidate depth value for a first depth pixel of the first depth map at a first depth map position in the first depth map, the first candidate depth value being determined in response to at least one second depth value of a second depth pixel of the second depth map at a second depth map position in the second depth map; and determining a first depth value for the first depth pixel by a weighted combination of a plurality of candidate depth values for the first depth map position, the weighted combination including the first candidate depth value weighted by a first weight. Determining the first depth value comprises: determining a first image position in the first image for the first depth map position; determining a third image position in a third image of the plurality of images, the third image position corresponding to a projection of the first image position to the third image based on the first candidate depth value; determining a first match error indication indicative of a difference between an image pixel value in the third image for the third image position and an image pixel value in the first image for the first image position; and determining the first weight in response to the first match error indication.

These and other aspects, features and advantages of the invention will be apparent from, and elucidated with reference to, the embodiments described hereinafter.

Embodiments of the invention will be described, by way of example only, with reference to the drawings, which show:
An example of an arrangement for providing a virtual reality experience.
An example of elements of an apparatus for processing depth maps in accordance with some embodiments of the invention.
An example of elements of a method of processing depth maps in accordance with some embodiments of the invention.
An example of a camera arrangement for capturing a scene.
An example of elements of a method of updating depth maps in accordance with some embodiments of the invention.
An example of elements of a method of determining weights in accordance with some embodiments of the invention.
An example of processing of depth maps and images in accordance with some embodiments of the invention.

The following description focuses on embodiments of the invention applicable to a virtual reality experience, but it will be appreciated that the invention is not limited to this application and may be applied to many other systems and applications, such as in particular applications that include view synthesis.

Virtual experiences that allow a user to move around in a virtual world are becoming increasingly popular, and services are being developed to satisfy this demand. However, the provision of efficient virtual reality services is very challenging, in particular if the experience is to be based on a capture of a real-world environment rather than on a fully virtually generated artificial world.

In many virtual reality applications, a viewer pose input is determined reflecting the pose of a virtual viewer in the scene. The virtual reality apparatus/system/application then generates one or more images corresponding to the views and viewports of the scene for a viewer corresponding to the viewer pose.

Typically, the virtual reality application generates a three-dimensional output in the form of separate view images for the left and the right eyes. These may then be presented to the user by suitable means, such as typically the individual left and right eye displays of a VR headset. In other embodiments, the images may, for example, be presented on an autostereoscopic display (in which case a larger number of view images may be generated for the viewer pose), or indeed in some embodiments only a single two-dimensional image may be generated (e.g. using a conventional two-dimensional display).

観察者ポーズ入力は、異なるアプリケーションで異なる方法で決定される場合がある。多くの実施形態では、ユーザの物理的な動きを直接追跡することができる。例えば、ユーザエリアを測量するカメラがユーザの頭部(または目)を検出し、追跡することができる。多くの実施形態では、ユーザは、外部および/または内部手段によって追跡することができるVRヘッドセットを装着することができる。例えば、ヘッドセットは、ヘッドセット、したがって頭部の移動および回転に関する情報を提供する加速度計およびジャイロスコープを備えることができる。いくつかの例では、VRヘッドセットは、信号を送信することができ、または外部センサがVRヘッドセットの動きを決定することを可能にする(例えば視覚的な)識別子を備えることができる。 The observer pose input may be determined in different ways in different applications. In many embodiments, the physical movement of the user may be tracked directly. For example, a camera surveying the user area may detect and track the user's head (or eyes). In many embodiments, the user may wear a VR headset that can be tracked by external and/or internal means. For example, the headset may comprise accelerometers and gyroscopes that provide information on the movement and rotation of the headset, and thus of the head. In some examples, the VR headset may transmit signals or comprise (for example, visual) identifiers that enable an external sensor to determine the movement of the VR headset.

いくつかのシステムでは、観察者ポーズは、マニュアルの手段によって、例えば、ユーザがジョイスティックまたは同様のマニュアル入力を手動で制御することによって、提供されてもよい。例えば、ユーザは、一方の手で第1のアナログジョイスティックを制御することによってシーン内で仮想観察者を手動で動かし、他方の手で第2のアナログジョイスティックを手動で動かすことによって仮想観察者が見ている方向を手動で制御することができる。 In some systems, the observer pose may be provided by manual means, for example by the user manually controlling a joystick or a similar manual input. For example, the user may manually move the virtual observer around the scene by controlling a first analog joystick with one hand, and manually control the direction in which the virtual observer is looking by moving a second analog joystick with the other hand.

いくつかのアプリケーションでは、手動アプローチと自動アプローチとの組み合わせを使用して、入力される観察者ポーズを生成することができる。例えば、ヘッドセットが頭部の向きを追跡することができ、シーン内の観察者の動き/位置は、ジョイスティックを使用してユーザによって制御されることができる。 Some applications can use a combination of a manual approach and an automated approach to generate an input observer pose. For example, the headset can track the orientation of the head and the movement / position of the observer in the scene can be controlled by the user using the joystick.

画像の生成は、仮想世界/環境/シーンの適切な表現に基づく。いくつかのアプリケーションでは、シーンについて完全な三次元モデルを提供することができ、特定の観察者ポーズからのシーンのビューを、このモデルを評価することによって決定することができる。 Image generation is based on the proper representation of the virtual world / environment / scene. Some applications can provide a complete 3D model of the scene, and the view of the scene from a particular observer pose can be determined by evaluating this model.

多くの実用的なシステムでは、シーンは、画像データを含む画像表現によって表されることができる。画像データは、典型的には、1つ以上のキャプチャポーズまたはアンカーポーズに関連する1つ以上の画像を含んでもよく、具体的には、1つ以上のビューポートについての画像が含まれてもよく、各ビューポートは特定のポーズに対応する。1つまたは複数の画像を含む画像表現を使用することができ、各画像は、所与のビューポーズに対する所与のビューポートのビューを表す。画像データが提供されるそのようなビューポーズまたは位置は、アンカーポーズまたは位置、あるいはキャプチャポーズまたは位置とも呼ばれることが多い（画像データが、典型的には、キャプチャポーズに対応する位置及び向きを有するシーン内に配置されたカメラによってキャプチャされるかまたはキャプチャされるであろう画像に対応し得るため）。 In many practical systems, a scene can be represented by an image representation that includes image data. The image data may typically include one or more images associated with one or more capture poses or anchor poses, specifically images about one or more viewports. Often, each viewport corresponds to a particular pose. Image representations that include one or more images can be used, and each image represents a view in a given viewport for a given view pose. Such view poses or positions to which image data is provided are often referred to as anchor poses or positions, or capture poses or positions (image data typically has a position and orientation corresponding to the capture pose). (Because it can correspond to an image captured or will be captured by a camera placed in the scene).

画像は、典型的には、奥行き情報に関連付けられ、具体的には奥行き画像又はマップが典型的に提供される。そのような奥行きマップは、対応する画像中の各ピクセルに対する奥行き値を提供することができ、奥行き値は、カメラ/アンカー/キャプチャ位置から、ピクセルによって描写されるオブジェクト/シーン点までの距離を示す。したがって、ピクセル値は、シーン内のオブジェクト/点からカメラのキャプチャ装置への光線を表すと考えられてもよく、ピクセルに対する奥行き値はこの光線の長さを反映することができる。 The images are typically associated with depth information, and specifically a depth image or depth map is typically provided. Such a depth map may provide a depth value for each pixel in the corresponding image, where the depth value indicates the distance from the camera/anchor/capture position to the object/scene point depicted by the pixel. Thus, a pixel value may be considered to represent a ray from an object/point in the scene to the capture device of the camera, and the depth value for the pixel may reflect the length of this ray.

多くの実施形態において、画像及び対応する奥行きマップの解像度は同じであり得、従って、画像内の各ピクセルに関する個々の奥行き値が含まれ得、即ち、奥行きマップは、画像の各ピクセルに対して一つの奥行き値を含み得る。他の実施形態では、解像度は異なる場合があり、例えば、奥行きマップは、1つの奥行き値が複数の画像ピクセルに適用され得るように、より低い解像度を有することがある。以下の説明は、画像の解像度と対応する奥行きマップが同じであり、従って、各画像ピクセル（画像のピクセル）に対して、別個の奥行きマップピクセル（奥行きマップのピクセル）が存在する実施形態に焦点を当てる。 In many embodiments, the resolution of an image and of the corresponding depth map may be the same, so that an individual depth value is included for each pixel in the image, that is, the depth map may contain one depth value for each pixel of the image. In other embodiments, the resolutions may differ; for example, the depth map may have a lower resolution such that one depth value applies to a plurality of image pixels. The following description focuses on embodiments in which the resolution of the image and of the corresponding depth map is the same, so that for each image pixel (pixel of the image) there is a separate depth map pixel (pixel of the depth map).

奥行き値は、ピクセルに対する奥行きを示す任意の値であってよく、従って、それは、カメラ位置から、所与のピクセルによって描写されるシーンのオブジェクトまでの距離を示す任意の値であってよい。奥行き値は例えば、視差値、z座標、距離測度などであってもよい。 The depth value can be any value that indicates the depth to the pixel, and thus it can be any value that indicates the distance from the camera position to the object in the scene depicted by a given pixel. The depth value may be, for example, a parallax value, a z coordinate, a distance measure, or the like.
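As an illustration of how two of the depth representations named above relate, the following minimal sketch converts between a disparity value and a z coordinate. It assumes a rectified pair of pinhole cameras with a focal length in pixels and a known baseline; the function and parameter names are hypothetical and are not taken from the described embodiments.

```python
def disparity_to_z(disparity_px: float, focal_px: float, baseline: float) -> float:
    """Convert a disparity (pixels) to a z coordinate (same unit as baseline).

    Assumes rectified pinhole cameras, for which z = f * B / d.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline / disparity_px


def z_to_disparity(z: float, focal_px: float, baseline: float) -> float:
    """Inverse conversion under the same assumptions: d = f * B / z."""
    if z <= 0:
        raise ValueError("z must be positive")
    return focal_px * baseline / z
```

Under these assumptions, a larger disparity corresponds to a point closer to the camera, which is why either quantity can serve as a depth value.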

多くの典型的なVRアプリケーションは、このような画像+奥行き表現に基づいて、現在の観察者ポーズのためのビューポートに対応するビュー画像を提供するように進行することができ、画像は、ビューアポーズの変化を反映するように動的に更新され、（場合によっては）仮想シーン/環境/世界を表す画像データに基づいて生成される。アプリケーションは、当業者に知られているように、ビュー合成およびビューシフトアルゴリズムを実行することによってこれを実行することができる。 Based on such an image-plus-depth representation, many typical VR applications may proceed to provide view images corresponding to the viewport for the current observer pose, with the images being dynamically updated to reflect changes in the observer pose and being generated based on the image data representing the (possibly) virtual scene/environment/world. The application may do this by performing view synthesis and view shift algorithms, as will be known to the person skilled in the art.

この分野では、配置およびポーズという用語が位置および/または方向/向きに関する一般的な用語として使用される。例えばオブジェクト、カメラ、頭部またはビューの位置および方向/向きの組み合わせを、ポーズまたは配置と呼ぶ場合がある。したがって、配置またはポーズ指標は、6つの値/成分/自由度を含むことができ、各値/成分は、典型的には、対応するオブジェクトの位置/場所または向き/方向の個々の特性を記述する。もちろん、多くの状況において、例えば、1つ以上の成分が固定または無関係であると考えられる場合(例えば、全てのオブジェクトが同じ高さにあり、水平方向を有すると考えられる場合、4つの成分がオブジェクトのポーズの完全な表現を提供することができる)、配置またはポーズはより少ない成分で考慮または表現されてもよい。以下では、ポーズという用語は、1乃至6つの値(可能な最大自由度に対応する)によって表すことができる位置および/または向きを指すために使用される。 In the field, the terms placement and pose are used as common terms for position and/or direction/orientation. The combination of the position and direction/orientation of, for example, an object, a camera, a head, or a view may be referred to as a pose or placement. Thus, a placement or pose indication may comprise six values/components/degrees of freedom, with each value/component typically describing an individual property of the position/location or the orientation/direction of the corresponding object. Of course, in many situations a placement or pose may be considered or represented with fewer components, for example if one or more components are considered fixed or irrelevant (for example, if all objects are considered to be at the same height and to have a horizontal orientation, four components may provide a full representation of the pose of an object). In the following, the term pose is used to refer to a position and/or orientation that may be represented by one to six values (corresponding to the maximum possible degrees of freedom).

多くのVRアプリケーションは、最大自由度、すなわち、位置および向きのそれぞれの3つの自由度を有するポーズに基づいており、その結果、合計6つの自由度が得られる。したがって、ポーズは6つの自由度を表す6つの値のセットまたはベクトルによって表すことができ、したがって、ポーズベクトルは、三次元位置および/または三次元方向表示を与えることができる。しかしながら、他の実施形態では、ポーズがより少ない値によって表されてもよいことが理解されるであろう。 Many VR applications are based on maximum degrees of freedom, that is, poses with three degrees of freedom each for position and orientation, resulting in a total of six degrees of freedom. Thus, a pose can be represented by a set or vector of six values representing six degrees of freedom, and thus the pose vector can give a three-dimensional position and / or a three-dimensional directional representation. However, it will be appreciated that in other embodiments, the pose may be represented by a smaller value.
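The six-value pose vector described above can be sketched as a small data structure. The parameterization of the orientation as yaw, pitch and roll angles, and all names below, are assumptions made for illustration only; the description does not prescribe a particular representation.

```python
from dataclasses import dataclass, astuple
from typing import Tuple


@dataclass(frozen=True)
class Pose:
    """A 6DoF pose: three position values and three orientation values."""
    x: float
    y: float
    z: float
    yaw: float    # assumed orientation parameterization (radians)
    pitch: float
    roll: float

    def as_vector(self) -> Tuple[float, ...]:
        """Return the pose as a vector of six values."""
        return astuple(self)

    def position(self) -> Tuple[float, float, float]:
        """Return only the three positional degrees of freedom."""
        return (self.x, self.y, self.z)

    def orientation(self) -> Tuple[float, float, float]:
        """Return only the three rotational degrees of freedom."""
        return (self.yaw, self.pitch, self.roll)
```

A system offering only three degrees of freedom would, in this sketch, use only `position()` or only `orientation()`.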

ポーズは、方位および位置のうちの少なくとも1つとすることができる。ポーズ値は、方位値および位置値のうちの少なくとも1つを示すことができる。 The pose can be at least one of orientation and position. The pose value can indicate at least one of the orientation value and the position value.

観察者に最大自由度を提供することに基づくシステムまたはエンティティは、通常、6自由度(6DoF)を有すると呼ばれる。多くのシステムおよびエンティティは、方向または位置のみを提供し、これらは、典型的には3自由度（3DoF）を有するものとして知られている。 A system or entity based on providing the observer with maximum degrees of freedom is commonly referred to as having 6 degrees of freedom (6DoF). Many systems and entities provide only directions or positions, which are typically known to have 3 degrees of freedom (3DoF).

システムによっては、VRアプリケーションは、例えば、いかなる遠隔VRデータまたは処理を使用せず、何らアクセスしないスタンドアロン装置によって、観察者にローカルで提供されてもよい。例えば、ゲームコンソールのような装置が、シーンデータを記憶するための記憶装置と、観察者ポーズを受信/生成するための入力と、シーンデータから対応する画像を生成するためのプロセッサとを備えることができる。 In some systems, the VR application may be provided locally to the observer, for example by a stand-alone device that does not use, or even have any access to, any remote VR data or processing. For example, a device such as a games console may comprise a store for storing the scene data, an input for receiving/generating the observer pose, and a processor for generating the corresponding images from the scene data.

他のシステムでは、VRアプリケーションは、観察者から遠隔で実装され、実行されることができる。例えば、ユーザにローカルな装置は、観察者ポーズを生成するためにデータを処理する遠隔装置に送信される動き/ポーズデータを検出/受信することができる。次いで、遠隔装置は、シーンを記述するシーンデータに基づいて、観察者ポーズのための適切なビュー画像を生成することができる。次に、ビュー画像は、それらが提示される観察者に対してローカルな装置に送信される。例えば、遠隔装置は、ローカル装置によって直接提示されるビデオストリーム(典型的にはステレオ/3Dビデオストリーム)を直接生成することができる。したがって、このような例では、ローカル装置は、移動データを送信し、受信したビデオデータを提示することを除いて、いかなるVR処理も実行しないことがある。 In other systems, the VR application may be implemented and executed remotely from the observer. For example, a device local to the user may detect/receive movement/pose data, which is transmitted to a remote device that processes the data to generate the observer pose. The remote device may then generate suitable view images for the observer pose based on scene data describing the scene. The view images are then transmitted to the device local to the observer, where they are presented. For example, the remote device may directly generate a video stream (typically a stereo/3D video stream) that is presented directly by the local device. Thus, in such an example, the local device may not perform any VR processing except for transmitting movement data and presenting received video data.

多くのシステムでは、機能がローカル装置および遠隔装置にわたって分散され得る。例えば、ローカル装置は、受信した入力およびセンサデータを処理して、遠隔VR装置に連続的に送信される観察者ポーズを生成することができる。次いで、遠隔VR装置は、対応するビュー画像を生成し、これらを提示のためにローカル装置に送信することができる。他のシステムでは、遠隔VR装置がビュー画像を直接生成しなくてもよいが、関連するシーンデータを選択し、これをローカル装置に送信してもよく、そしてローカル装置が、提示されるビュー画像を生成してもよい。例えば、リモートVR装置は最も近いキャプチャポイントを識別し、対応するシーンデータ(例えば、キャプチャポイントからの球面画像および奥行きデータ)を抽出し、これをローカル装置に送信することができる。次いで、ローカル装置は、受信したシーンデータを処理して、特定の現在のビューポーズのための画像を生成することができる。ビューポーズは典型的には頭部ポーズに対応し、ビューポーズへの参照は、典型的には頭部ポーズへの参照に対応すると同等に考えることができる。 In many systems, functionality can be distributed across local and remote devices. For example, the local device can process the received input and sensor data to generate an observer pose that is continuously transmitted to the remote VR device. The remote VR device can then generate the corresponding view images and send them to the local device for presentation. In other systems, the remote VR device does not have to generate the view image directly, but the relevant scene data may be selected and sent to the local device, and the local device presents the view image. May be generated. For example, a remote VR device can identify the closest capture point, extract the corresponding scene data (eg, spherical image and depth data from the capture point) and send it to the local device. The local device can then process the received scene data to generate an image for a particular current view pose. A view pose typically corresponds to a head pose, and a reference to a view pose can be considered equivalent to a reference to a head pose.
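The selection of the closest capture point by the remote VR device could be sketched as follows. The use of Euclidean distance between positions as the closeness measure, and the function name, are assumptions for illustration; the description does not specify how closeness is measured.

```python
import math
from typing import Sequence


def nearest_capture_point(view_position: Sequence[float],
                          capture_positions: Sequence[Sequence[float]]) -> int:
    """Return the index of the capture position closest to the view position.

    Closeness is measured here as Euclidean distance (an assumption).
    """
    if not capture_positions:
        raise ValueError("at least one capture position is required")
    return min(range(len(capture_positions)),
               key=lambda i: math.dist(view_position, capture_positions[i]))
```

The returned index would then be used to look up the scene data (for example, the image and depth data for that capture point) to send to the local device.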

多くのアプリケーション、特に放送サービスの場合、ソースは、観察者ポーズに依存しないシーンの画像(ビデオを含む)表現の形でシーンデータを送信してもよい。例えば、単一のキャプチャ位置に対する単一のビュー球に対する画像表現が複数のクライアントに送信されることができる。次に、個々のクライアントは、現在の観察者ポーズに対応するビュー画像をローカルで合成することができる。 For many applications, especially broadcast services, the source may send scene data in the form of an image (including video) representation of the scene that does not depend on the observer pose. For example, an image representation for a single view sphere for a single capture position can be sent to multiple clients. Individual clients can then locally compose the view image that corresponds to the current observer pose.

特に興味を引いているアプリケーションは、限定された量の動きがサポートされ、頭部の小さな動き及び回転のみを行う実質的に静的な観察者に対応する小さな動き及び回転に追従するように提示されるビューが更新される。例えば、座っている観察者は頭を回し、それをわずかに動かすことができ、提示されるビュー/画像は、これらのポーズ変化に追従するように適合される。そのようなアプローチは、非常に没入型の、例えばビデオ体験を提供することができる。たとえば、スポーツイベントを見ている観察者は、自分がアリーナの特定のスポットにいると感じることができる。 An application of particular interest is one in which a limited amount of movement is supported, such that the presented views are updated to follow small movements and rotations corresponding to a substantially static observer making only small head movements and rotations. For example, a seated observer can turn their head and move it slightly, with the presented views/images being adapted to follow these pose changes. Such an approach may provide a highly immersive experience, for example a video experience. For example, an observer watching a sporting event may feel present at a particular spot in the arena.

このような制限された自由度のアプリケーションは、多くの異なる位置からのシーンの正確な表現を必要とせずに、改善された経験を提供し、それによってキャプチャ要件を大幅に低減するという利点を有する。同様に、レンダラに提供される必要があるデータの量を大幅に低減することができる。実際、多くのシナリオでは、単一の視点のための画像及び典型的には奥行きデータのみが、これから所望のビューを生成することができるローカルレンダラに提供される必要がある。 Such limited-freedom applications have the advantage of providing an improved experience without requiring an accurate representation of the scene from many different positions, thereby substantially reducing the capture requirements. Similarly, the amount of data that needs to be provided to the renderer can be reduced substantially. Indeed, in many scenarios, only image data, and typically depth data, for a single viewpoint needs to be provided to a local renderer, which can generate the desired views from it.

このアプローチは例えば、ブロードキャストまたはクライアント・サーバ・アプリケーションのような、データが、帯域制限された通信チャネルを介してソースから宛先へ通信される必要があるアプリケーションに特に適している。 This approach is particularly suitable for applications where data needs to be communicated from source to destination over a bandwidth-limited communication channel, such as broadcast or client-server applications.

図1は、遠隔VRクライアント装置101が例えばインターネットのようなネットワーク105を介してVRサーバ103と連携するVRシステムのこのような例を示す。サーバ103は、潜在的に多数のクライアント装置101を同時にサポートするように構成されてもよい。 FIG. 1 shows such an example of a VR system in which the remote VR client device 101 cooperates with the VR server 103 via a network 105 such as the Internet. The server 103 may be configured to support potentially a large number of client devices 101 at the same time.

VRサーバ103は、例えば、適切なポーズに対応するビュー画像をローカルで合成するためにクライアント装置によって使用され得る画像データの形の画像表現を有する画像信号を送信することによって、ブロードキャスト体験をサポートし得る。 The VR server 103 supports a broadcast experience, for example, by transmitting an image signal having an image representation in the form of image data that can be used by a client device to locally synthesize a view image corresponding to the appropriate pose. obtain.

図2は、奥行きマップを処理するための装置の例示的な実装形態の例示的な要素を示す。この装置は、具体的には、VRサーバ103に実装されてもよく、これを参照して説明する。図3は、図2の装置によって実行される奥行きマップを処理する方法のためのフローチャートを示す。 FIG. 2 shows exemplary elements of an exemplary implementation of a device for processing a depth map. Specifically, this device may be mounted on the VR server 103, and will be described with reference to this. FIG. 3 shows a flow chart for how to process the depth map performed by the device of FIG.

装置/VRサーバ103は、異なるビューポーズからのシーンを表す複数の画像及び対応する奥行きマップが受信されるステップ301を実行する受信機201を備える。 The device / VR server 103 includes a receiver 201 that performs step 301 in which a plurality of images representing scenes from different view poses and corresponding depth maps are received.

画像は光強度情報を含み、画像のピクセル値は光強度値を反映する。いくつかの例では、ピクセル値は、グレースケール画像の輝度のような単一の値であってもよいが、多くの実施例では、ピクセル値は、例えばカラー画像に対するカラーチャネル値のような（サブ）値の集合またはベクトルであってもよい（例えば、RGBまたはYuv値が提供されてもよい）。 The image contains light intensity information, and the pixel value of the image reflects the light intensity value. In some examples, the pixel value may be a single value, such as the brightness of a grayscale image, but in many embodiments, the pixel value is, for example, a color channel value for a color image ( Sub) may be a set or vector of values (eg RGB or Yuv values may be provided).

画像の奥行きマップは、同じビューポートの奥行き値を含むことができる。例えば、所与のビュー/キャプチャ/アンカーポーズに対する画像の各ピクセルに対して、対応する奥行きマップは奥行き値を有するピクセルを含む。したがって、画像およびその対応する奥行きマップ内の同じ位置は、ピクセルに対応する光線の光強度および奥行きをそれぞれ提供する。幾つかの実施形態において、奥行きマップは、より低い解像度を有し得、例えば、1つの奥行きマップピクセルが複数の画像ピクセルに対応し得る。しかしながら、そのような場合、画像内の位置と奥行きマップ内の位置（サブピクセル位置を含む）との間には、依然として直接的な一対一の位置対応があり得る。 The depth map for an image may comprise depth values for the same viewport. For example, for each pixel of an image for a given view/capture/anchor pose, the corresponding depth map comprises a pixel with a depth value. Thus, the same position in the image and in its corresponding depth map provides, respectively, the light intensity and the depth for the ray corresponding to the pixel. In some embodiments, the depth map may have a lower resolution, and for example one depth map pixel may correspond to a plurality of image pixels. However, in such a case there may still be a direct one-to-one correspondence between a position in the image and a position in the depth map (including sub-pixel positions).
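The positional correspondence between an image and a lower-resolution depth map can be illustrated with a small lookup. This is a sketch only: it assumes the depth map covers the same viewport as the image, is stored as a list of rows, and is sampled without interpolation; the function name is hypothetical.

```python
from typing import List, Tuple


def depth_at_image_pixel(depth_map: List[List[float]],
                         u: int, v: int,
                         image_size: Tuple[int, int]) -> float:
    """Return the depth value that applies to image pixel (u, v).

    image_size is (width, height) of the image. The depth map may have the
    same or a lower resolution; coordinates are scaled proportionally.
    """
    img_w, img_h = image_size
    map_h = len(depth_map)
    map_w = len(depth_map[0])
    # Scale image coordinates to depth map coordinates (integer sampling).
    du = min(u * map_w // img_w, map_w - 1)
    dv = min(v * map_h // img_h, map_h - 1)
    return depth_map[dv][du]
```

With a 2x2 depth map and a 4x4 image, each depth map pixel serves a 2x2 block of image pixels; when the resolutions match, the lookup reduces to reading the same coordinates.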

簡潔さおよび複雑さのために、以下の説明は、3つの画像および対応する奥行きマップのみが提供される例に焦点を当てる。さらに、3つの異なる視点位置からのシーンをキャプチャし、図4に示されているのと同じ向きを有するカメラの直線配置によって、これらの画像が提供されるものと仮定される。 For brevity and to limit complexity, the following description focuses on an example in which only three images and corresponding depth maps are provided. It is further assumed that these images are provided by a linear arrangement of cameras that capture the scene from three different viewpoint positions with the same orientation, as shown in FIG. 4.

多くの実施形態では、かなり多数の画像が受信されることが多く、シーンはかなり多数のキャプチャポーズからキャプチャされることが多いことが理解されよう。 It will be appreciated that in many embodiments, a significant number of images are often received and the scene is often captured from a significant number of capture poses.

受信機は、以下では簡潔にするために単に更新器203と呼ばれる奥行きマップ更新器に供給される。更新器203は、ステップ303を実行し、ここで、受信された奥行きマップのうちの1つ以上（および典型的には全て）が更新される。更新は、少なくとも第2の受信された奥行きマップの奥行き値に基づいて、第1の受信された奥行きマップの奥行き値を更新することを含む。したがって、改良された奥行きマップを生成するために、奥行きマップ間およびビューポーズ間の更新が実行される。 The receiver 201 is coupled to a depth map updater, referred to below simply as updater 203 for brevity. Updater 203 performs step 303, in which one or more (and typically all) of the received depth maps are updated. The update includes updating depth values of a first received depth map based on depth values of at least a second received depth map. Thus, updates across depth maps and across view poses are performed in order to generate improved depth maps.

この例では、更新器203は、ステップ305を実行する画像信号発生器205に結合され、このステップでは、更新された奥行きマップと共に受信された画像を含む画像信号を生成する。次いで、画像信号は、例えば、VRクライアント装置101に送信され得、そこで、現在の観察者ポーズのためのビュー画像を合成するための基礎として使用され得る。 In this example, the updater 203 is coupled to an image signal generator 205 that performs step 305, which step generates an image signal containing the image received with the updated depth map. The image signal can then be transmitted, for example, to the VR client device 101, where it can be used as the basis for synthesizing the view image for the current observer pose.

この例では、奥行きマップ更新はVRサーバ103内で実行され、更新された奥行きマップは、VRクライアント装置101に配信される。しかしながら、他の実施形態では、奥行きマップ更新は、例えば、VRクライアント装置101において実行されてもよい。例えば、受信機201は、VRクライアント装置101の一部であってもよく、VRサーバ103から画像および対応する奥行きマップを受信する。次に、受信された奥行きマップは、更新器203によって更新されてもよく、画像信号生成器205の代わりに、装置は、画像および更新された奥行きマップに基づいて新しいビューを生成するように構成されたレンダラまたはビュー画像合成器を備えてもよい。 In this example, the depth map update is performed in the VR server 103, and the updated depth map is delivered to the VR client device 101. However, in other embodiments, the depth map update may be performed, for example, in the VR client device 101. For example, the receiver 201 may be part of the VR client device 101 and receives an image and a corresponding depth map from the VR server 103. The received depth map may then be updated by the updater 203, and instead of the image signal generator 205, the device is configured to generate a new view based on the image and the updated depth map. It may be equipped with a renderer or a view image synthesizer.

さらに他の実施形態では、すべての処理を単一の装置で実行することができる。例えば、同じ装置が、直接キャプチャされた情報を受信し、例えば、視差推定によって、初期の奥行きマップを生成することができる。結果として得られる奥行きマップが更新されることができ、装置の合成器は、新しいビューを動的に生成することができる。 In yet another embodiment, all processing can be performed on a single device. For example, the same device can receive the captured information directly and generate an initial depth map, for example by parallax estimation. The resulting depth map can be updated and the device synthesizer can dynamically generate new views.

したがって、説明された機能の位置および更新された奥行きマップの特定の使用は、個々の実施形態の選好および要件に依存する。 Therefore, the location of the described features and the specific use of the updated depth map will depend on the preferences and requirements of the individual embodiments.

したがって、奥行きマップの更新は、異なる空間位置からの異なる画像についての奥行きを表す他の奥行きマップのうちの1つまたは複数に基づいて行われる。このアプローチは、奥行きマップの場合、結果として得られる知覚品質にとって重要なのは、個々の奥行き値または奥行きマップの絶対的な精度または信頼性だけでなく、異なる奥行きマップ間の一貫性も非常に重要であるという認識を利用する。 Thus, the update of a depth map is based on one or more of the other depth maps, which represent depth for different images from different spatial positions. The approach exploits the realization that, for depth maps, what matters for the resulting perceived quality is not only the absolute accuracy or reliability of individual depth values or depth maps; the consistency between different depth maps is also highly important.

実際、ヒューリスティックに得られる洞察は、誤差または不正確さが奥行きマップ間で一貫性がないとき、すなわち、それらがソースビューにわたって変化するとき、それらは、視聴者が位置を変えたときに仮想シーンが振動するように知覚されるので、特に有害であると知覚されることである。 Indeed, a heuristically obtained insight is that errors or inaccuracies are perceived as particularly harmful when they are inconsistent between depth maps, that is, when they vary across the source views, because the virtual scene is then perceived to judder when the viewer changes position.

このようなビュー一貫性は、奥行きマップ推定処理中に常に十分に遵守されるとは限らない。例えば、これは、各ビューの奥行きマップを取得するために別個の奥行きセンサを使用する場合である。その場合、奥行きデータは完全に独立して取り込まれる。（例えば、平面掃引アルゴリズムを使用して）奥行きを推定するためにすべてのビューが使用される他の極端な場合、結果は、使用される特定のマルチビュー視差アルゴリズムおよびそのパラメータ設定に依存するため、依然として一貫性がない可能性がある。以下に説明する特定のアプローチは、多くのシナリオにおいて、そのような問題を軽減し、奥行きマップ間の整合性を改善し、したがって、知覚される画像品質を改善するように奥行きマップを更新することができる。このアプローチは、シーンのマルチビュー画像のセットに対する奥行きマップの品質を改善することができる。 Such view consistency is not always sufficiently enforced during the depth map estimation process. This is the case, for example, when a separate depth sensor is used to obtain the depth map for each view; the depth data is then captured entirely independently. At the other extreme, where all views are used to estimate depth (for example, using a plane sweep algorithm), the result may still be inconsistent, because it depends on the particular multi-view disparity algorithm used and on its parameter settings. The specific approach described below may, in many scenarios, mitigate such problems and update the depth maps so as to improve the consistency between depth maps and thus the perceived image quality. The approach can improve the quality of the depth maps for a set of multi-view images of a scene.

図5は、1つの奥行きマップの1つのピクセルに対して実行される更新のためのフローチャートを示す。このアプローチは、更新された第1奥行きマップを生成するために、奥行きマップピクセルのいくつかまたはすべてについて繰り返されてもよい。次いで、このプロセスは、他の奥行きマップについてさらに繰り返されることができる。 Figure 5 shows a flow chart for updates performed on a pixel in a depth map. This approach may be repeated for some or all of the depth map pixels to generate an updated first depth map. This process can then be repeated for other depth maps.
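The repetition of the per-pixel update over the pixels of one depth map, and then over the other depth maps, can be sketched as a pair of nested loops. The callback `update_pixel` is a hypothetical placeholder for the per-pixel process of FIG. 5, and reading only from the received (not yet updated) maps is an assumption of this sketch; the description leaves that ordering open.

```python
from typing import Callable, List

DepthMap = List[List[float]]  # rows of depth values, indexed [v][u]


def update_all_depth_maps(
    depth_maps: List[DepthMap],
    update_pixel: Callable[[List[DepthMap], int, int, int], float],
) -> List[DepthMap]:
    """Apply a per-pixel update to every pixel of every depth map.

    update_pixel(depth_maps, k, u, v) returns the updated depth value for
    position (u, v) of depth map k. New maps are built so that the received
    maps stay unmodified while they are being read.
    """
    updated = []
    for k, depth_map in enumerate(depth_maps):
        new_map = [
            [update_pixel(depth_maps, k, u, v) for u in range(len(row))]
            for v, row in enumerate(depth_map)
        ]
        updated.append(new_map)
    return updated
```

An embodiment that updates only some pixels, or only some of the maps, would simply restrict the ranges iterated over here.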

以下で第1奥行きマップと呼ばれる奥行きマップ中の、以下で第1奥行きピクセルと呼ばれるピクセルの更新は、ステップ501において開始し、第1候補奥行き値が第1奥行きピクセルに対して決定される。第1奥行きマップの第1奥行きピクセルの位置は、第1奥行きマップ位置と呼ばれる。対応する用語は、数字ラベルのみを変更して、他の図に使用される。 The update of the pixels in the depth map, which will be referred to below as the first depth map, which will be referred to below as the first depth pixel, will start in step 501 and the first candidate depth value will be determined for the first depth pixel. The position of the first depth pixel in the first depth map is called the first depth map position. Corresponding terms are used in other figures, changing only the numeric labels.

第1候補奥行き値は、第2奥行きマップ中の第2奥行きマップ位置における第2奥行きピクセルの奥行き値である少なくとも1つの第2奥行き値に応じて決定される。したがって、第1候補奥行き値は、別の1つの奥行きマップの1つまたは複数の奥行き値から決定される。第1候補奥行き値は、特に、第2奥行きマップに含まれる情報に基づく、第1奥行きピクセルに対する正しい奥行き値の推定値であり得る。 The first candidate depth value is determined according to at least one second depth value, which is the depth value of the second depth pixel at the second depth map position in the second depth map. Therefore, the first candidate depth value is determined from one or more depth values in another depth map. The first candidate depth value can be an estimate of the correct depth value for the first depth pixel, in particular, based on the information contained in the second depth map.

ステップ501に続くステップ503において、更新された第1奥行き値が、第1奥行きマップ位置に対する複数の候補奥行き値の重み付け結合によって、第1奥行きピクセルについて決定される。ステップ501で決定された第1候補奥行き値は、重み付け結合に含まれる。 In step 503, which follows step 501, an updated first depth value is determined for the first depth pixel by a weighted combination of a plurality of candidate depth values for the first depth map position. The first candidate depth value determined in step 501 is included in the weighted combination.

したがって、ステップ501では、後続の結合のための複数の候補奥行き値のうちの1つが決定される。大部分の実施態様において、複数の候補奥行き値は、ステップ501において、第2奥行きマップ中の他の奥行き値及び/又は他の奥行きマップ中の奥行き値に対して、第1候補奥行き値について記載されたプロセスを繰り返すことにより、決定されることができる。 Thus, in step 501, one of the plurality of candidate depth values for the subsequent combination is determined. In most embodiments, a plurality of candidate depth values may be determined in step 501 by repeating the process described for the first candidate depth value for other depth values in the second depth map and/or for depth values in other depth maps.

多くの実施形態では、候補奥行き値の1つまたは複数は、他の方法で、または他のソースから、決定されることができる。多くの実施形態では、候補奥行き値の1つまたは複数は、第1奥行きピクセルの近傍の奥行き値など、第1奥行きマップからの奥行き値であり得る。多くの実施形態において、元の第1奥行き値、すなわち、受信機201によって受信された第1奥行きマップ中の第1奥行きピクセルに対する奥行き値が、候補奥行き値の1つとして含まれてもよい。 In many embodiments, one or more candidate depth values can be determined in other ways or from other sources. In many embodiments, one or more of the candidate depth values can be depth values from the first depth map, such as depth values in the vicinity of the first depth pixel. In many embodiments, the original first depth value, i.e., the depth value for the first depth pixel in the first depth map received by receiver 201, may be included as one of the candidate depth values.

したがって、更新器203は、上述のように決定された少なくとも1つの候補奥行き値を含む候補奥行き値の重み付け結合を実行することができる。任意の他の候補奥行き値の数、特性、起源などは、個々の実施形態の選好および要件、ならびに所望される実際の奥行き更新動作に依存する。 Thus, the updater 203 may perform a weighted combination of candidate depth values that includes at least one candidate depth value determined as described above. The number, properties, origin, and so on of any other candidate depth values will depend on the preferences and requirements of the individual embodiment and on the exact depth update operation that is desired.

例えば、いくつかの実施形態では、重み付け結合は、ステップ501で決定された第1候補奥行き値および元の奥行き値のみを含むことができる。そのような場合、第1候補奥行き値に対する単一の重みのみが、例えば、決定されてもよく、元の奥行き値に対する重みは一定であってもよい。 For example, in some embodiments the weighted combination may include only the first candidate depth value determined in step 501 and the original depth value. In such a case, for example, only a single weight, for the first candidate depth value, may be determined, and the weight for the original depth value may be constant.

別の例として、幾つかの実施例では、重み付け結合は、他の奥行きマップ及び/又は位置から決定された値、元の奥行き値、第1奥行きマップ中の近傍における奥行き値、又は、実際には、例えば異なる奥行き推定アルゴリズムを使用する奥行きマップのような代替の奥行きマップにおける奥行き値に基づくものも含む、多数の候補奥行き値の結合であり得る。そのような、より複雑な実施形態では、重みは例えば、各候補奥行き値について決定されてもよい。 As another example, in some embodiments the weighted combination may be a combination of a large number of candidate depth values, including values determined from other depth maps and/or positions, the original depth value, depth values in a neighborhood in the first depth map, or indeed values based on depth values in alternative depth maps, such as a depth map obtained using a different depth estimation algorithm. In such more complex embodiments, a weight may, for example, be determined for each candidate depth value.

例えば、非線形結合または（1つの候補奥行き値に1の重みが与えられ、他のすべての候補奥行き値に0の重みが与えられる）選択結合を含む、任意の適切な形の重み付け結合を使用することができることが理解されるであろう。しかし、多くの実施形態では、線形結合、具体的には加重平均を使用することができる。 It will be appreciated that any suitable form of weighted combination may be used, including, for example, a non-linear combination or a selection combination (in which one candidate depth value is given a weight of one and all other candidate depth values are given a weight of zero). However, in many embodiments a linear combination, and specifically a weighted average, may be used.
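A selection combination can be sketched as below. Choosing the candidate with the largest score as the one that receives weight one is an assumption made for this illustration; the description only states that one candidate receives a weight of one and all others a weight of zero. The function name is hypothetical.

```python
from typing import Sequence


def selection_combination(candidates: Sequence[float],
                          scores: Sequence[float]) -> float:
    """Return the single candidate depth value with the highest score.

    Equivalent to a weighted combination in which the selected candidate
    has weight 1 and every other candidate has weight 0.
    """
    if not candidates or len(candidates) != len(scores):
        raise ValueError("candidates and scores must be non-empty and equal in length")
    best = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best]
```

Unlike a weighted average, this form never produces a depth value that lies between candidates, which may matter near depth transitions.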

したがって、特定の例として、奥行きマップ/ビュー $k$ における画像座標 $(u,v)$ に対する更新された奥行き値 $\hat{z}_k(u,v)$ は、ステップ501について説明されたように少なくとも1つが生成される候補奥行き値 $z_i$ のセットの加重平均であってもよい。この場合、重み付け結合は、以下のように与えられるフィルタ関数に対応し得る。 Thus, as a specific example, the updated depth value $\hat{z}_k(u,v)$ for the image coordinates $(u,v)$ in depth map/view $k$ may be a weighted average of a set of candidate depth values $z_i$, at least one of which is generated as described for step 501. In this case, the weighted combination may correspond to a filter function given as follows.

$$\hat{z}_k(u,v) = \frac{\sum_i w_i \, z_i}{\sum_i w_i}$$

ここで、$\hat{z}_k(u,v)$ はビュー $k$ のピクセル位置 $(u,v)$ における更新された奥行き値であり、$z_i$ は $i$ 番目の入力候補奥行き値であり、$w_i$ は $i$ 番目の入力候補奥行き値の重みである。 Here, $\hat{z}_k(u,v)$ is the updated depth value at pixel position $(u,v)$ of view $k$, $z_i$ is the $i$-th input candidate depth value, and $w_i$ is the weight of the $i$-th input candidate depth value.
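The weighted average of candidate depth values described above can be sketched in a few lines. The function name is hypothetical, and the requirement that the weights sum to a positive value is an assumption added so that the normalization is well defined.

```python
from typing import Sequence


def weighted_average_depth(candidates: Sequence[float],
                           weights: Sequence[float]) -> float:
    """Return sum(w_i * z_i) / sum(w_i) over the candidate depth values z_i."""
    if not candidates or len(candidates) != len(weights):
        raise ValueError("candidates and weights must be non-empty and equal in length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * z for w, z in zip(weights, candidates)) / total
```

For the simple two-candidate case, the original depth value could be passed with a constant weight and the first candidate depth value with its determined weight, for example `weighted_average_depth([z_original, z_candidate], [1.0, w_first])`.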

この方法は、第1候補奥行き値の重み、すなわち第1重みを決定するための特定のアプローチを使用する。このアプローチは、図6のフローチャート及び図7の画像及び奥行きマップの例を参照して説明される。 This method uses a specific approach for determining the weight of the first candidate depth value, i.e. the first weight. This approach is illustrated with reference to the flowchart of FIG. 6 and the image and depth map examples of FIG.

図7は、3つの画像および3つの対応する奥行きマップが提供/考慮される例を示す。第1画像701は第1奥行きマップ703と共に提供される。同様に、第2画像705は第2奥行きマップ707と共に提供され、第3画像709は第3奥行きマップ711と共に提供される。以下の説明は、第2奥行きマップ707からの奥行き値に基づく、第1奥行きマップ703の第1奥行き値に対する第1重みの決定に焦点を当て、さらに第3画像709を考慮する。 Figure 7 shows an example where three images and three corresponding depth maps are provided / considered. The first image 701 is provided with the first depth map 703. Similarly, the second image 705 is provided with the second depth map 707 and the third image 709 is provided with the third depth map 711. The following description focuses on determining the first weight for the first depth value of the first depth map 703, based on the depth values from the second depth map 707, and also considers the third image 709.

したがって、（第1候補奥行き値に対する）第1重みの決定は、第2奥行きマップ707中の第2奥行きマップ位置における第2奥行きピクセルに対する1つ以上の第2奥行き値に基づいて、第1奥行きピクセル/第1奥行きマップ位置に対して決定される。具体的には、第1候補奥行き値は、図7の矢印713によって示されるように、第2奥行きマップ707中の対応する位置にある第2奥行き値として決定され得る。 Therefore, the determination of the first weight (for the first candidate depth value) is based on one or more second depth values for the second depth pixel at the second depth map position in the second depth map 707. Determined for pixel / first depth map position. Specifically, the first candidate depth value can be determined as the second depth value at the corresponding position in the second depth map 707, as indicated by arrow 713 in FIG.

The determination of the first weight starts in step 601, in which the updater determines the first image position in the first image 701 that corresponds to the first depth map position, as indicated by arrow 715. Typically, this may simply be the same position and image coordinates. The pixel in the first image 701 at this first image position is referred to as the first image pixel.

The updater 203 then proceeds to step 603, in which it determines a third image position in a third image 709 of the plurality of images, where the third image position corresponds to a projection of the first image position to the third image based on the first candidate depth value. The third image position may be determined by a direct projection from the image coordinates of the first image 701, as indicated by arrow 717.

The updater 203 thus proceeds to project the first image position to a third image position in the third image 709. The projection is based on the first candidate depth value. The projection of the first image position to the third image 709 is therefore based on a depth value that can be regarded as an estimate of the first depth value determined from the second depth map 707.

In some embodiments, the determination of the third image position may be based on projections of depth map positions. For example, the updater 203 may proceed to project the first depth map position (the position of the first depth pixel) to a third depth map position in the third depth map 711, as indicated by arrow 719. The projection is based on the first candidate depth value. The projection of the first depth map position to the third depth map 711 is therefore based on a depth value that can be regarded as an estimate of the first depth value determined from the second depth map 707.

The third image position may then be determined as the image position in the third image 709 that corresponds to the third depth map position, as indicated by arrow 721.

It will be appreciated that the two approaches are equivalent.

A projection from one depth map/image to a different depth map/image may be a determination of the depth map/image position in the different depth map/image that represents the same scene point as a depth map/image position in the one depth map/image. Since the depth maps/images represent different view/capture poses, the parallax effect results in a shift of the image position for a given point in the scene. The shift depends on the change in view pose and on the depth of the point in the scene. A projection from one image/depth map to another may accordingly also be referred to as an image/depth map position shift or determination.

As an example, projecting the image coordinates (u,v)_l of one view (l), together with its depth value z_l(u,v), to the corresponding image coordinates (u,v)_k of a neighboring view (k) may, e.g. for perspective cameras, be performed by the following steps:
1. The image coordinates (u,v)_l are unprojected, using z_l, into 3D space (x,y,z)_l using the intrinsic parameters (focal length, principal point) of camera (l).
2. The unprojected point in the coordinate system (x,y,z)_l of camera (l) is transformed to the coordinate system (x,y,z)_k of camera (k) using their relative extrinsic parameters (camera rotation matrix R and translation vector t).
3. Finally, the point (x,y,z)_k is projected onto the image plane of camera (k) (using the intrinsic parameters of camera k), yielding the image coordinates (u,v)_k.
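The three steps above can be sketched as follows for an idealized pinhole camera. This is a sketch under stated assumptions: a single focal length in pixels, no lens distortion, depth measured along the optical axis, and R and t mapping camera (l) coordinates to camera (k) coordinates. All function names and sample values are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]
Intrinsics = Tuple[float, float, float]  # (focal length in pixels, cx, cy); assumed layout


def unproject(u: float, v: float, z: float, intr: Intrinsics) -> Vec3:
    """Step 1: image coordinates plus depth -> 3D point in the camera frame."""
    f, cx, cy = intr
    return ((u - cx) * z / f, (v - cy) * z / f, z)


def transform(p: Vec3, R: Sequence[Sequence[float]], t: Sequence[float]) -> Vec3:
    """Step 2: camera (l) frame -> camera (k) frame, p_k = R p_l + t."""
    x, y, z = (sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))
    return (x, y, z)


def project(p: Vec3, intr: Intrinsics) -> Tuple[float, float]:
    """Step 3: 3D point in the camera frame -> image coordinates."""
    f, cx, cy = intr
    x, y, z = p
    return (f * x / z + cx, f * y / z + cy)


def reproject(u, v, z, intr_l, intr_k, R, t):
    """Chain the three steps: (u, v)_l with depth z -> (u, v)_k."""
    return project(transform(unproject(u, v, z, intr_l), R, t), intr_k)


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
intr = (1000.0, 320.0, 240.0)
# Camera (k) sits 0.1 units to the right of camera (l), so points shift left.
near = reproject(320.0, 240.0, 2.0, intr, intr, identity, [-0.1, 0.0, 0.0])
far = reproject(320.0, 240.0, 4.0, intr, intr, identity, [-0.1, 0.0, 0.0])
```

In this example the nearer point shifts by 50 pixels and the farther point by 25 pixels, which illustrates why the projected position depends on the depth value used.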

For other camera projection types, e.g. equirectangular projection (ERP), similar mechanisms may be used.

In the described approach, the projection based on the first candidate depth value may be considered to correspond to a determination of the third depth map/image position for a scene point at the first depth map/image position having the depth of the first candidate depth value (and for the change in view pose between the first view pose and the third view pose).

Different depths result in different shifts, and in this case the shift in image and depth map position between the first view pose for the first depth map 703 and first image 701 and the third view pose for the third depth map 711 and third image 709 is based on at least one depth value in the second depth map 707.

In step 603, the updater 203 accordingly determines the position in the third depth map 711 and third image 709 that would reflect the same scene point as the first image pixel in the first image 701 if the first candidate depth value were indeed the correct value for the first depth value and the first image pixel. Any deviation of the first candidate depth value from the correct value may result in an incorrect position being determined in the third image 709. Note that here the scene point refers to a scene point on the ray associated with the pixel, but it need not be the most forward scene point for both view poses. For example, if the scene point seen from the first view pose is occluded by a (more) foreground object compared to when it is seen from the second view pose, the depth values of the depth maps and images may represent different scene points and may therefore have potentially very different values.

Step 603 is followed by step 605, in which a first match error indication is generated based on the content of the first image 701 and the third image 709 at the first image position and the third image position, respectively. Specifically, the image pixel value in the third image at the third image position is retrieved. In some embodiments, this image pixel value may be determined as the image pixel value in the third image 709 for which the third depth map position in the third depth map 711 provides a depth value. It will be appreciated that in many embodiments, namely where the same resolution is used for the third depth map 711 and the third image 709, directly determining the position in the third image 709 that corresponds to the first depth map position (arrow 719) is equivalent to determining the position in the third depth map 711 and retrieving the corresponding image pixel.

Similarly, the updater 203 proceeds to extract the pixel value in the first image 701 at the first image position. It then proceeds to determine a first match error indication that reflects the difference between these two image pixel values. It will be appreciated that any suitable difference measure may be used, such as a simple absolute difference or a root-sum-square difference applied to the pixel value components of, e.g., multiple color channels.

Thus, in step 605 the updater 203 determines a first match error indication that reflects the difference between the image pixel value in the third image at the third image position and the image pixel value in the first image at the first image position.
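As an illustration of the difference measures mentioned for step 605, the sketch below computes a match error between two pixel values given as per-channel tuples. The function names are hypothetical, and the per-channel summation used for the absolute difference is an assumption.

```python
import math
from typing import Sequence


def match_error_abs(pixel_a: Sequence[float], pixel_b: Sequence[float]) -> float:
    """Sum of absolute per-channel differences between two pixel values."""
    return sum(abs(a - b) for a, b in zip(pixel_a, pixel_b))


def match_error_rss(pixel_a: Sequence[float], pixel_b: Sequence[float]) -> float:
    """Root-sum-square difference over the color channels."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(pixel_a, pixel_b)))


first_pixel = (10.0, 20.0, 30.0)   # pixel value in the first image
third_pixel = (13.0, 24.0, 30.0)   # pixel value at the projected position
error_abs = match_error_abs(first_pixel, third_pixel)
error_rss = match_error_rss(first_pixel, third_pixel)
```

Both measures are zero for identical pixel values and grow as the two pixels diverge, which is the property the weight determination relies on.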

The updater 203 then proceeds to step 607, in which the first weight is determined in response to the first match error indication. It will be appreciated that the specific approach for determining the first weight from the first match error indication may depend on the individual embodiment. In many embodiments, complex considerations including e.g. other match error indications may be used, and further examples are provided later.

As a low-complexity example, in some embodiments the first weight may be determined as a monotonically decreasing function of the first match error indication, and in many embodiments without considering any other parameters.

For example, where the weighted combination includes only the first candidate depth value and the original depth value of the first depth pixel, the combination may apply a fixed weight to the original depth value and a first weight that increases the lower the first match error indication is (typically with weight normalization further included).
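A minimal sketch of this two-value case follows. The exponential mapping from match error to weight and the `scale` parameter are assumptions chosen only because they give a monotonically decreasing function; the text does not prescribe a specific function.

```python
import math


def weight_from_error(error: float, scale: float = 10.0) -> float:
    """Monotonically decreasing weight: 1 at zero error, approaching 0 for large errors."""
    return math.exp(-error / scale)


def update_depth(original: float, candidate: float, error: float,
                 fixed_weight: float = 1.0, scale: float = 10.0) -> float:
    """Combine the original depth (fixed weight) with one candidate (variable weight)."""
    w = weight_from_error(error, scale)
    # Normalize by the sum of the two weights.
    return (fixed_weight * original + w * candidate) / (fixed_weight + w)


good_match = update_depth(2.0, 4.0, error=0.0)      # candidate fully trusted
bad_match = update_depth(2.0, 4.0, error=1000.0)    # candidate effectively ignored
```

With a zero match error the candidate and the original contribute equally, whereas a large match error leaves the original depth value essentially unchanged.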

The first match error indication may be considered to reflect how well the first and third images match when representing a given scene point. If there is no occlusion difference between the first and third images and the first candidate depth value is correct, the image pixel values should be the same and the first match error indication should be zero. If the first candidate depth value deviates from the correct value, the image pixel in the third image may not directly correspond to the same scene point, and the first match error indication may accordingly increase. If there is a change in occlusion, the error is likely to be very high. The first match error indication may thus provide a good indication of how accurate and suitable the first candidate depth value is for the first depth pixel.

Different embodiments may use different approaches to determine the first candidate depth value from one or more depth values of the second depth map. Similarly, different approaches may be used to determine which candidate values are generated for the weighted combination. Specifically, a plurality of candidate values may be generated from depth values of the second depth map, and a weight may be computed individually for each of these in accordance with the approach described with respect to FIG. 6.

In many embodiments, the determination of which second depth values to use to derive the first candidate depth value depends on a projection between the first depth map and the second depth map such that corresponding positions in the two depth maps are determined. Specifically, in many embodiments, the first candidate depth value may be determined as the second depth value at the second depth map position considered to correspond to the first depth map position, i.e. the second depth value is selected as the depth value considered to represent the same scene point.

The corresponding first and second depth map positions may be determined by a projection from the first depth map to the second depth map, i.e. based on the original first depth value, or by a projection from the second depth map to the first depth map, i.e. based on the second depth value. In some embodiments, projections in both directions may be performed and, e.g., an average of these may be used.

Determining the first candidate depth value may thus comprise determining the second depth map position for the first depth map position by a projection between the first view pose of the first image and the second view pose of the second image, based on at least one of the second depth value and the first original depth value of the first depth map.

For example, for a given first pixel in the first depth map, the updater 203 may extract the depth value and use it to project the corresponding first depth map position to the corresponding second depth map position in the second depth map. It may then extract the second depth value at this position and use it as the first candidate depth value.

As another example, for a given second pixel in the second depth map, the updater 203 may extract the depth value and use it to project the corresponding second depth map position to the corresponding first depth map position in the first depth map. It may then extract the second depth value and use it as the first candidate depth value for the first depth pixel at the first depth map position.

In such embodiments, the depth value in the second depth map is used directly as the first candidate depth value. However, since the two depth map pixels represent the distance to the same scene point (in the absence of occlusion) but from different viewpoints, the depth values may differ. In many practical embodiments, this difference in distance to the same scene point from cameras/view poses at different positions is insignificant and can be ignored. Accordingly, in many embodiments it may be assumed that the cameras are perfectly aligned, look in the same direction, and have the same position. In that case, if an object is flat and parallel to the image sensor, the depth may indeed be the same in the two corresponding depth maps. Deviations from this scenario are often small enough to be negligible.

However, in some embodiments, determining the first candidate depth value from the second depth value may include a projection that modifies the depth value. This may be based on a more detailed geometric calculation that takes the projective geometry of both views into account.

In some embodiments, more than one second depth value may be used to generate the first candidate depth value. For example, a spatial interpolation may be performed between different depth values to compensate for projections that are not aligned with pixel centers.

As another example, in some embodiments the first candidate depth value may be determined as the result of a spatial filtering in which a kernel centered on the second depth map position is applied to the second depth map.

The following description focuses on embodiments in which each candidate depth value depends on only a single second depth value and is furthermore equal to that second depth value.

In many embodiments, the weighted combination may further include a plurality of candidate depth values determined from different second depth values.

Specifically, in many embodiments, the weighted combination may include candidate depth values from a region of the second depth map. The region may typically be determined based on the first depth map position. Specifically, the second depth map position may be determined by a projection (in either or both directions) as previously described, and the region may be determined as a region (e.g. with a predetermined outline) around this second depth map position.

The approach may accordingly provide a set of candidate depth values for the first depth pixel in the first depth map. For each of the candidate depth values, the updater 203 may perform the method of FIG. 6 to determine the weight for the weighted combination.

A particular advantage of the approach is that the selection of second depth values for the candidate depth values is not overly critical, since the subsequent weight determination weights good and bad candidates appropriately. Accordingly, in many embodiments, a relatively low-complexity approach may be used to select the candidates.

In many embodiments, the region may, e.g., simply be determined as a predetermined region around the position in the second depth map determined by a projection from the first depth map to the second depth map based on the original first depth value. Indeed, in many embodiments, the projection may even be replaced by simply selecting the region as a region around the same depth map position in the second depth map as the depth map position in the first depth map. The approach may thus simply select the candidate set of depth values by selecting the second depth values in a region around the position in the second depth map that is the same as the first pixel position in the first depth map.
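The low-complexity selection described here, i.e. taking the second depth values in a region around the same position as the first pixel, can be sketched as below. The square window, the `radius` parameter, and the nested-list depth map representation are assumptions for illustration.

```python
from typing import List, Sequence


def region_candidates(depth_map2: Sequence[Sequence[float]],
                      row: int, col: int, radius: int = 1) -> List[float]:
    """Collect second depth values in a square window around (row, col).

    The window is clipped at the depth map border.
    """
    rows, cols = len(depth_map2), len(depth_map2[0])
    return [depth_map2[r][c]
            for r in range(max(0, row - radius), min(rows, row + radius + 1))
            for c in range(max(0, col - radius), min(cols, col + radius + 1))]


second_depth_map = [[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0],
                    [7.0, 8.0, 9.0]]
centre = region_candidates(second_depth_map, 1, 1)   # full 3x3 window
corner = region_candidates(second_depth_map, 0, 0)   # clipped to 2x2
```

Each value returned would then be treated as one candidate depth value and weighted individually.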

Such an approach may reduce resource usage while still providing efficient operation in practice. It may be particularly suitable when the size of the region is relatively large compared to the position/parallax shifts that occur between the depth maps.

As previously mentioned, many different approaches may be used to determine the weights for the individual candidate depth values in the weighted combination.

In many embodiments, the first weight may further be determined in response to additional match error indications determined for images other than the third image. In many embodiments, the described approach may be used to generate match error indications for all images other than the first image. A combined match error indication may then be generated, e.g. as the average of these, and the first weight may be determined based on it.

Specifically, the first weight may depend on a match error metric that is a function of the separate match errors from the view being filtered to all other views (l≠k). One example of a metric for determining the weight for candidate z_i uses the minimum match error over the other views, $\min_{l \neq k} e_{kl}(z_i)$, where e_kl(z_i) is the match error between view k and view l given candidate z_i. The match error may, e.g., depend on the color difference for a single pixel, or it may be computed as a spatial average around the pixel position (u,v). Instead of the minimum match error over the views (l≠k), the mean, the median, etc. may be used. In many embodiments, the evaluation function is preferably robust to match error outliers caused by occlusion.
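The following sketch shows one way to combine the per-view match errors e_kl(z_i) into a single value using the minimum, mean, or median, and then map it to a weight. The exponential mapping and the parameter `lam` are assumptions chosen for illustration, and all names are hypothetical.

```python
import math
import statistics
from typing import Sequence


def combined_match_error(errors: Sequence[float], mode: str = "min") -> float:
    """Combine the match errors towards all other views (l != k)."""
    if not errors:
        raise ValueError("at least one match error is required")
    if mode == "min":
        return min(errors)
    if mode == "mean":
        return statistics.mean(errors)
    if mode == "median":
        return statistics.median(errors)
    raise ValueError(f"unknown mode: {mode}")


def candidate_weight(errors: Sequence[float], lam: float = 0.1, mode: str = "min") -> float:
    """Weight for one candidate; decreases as the combined match error grows."""
    return math.exp(-lam * combined_match_error(errors, mode))


# One view (error 50.0) is occluded; the minimum and the median are unaffected by it.
errors_for_candidate = [3.0, 50.0, 4.0]
w_min = candidate_weight(errors_for_candidate, mode="min")
w_mean = candidate_weight(errors_for_candidate, mode="mean")
```

In this example the single occluded view dominates the mean but not the minimum or the median, which illustrates the robustness consideration mentioned above.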

In many embodiments, a second match error indication may be determined for the second image, i.e. for the view from which the first candidate depth value was generated. The second match error indication may be determined using the same approach as described for the first match error indication, and it may be generated to reflect the difference between the image pixel value in the second image at the second depth map position and the image pixel value in the first image at the first depth map position.

The first weight may then be determined in response to both the first match error indication and the second match error indication (and possibly other match error indications or parameters).

In some embodiments, the weight determination may consider not only, e.g., the average match error indication but also the relative differences between the match error indications. For example, if the first match error indication is relatively low but the second match error indication is relatively high, this may be due to an occlusion occurring in the second image (but not in the third image) relative to the first image. The first weight may accordingly be reduced or even set to zero.

Other examples of weight considerations may use statistical measures such as the median match error or another quantile. Reasoning similar to the above applies here. For example, for a linear camera array of nine cameras all looking in the same direction, it may be assumed for the central camera that, around an object edge, either the four anchors to the left or the four anchors to the right will always look into a non-occluded area. In this case, a good total weight for a candidate may be a function of only the four lowest of the total of eight match errors.
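For the nine-camera example, a sketch of using only the four lowest of the eight match errors is given below. Averaging the selected errors is an assumption; the text only states that the weight may be a function of these four values.

```python
from typing import Sequence


def lowest_k_error(errors: Sequence[float], k: int = 4) -> float:
    """Mean of the k lowest match errors, ignoring the remaining (possibly occluded) views."""
    if not errors or k < 1:
        raise ValueError("need at least one error and k >= 1")
    lowest = sorted(errors)[:k]
    return sum(lowest) / len(lowest)


# Eight match errors for the central camera: the four views on one side
# see an occluded area (large errors), the other four do not.
errors = [1.0, 2.0, 3.0, 2.0, 40.0, 50.0, 60.0, 45.0]
robust_error = lowest_k_error(errors)
```

The resulting value can then be mapped to a weight with any monotonically decreasing function.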

In many embodiments, the weighted combination may include other depth values of the first depth map itself. Specifically, a set of depth pixels in the first depth map around the first depth position may be included in the weighted combination. For example, a predetermined spatial kernel may be applied to the first depth map, resulting in a low-pass filtering of the first depth map. The weighting between the spatially low-pass filtered first depth map values and the candidate depth values from other views may then be adapted, e.g. by applying a fixed weight to the low-pass filtered depth value and a variable first weight to the first candidate depth value.

In many embodiments, the determination of the weights, and specifically of the first weight, also depends on confidence values for the depth values.

Depth estimates and measurements are inherently noisy, and various errors and variations may occur. Many depth estimation and measurement algorithms may, in addition to the depth estimate, also generate a confidence value indicating how reliable the provided depth estimate is. For example, a disparity estimation may be based on detecting matching regions in different images, and the confidence value may be generated to reflect how similar the matching regions are.

The confidence values may be used in different ways. For example, in many embodiments, the first weight for the first candidate depth value may depend on a confidence value for the first candidate depth value, and specifically on the confidence value for the second depth value(s) used to generate it. The first weight may be a monotonically increasing function of the confidence value for the second depth value, so that the first weight increases with increasing confidence in the underlying depth value(s) used to generate the first candidate depth value. The weighted combination may accordingly be biased towards depth values that are considered reliable and accurate.

In some embodiments, the confidence values for a depth map may be used to select which depth values/pixels are updated and for which depth pixels the depth value is kept unchanged. Specifically, the updater 203 may be arranged to select only the depth values/pixels of the first depth map whose confidence value is below a threshold as ones to be updated.

Thus, rather than updating all pixels in the first depth map, the updater 203 specifically identifies depth values that are considered unreliable and updates only these values. In many embodiments, this may result in an improved overall depth map, since it can, e.g., prevent very accurate and reliable depth estimates from being replaced by more uncertain values generated from depth values from other viewpoints.
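A minimal sketch of this selective update follows. It assumes that the updated depth values have already been computed for every pixel and that depth maps and confidence maps are nested lists of equal size; the function name and the threshold value are hypothetical.

```python
from typing import List, Sequence


def update_low_confidence(depth: Sequence[Sequence[float]],
                          confidence: Sequence[Sequence[float]],
                          updated: Sequence[Sequence[float]],
                          threshold: float) -> List[List[float]]:
    """Replace only the depth values whose confidence is below the threshold."""
    return [[u if c < threshold else d
             for d, c, u in zip(depth_row, conf_row, upd_row)]
            for depth_row, conf_row, upd_row in zip(depth, confidence, updated)]


first_depth = [[2.0, 2.1], [2.0, 9.0]]
first_conf = [[0.9, 0.8], [0.9, 0.1]]          # only the last pixel is unreliable
combined = [[2.5, 2.5], [2.5, 2.2]]            # result of the weighted combination
result = update_low_confidence(first_depth, first_conf, combined, threshold=0.5)
```

In practice the weighted combination would only need to be evaluated for the selected pixels; computing it everywhere here just keeps the sketch short.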

In some embodiments, the set of depth values of the second depth map that are included in the weighted combination, whether by contributing to different candidate depth values or to the same candidate depth value, may depend on the confidence values for the depth values. Specifically, only depth values with a confidence value above a given threshold may be included, and all other depth values may be excluded from the processing.

For example, the updater 203 may first generate a modified second depth map by scanning the second depth map and removing all depth values whose confidence value is below the threshold. The previously described processing may then be performed using the modified second depth map, with all operations that require a second depth value being bypassed if no such second depth value is present in the modified second depth map. For example, no candidate depth value is generated for a second depth value if no such value is present.
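One way to sketch this pre-filtering is to mark removed depth values with `None` and skip them when candidates are collected. The use of `None` as the marker and the function names are assumptions for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def mask_low_confidence(depth_map: Sequence[Sequence[float]],
                        confidence: Sequence[Sequence[float]],
                        threshold: float) -> List[List[Optional[float]]]:
    """Modified depth map in which values with confidence below the threshold are removed."""
    return [[d if c >= threshold else None for d, c in zip(depth_row, conf_row)]
            for depth_row, conf_row in zip(depth_map, confidence)]


def candidates_at(masked: Sequence[Sequence[Optional[float]]],
                  positions: Sequence[Tuple[int, int]]) -> List[float]:
    """Collect candidates, bypassing positions where no depth value is present."""
    values = (masked[r][c] for r, c in positions)
    return [v for v in values if v is not None]


second_depth = [[2.0, 2.1], [7.5, 2.2]]
second_conf = [[0.9, 0.7], [0.2, 0.8]]
masked = mask_low_confidence(second_depth, second_conf, threshold=0.5)
candidates = candidates_at(masked, [(0, 0), (1, 0), (1, 1)])
```

Positions whose depth value was removed simply contribute no candidate, so no weight needs to be computed for them.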

In some embodiments, the updater 203 may be arranged to generate the confidence values for the depth values.

In some embodiments, the confidence value for a given depth value in a given depth map may be determined in response to the variation of the depth values in other depth maps at the positions in those depth maps that correspond to the given depth value.

The updater 203 may first project the depth map position of the given depth value for which the confidence value is being determined to the corresponding positions in a plurality of other depth maps, typically all of them.

Specifically, for a given depth value at image coordinates (u,v)_k in depth map k, a set L of other depth maps (typically for neighboring views) is determined. For each of these depth maps (l∈L), the corresponding image coordinates (u,v)_l are computed by reprojection.

The updater 203 may then consider the depth values in these other depth maps at these corresponding positions and determine a variation measure for them. Any suitable measure of variation may be used, such as a variance measure.

The updater 203 may then determine the confidence value for the given depth map position from this variation measure; specifically, an increasing degree of variation may indicate a decreasing confidence value. The confidence value may thus be a monotonically decreasing function of the variation measure.

Specifically, given the depth value z_k and the set of corresponding neighboring depth values z_l (l ∈ L) at (u, v)_l, a confidence metric can be calculated from the consistency of these depth values. For example, the variance of these depth values may be used as the confidence metric, in which case a low variance implies a high confidence.
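A variance-based confidence of this kind might be computed as in the sketch below. The text only requires that the confidence be a monotonically decreasing function of the variation measure; the particular mapping 1 / (1 + variance) and the function name `variance_confidence` are assumptions chosen for illustration.

```python
from statistics import pvariance
from typing import Sequence


def variance_confidence(z_k: float, neighbor_depths: Sequence[float]) -> float:
    """Confidence for depth value z_k given the depth values z_l found at
    the corresponding positions in the neighboring depth maps.

    The variation measure is the population variance of z_k together with
    the neighboring values. The mapping 1 / (1 + variance) is one assumed
    example of a monotonically decreasing function; it yields 1.0 when
    all depth values agree."""
    variation = pvariance([z_k, *neighbor_depths])
    return 1.0 / (1.0 + variation)


consistent = variance_confidence(2.0, [2.0, 2.0])    # all maps agree
inconsistent = variance_confidence(2.0, [2.0, 5.0])  # one map disagrees
```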

It is often desirable to make this determination more robust to outliers, which may arise because the corresponding image coordinates (u, v)_k may be occluded by an object in the scene or by the camera boundary. One specific way of achieving this is to select two neighboring views l₀ and l₁ on opposite sides of camera view k and to use the minimum of the two depth differences:

$$\min\left(\left|z_k - z_{l_0}\right|,\ \left|z_k - z_{l_1}\right|\right)$$
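The effect of taking the minimum over two neighboring views can be illustrated with a short sketch. It assumes that the depth difference is an absolute difference between z_k and the depth found in each neighboring view; the function name `min_depth_difference` and the sample values are hypothetical.

```python
def min_depth_difference(z_k: float, z_l0: float, z_l1: float) -> float:
    """Smaller of the absolute depth differences between view k and its
    two neighboring views l0 and l1 (absolute difference is assumed)."""
    return min(abs(z_k - z_l0), abs(z_k - z_l1))


# View l1 is occluded at the corresponding position and reports a very
# different depth; view l0 still agrees with view k.
difference = min_depth_difference(2.0, 2.1, 9.0)
```

Because a point occluded in the view on one side is often visible in the view on the other side, the minimum tends to ignore the occluded view, whereas a variance over both views would be dominated by it.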

In some embodiments, the confidence value for a given depth value in a given depth map may be determined by evaluating the error that results from projecting the corresponding depth map position to another depth map and then projecting it back, using the respective depth values of the two depth maps.

Thus, the updater 203 may first project the given depth map position to another depth map based on the given depth value. The depth value at this projected position is then retrieved, and the position in the other depth map is projected back to the original depth map based on this other depth value. This yields a test position which, if the two depth values used for the projections match perfectly (taking into account, for example, the camera and capture properties and geometry), is exactly the same as the original depth map position. Any noise or error, however, results in a difference between the two positions.

The updater 203 may accordingly determine the confidence value for the given depth map position in response to the distance between the given depth map position and the test position. The smaller the distance, the higher the confidence value, so the confidence value may be determined as a monotonically decreasing function of the distance. In many embodiments, a plurality of other depth maps, and thus of distances, may be taken into account.
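The forward and backward projection test can be sketched under a strongly simplified camera model. The example assumes two rectified cameras separated by a horizontal baseline, so that projection reduces to a horizontal shift by the disparity focal × baseline / depth; a real system would use the full camera and capture geometry. The names `project`, `round_trip_confidence` and `depth_l`, and the mapping 1 / (1 + distance), are assumptions made for the illustration.

```python
import math
from typing import Callable, Tuple


def project(u: float, v: float, z: float,
            focal: float, baseline: float) -> Tuple[float, float]:
    """Shift (u, v) horizontally by the disparity focal * baseline / z.
    Assumes rectified cameras that differ only by a horizontal baseline."""
    return (u + focal * baseline / z, v)


def round_trip_confidence(u: float, v: float, z_k: float,
                          depth_l: Callable[[int, int], float],
                          focal: float, baseline: float) -> float:
    """Project (u, v) from view k to view l using z_k, look up the depth
    z_l there, project back using z_l, and map the distance between the
    original position and the resulting test position to a confidence."""
    u_l, v_l = project(u, v, z_k, focal, baseline)
    z_l = depth_l(round(u_l), round(v_l))  # nearest-pixel lookup in view l
    u_t, v_t = project(u_l, v_l, z_l, focal, -baseline)
    distance = math.hypot(u_t - u, v_t - v)
    # Assumed monotonically decreasing mapping; 1.0 means a perfect match.
    return 1.0 / (1.0 + distance)


agree = round_trip_confidence(10.0, 5.0, 2.0, lambda u, v: 2.0, 10.0, 0.5)
disagree = round_trip_confidence(10.0, 5.0, 2.0, lambda u, v: 4.0, 10.0, 0.5)
```

When several other depth maps are taken into account, the distances obtained for each of them could be combined (for example averaged) before the mapping is applied.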

Thus, in some embodiments, the confidence value may be determined from the geometric consistency of motion vectors. Let d_kl denote the two-dimensional motion vector that takes pixel (u, v)_k, given its depth z_k, to neighboring view l. Each corresponding pixel position (u, v)_l in neighboring view l has its own depth z_l, which results in a vector d_lk back to view k. In the ideal case, with zero error, all of these vectors map back exactly to the original point (u, v)_k. In general, however, this is not the case, and certainly not for less reliable regions. A good measure of the lack of confidence is therefore the mean error in the back-projected position. This error metric may be formulated as:

$$\frac{1}{|L|}\sum_{l \in L}\left\| (u, v)_k - f\big((u, v)_l;\, z_l\big) \right\|$$

where f((u, v)_l; z_l) denotes the image coordinates back-projected from neighboring view l to view k using the depth value z_l. The norm ‖·‖ may be L1, L2, or any other norm. The confidence value may be determined as a monotonically decreasing function of this value.
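The mean back-projection error described here can be sketched as below. The back-projected coordinates f((u, v)_l; z_l) are taken as given inputs, since computing them depends on the camera geometry; the function name `mean_backprojection_error` and the `norm` parameter are assumptions made for the example.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_backprojection_error(p_k: Point,
                              back_projected: Sequence[Point],
                              norm: str = "l2") -> float:
    """Mean distance between the original position p_k = (u, v)_k and the
    positions back-projected into view k from each neighboring view l.
    `norm` selects the L1 or L2 norm of the position difference."""
    errors = []
    for u, v in back_projected:
        du, dv = u - p_k[0], v - p_k[1]
        if norm == "l1":
            errors.append(abs(du) + abs(dv))
        else:
            errors.append(math.hypot(du, dv))
    return sum(errors) / len(errors)


# One neighboring view maps back exactly; the other lands 3 pixels to the
# right and 4 pixels below the original position.
error_l2 = mean_backprojection_error((10.0, 10.0), [(10.0, 10.0), (13.0, 14.0)])
error_l1 = mean_backprojection_error((10.0, 10.0), [(10.0, 10.0), (13.0, 14.0)], "l1")
```

A confidence value could then be derived from the result with any monotonically decreasing function.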

It will be appreciated that the term "candidate" does not imply any limitation on the depth value, and that the term "candidate depth value" may refer to any depth value that is included in the weighted combination.

It will be appreciated that the above description, for clarity, has described embodiments of the invention with reference to different functional circuits, units and processors. However, it will be apparent that any suitable distribution of functionality between different functional circuits, units or processors may be used without departing from the invention. For example, functionality illustrated as being performed by separate processors or controllers may be performed by the same processor or controller. Hence, references to specific functional units or circuits are only to be seen as references to suitable means for providing the described functionality, rather than as indicative of a strict logical or physical structure or organization.

The invention can be implemented in any suitable form, including hardware, software, firmware or any combination of these. The invention may optionally be implemented at least partly as computer software running on one or more data processors and/or digital signal processors. The elements and components of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way. Indeed, the functionality may be implemented in a single unit, in a plurality of units, or as part of other functional units. As such, the invention may be implemented in a single unit or may be physically and functionally distributed between different units, circuits and processors.

Although the present invention has been described in connection with some embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the invention is limited only by the appended claims. Additionally, although a feature may appear to be described in connection with particular embodiments, one skilled in the art will recognize that various features of the described embodiments may be combined in accordance with the invention. In the claims, the term "comprising" does not exclude the presence of other elements or steps.

Furthermore, although individually listed, a plurality of means, elements, circuits or method steps may be implemented by, for example, a single circuit, unit or processor. Additionally, although individual features may be included in different claims, these may possibly be advantageously combined, and the inclusion in different claims does not imply that a combination of features is not feasible and/or advantageous. Also, the inclusion of a feature in one category of claims does not imply a limitation to this category, but rather indicates that the feature is equally applicable to other claim categories as appropriate. Furthermore, the order of features in the claims does not imply any specific order in which the features must be worked, and in particular the order of individual steps in a method claim does not imply that the steps must be performed in this order. Rather, the steps may be performed in any suitable order. In addition, singular references do not exclude a plurality. Thus, references to "a", "an", "first", "second" and so on do not preclude a plurality; the terms "first", "second", "third" and so on are used as labels, imply no limitation other than to clearly identify the corresponding feature, and should not be construed as limiting the scope of the claims in any way. Reference signs in the claims are provided merely as a clarifying example and shall not be construed as limiting the scope of the claims in any way.

Claims

A method of processing depth maps, the method comprising:
a step of receiving a plurality of images representing a scene from different view poses, and corresponding depth maps; and
a step of updating depth values of a first depth map of the corresponding depth maps based on depth values of at least a second depth map of the corresponding depth maps, the first depth map being for a first image and the second depth map being for a second image,
wherein the step of updating comprises:
a step of determining a first candidate depth value for a first depth pixel of the first depth map at a first depth map position in the first depth map, the first candidate depth value being determined in response to at least one second depth value of a second depth pixel of the second depth map at a second depth map position in the second depth map; and
a step of determining a first depth value for the first depth pixel by a weighted combination of a plurality of candidate depth values for the first depth map position, the weighted combination including the first candidate depth value weighted by a first weight,
and wherein the step of determining the first depth value comprises:
a step of determining a first image position in the first image for the first depth map position;
a step of determining a third image position in a third image of the plurality of images, the third image position corresponding to a projection of the first image position to the third image based on the first candidate depth value;
a step of determining a first match error index indicative of a difference between an image pixel value in the third image for the third image position and an image pixel value in the first image for the first image position; and
a step of determining the first weight in response to the first match error index.

The method of claim 1, wherein the step of determining the first candidate depth value comprises determining the second depth map position relative to the first depth map position by a projection between a first view pose of the first image and a second view pose of the second image, the projection being based on at least one of the second depth value and a first original depth value of the first depth map.

The method of claim 1 or 2, wherein the weighted combination includes candidate depth values determined from a region of the second depth map that is determined in response to the first depth map position.

The method of claim 3, wherein the region of the second depth map is determined as a region around the second depth map position, and the second depth map position is determined as the depth map position in the second depth map that is equal to the first depth map position in the first depth map.

The method of claim 3, wherein the region of the second depth map is determined as a region around a position in the second depth map that is determined by a projection from the first depth map position based on an original depth value in the first depth map at the first depth map position.

The method of any one of claims 1 to 5, further comprising a step of determining a second match error index indicative of a difference between an image pixel value in the second image for the second depth map position and the image pixel value in the first image for the first depth map position, wherein the step of determining the first weight is also performed in response to the second match error index.

The method of any one of claims 1 to 6, further comprising a step of determining an additional match error index indicative of a difference between an image pixel value in another image for a depth map position corresponding to the first depth map position and the image pixel value in the first image for the first depth map position, wherein the step of determining the first weight is also performed in response to the additional match error index.

The method of any one of claims 1 to 7, wherein the weighted combination includes depth values of the first depth map in a region around the first depth map position.

The method according to any one of claims 1 to 8, wherein the first weight depends on the confidence value of the first candidate depth value.

The method according to any one of claims 1 to 9, wherein only depth values of the first depth map whose confidence value is below a threshold are updated.

The method according to any one of claims 1 to 10, further comprising a step of selecting a set of depth values of the second depth map for inclusion in the weighted combination, subject to the requirement that each depth value in the set has a confidence value greater than or equal to a threshold.

The method according to any one of claims 1 to 11, further comprising:
a step of projecting a given depth map position for a given depth value in a given depth map to corresponding positions in a plurality of the corresponding depth maps;
a step of determining a variation measure for a set of depth values comprising the given depth value and the depth values at the corresponding positions in the plurality of the corresponding depth maps; and
a step of determining a confidence value for the given depth map position in response to the variation measure.

The method according to any one of claims 1 to 12, further comprising:
a step of projecting a given depth map position for a given depth value in a given depth map to a corresponding position in another depth map, the projection being based on the given depth value;
a step of projecting the corresponding position in the other depth map to a test position in the given depth map, the projection being based on the depth value at the corresponding position in the other depth map; and
a step of determining a confidence value for the given depth map position in response to the distance between the given depth map position and the test position.

A device for processing depth maps, the device comprising:
a receiver for receiving a plurality of images representing a scene from different view poses, and corresponding depth maps; and
an updater for performing a step of updating depth values of a first depth map of the corresponding depth maps based on depth values of at least a second depth map of the corresponding depth maps, the first depth map being for a first image and the second depth map being for a second image,
wherein the step of updating comprises:
a step of determining a first candidate depth value for a first depth pixel of the first depth map at a first depth map position in the first depth map, the first candidate depth value being determined in response to at least one second depth value of a second depth pixel of the second depth map at a second depth map position in the second depth map; and
a step of determining a first depth value for the first depth pixel by a weighted combination of a plurality of candidate depth values for the first depth map position, the weighted combination including the first candidate depth value weighted by a first weight,
and wherein the step of determining the first depth value comprises:
a step of determining a first image position in the first image for the first depth map position;
a step of determining a third image position in a third image of the plurality of images, the third image position corresponding to a projection of the first image position to the third image based on the first candidate depth value;
a step of determining a first match error index indicative of a difference between an image pixel value in the third image for the third image position and an image pixel value in the first image for the first image position; and
a step of determining the first weight in response to the first match error index.

A computer program which, when executed by a computer, causes the computer to perform the method according to any one of claims 1 to 13.