JP2013114682A

JP2013114682A - Method for generating virtual image

Info

Publication number: JP2013114682A
Application number: JP2012251455A
Authority: JP
Inventors: Dong Tian; Yongzhe Wang; Anthony Vetro
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2011-11-30
Filing date: 2012-11-15
Publication date: 2013-06-10
Anticipated expiration: 2032-11-15
Also published as: JP5840114B2

Abstract

PROBLEM TO BE SOLVED: To make it possible to handle errors in depth map images in depth image based rendering. SOLUTION: An image for a virtual view of a scene is generated from a set of texture images acquired of the scene and a corresponding set of depth images. A set of candidate depths associated with each pixel of a selected image is determined. For each candidate depth, a cost that estimates the synthesis quality of the virtual image is determined. The candidate depth with the minimum cost is selected to obtain the optimal depth for each pixel. The virtual image is then synthesized based on the optimal depth of each pixel and the texture images. A first depth enhancement is applied before view synthesis and a second depth enhancement during view synthesis, to correct errors or suppress noise arising from the estimation or acquisition of dense depth images and sparse depth features.

Description

The present invention relates generally to depth image based rendering (DIBR), and more particularly to a method for enhancing depth images using a trellis structure.

A 3D display presents images of different views of a 3D scene to each eye. In conventional stereo systems, images for the left and right views are acquired, encoded, stored or transmitted, and then decoded and displayed. More advanced systems synthesize virtual images with viewpoints that differ from the existing input views. This enables enhanced 3D features, such as adjusting the perceived depth of a 3D stereo display and generating multiple virtual images for novel virtual views of the scene to support multiview autostereoscopic displays.

Depth image based rendering (DIBR) is a method for synthesizing virtual images, and it typically requires depth images of the scene. Depth images can contain noise, which can cause artifacts in the rendered images. In addition, pixel-level depth images cannot always represent the depth discontinuities that typically occur at object boundaries, which is another source of artifacts in the rendered images.

As shown in FIG. 1, prior art view synthesis includes a warping step 110, in which pixels corresponding to a virtual position are warped to a warped image from reference input images 101 and 102, that is, the texture images and depth images of the reference views, based on the scene geometry. In a texture image, each pixel (sample) has a 2D location and an intensity; the intensity can be a color when three (RGB) channels are used. In a depth image, each pixel at a 2D location is the depth from the camera to the nearest point in the scene.

During blending 120, the warped images for each input viewpoint are combined into a single image. Hole filling 130 fills any remaining holes in the blended image to produce the synthesized virtual image 103. Blending is performed only when the synthesized virtual image is generated from more than one input viewpoint.

The warping step can include forward warping and backward warping. In forward warping, pixels in the reference image are mapped to the virtual image via 3D projection. In backward warping, pixels in the reference image are not mapped directly to the virtual image. Instead, the depths are mapped to the virtual image, and the warped depth image is then used to find, for each pixel location in the virtual image, the corresponding pixel in the reference image.
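The two stages of backward warping can be illustrated with a minimal Python sketch for a single image row. It is not taken from the patent: the function name `backward_warp_row`, the use of integer horizontal disparities in place of depths, the `shift_sign` parameter, and the rule that the larger disparity (nearer surface) wins a conflict are all assumptions made for illustration.

```python
def backward_warp_row(texture, disparity, shift_sign=1):
    """Backward-warp one image row to a virtual view.

    texture:   reference row of pixel values
    disparity: integer horizontal shift per reference pixel (assumed to be
               already derived from depth and camera geometry)
    Returns (virtual_texture, virtual_disparity); None marks a hole.
    """
    width = len(texture)
    # Stage 1: map the depths (here, disparities) into the virtual view.
    virt_disp = [None] * width
    for x, d in enumerate(disparity):
        xv = x + shift_sign * d
        if 0 <= xv < width:
            # Assumed conflict rule: keep the larger disparity (nearer surface).
            if virt_disp[xv] is None or d > virt_disp[xv]:
                virt_disp[xv] = d
    # Stage 2: for each virtual pixel, fetch the corresponding reference pixel.
    virt_tex = [None] * width
    for xv, d in enumerate(virt_disp):
        if d is not None:
            virt_tex[xv] = texture[xv - shift_sign * d]
    return virt_tex, virt_disp
```

Virtual pixels that receive no mapped disparity stay `None`; these correspond to the holes caused by disocclusion.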

Most pixels in the virtual image are mapped after the warping process. However, some pixels have no corresponding mapped depth. This is caused by disocclusion from one viewpoint to another. Pixels without a mapped depth are known as holes in the virtual image.

When there are multiple input reference images, blending is used to merge the warping results into a single image. Some holes can be filled in a complementary way during this step; that is, a hole arising from the left reference image can take a value mapped from the right reference image. Blending can also resolve mapping conflicts, which arise when different reference images yield different mapped values. For example, a weighted average can be applied, or one of the mapped values can be selected based on the proximity of the virtual viewpoint location to the reference images.
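The blending behavior described above can be sketched as follows for two warped rows. This is only an illustrative sketch: the function name `blend_rows`, the use of `None` to mark holes, and the parameter `alpha` (an assumed normalized position of the virtual viewpoint between the left and right reference views) do not appear in the source.

```python
def blend_rows(left, right, alpha):
    """Blend two warped rows into one; None marks a hole.

    alpha in [0, 1] is the assumed normalized position of the virtual
    viewpoint between the left (0.0) and right (1.0) reference views.
    """
    blended = []
    for lv, rv in zip(left, right):
        if lv is None and rv is None:
            blended.append(None)   # still a hole; left for hole filling
        elif lv is None:
            blended.append(rv)     # complementary fill from the right view
        elif rv is None:
            blended.append(lv)     # complementary fill from the left view
        else:
            # Mapping conflict: weight each view by its proximity.
            blended.append((1.0 - alpha) * lv + alpha * rv)
    return blended
```

With `alpha = 0.25` the virtual view is closer to the left reference, so left values receive a weight of 0.75 wherever both views provide a value.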

Some holes remain after the blending process, so a final hole filling step is needed. For example, inpainting can be used to propagate surrounding pixel values into the remaining holes. In one implementation, background pixels are propagated into small holes.

Prior art methods cannot handle errors in the depth map images. There is therefore a need for more accurate view synthesis that improves the quality of the synthesized image, so that it is free of boundary artifacts and geometrically consistent with the image features present in the input images.

View synthesis is an essential function for several 3D video applications, including free-viewpoint navigation and image generation for autostereoscopic displays. Depth image based rendering (DIBR) methods are typically applied for this purpose.

However, the quality of the rendered images is very sensitive to the quality of the depth images, which are usually estimated by an error-prone process. Furthermore, a per-pixel depth image is not an ideal representation of a 3D scene, especially along depth boundaries. This representation can lead to unnatural synthesis results for scenes with occlusion regions.

Embodiments of the invention provide a trellis-based view synthesis method that overcomes the above limitations of depth images and reduces artifacts in the rendered images. With this method, for each pixel that needs to be warped, a set of candidate depths is identified based on the estimated depth of that pixel and the depths of its neighbors. The cost of each candidate depth is quantified based on an estimate of the synthesis quality. The candidate depth with the best expected quality is then selected.

The method applies a first depth enhancement before view synthesis and a second depth enhancement during view synthesis, to correct errors or suppress noise arising from the estimation or acquisition of the dense depth images and sparse depth features.

FIG. 1 is a block diagram of a prior art view synthesis method.
FIG. 2 is a schematic of a trellis for view synthesis constructed according to embodiments of the invention.
FIG. 3 is a schematic of neighboring pixels used to predict the depth of the next pixel according to embodiments of the invention.
FIG. 4 is another schematic of neighboring pixels used to predict the depth of the next pixel according to embodiments of the invention.
FIG. 5 is another schematic of neighboring pixels used to predict the depth of the next pixel according to embodiments of the invention.
FIG. 6 is a schematic of increasing and decreasing depth boundaries that are assigned different cost functions according to embodiments of the invention.
FIG. 7 is a flowchart of a method for trellis-based view synthesis according to embodiments of the invention.
FIG. 8 is a flowchart of a non-iterative method for trellis-based view synthesis according to embodiments of the invention.
FIG. 9 is a flowchart of an iterative method for trellis-based view synthesis according to embodiments of the invention.
FIG. 10 is a block diagram of a system that includes dense depth estimation, sparse depth estimation, and trellis-based view synthesis according to embodiments of the invention.
FIG. 11 is a block diagram of trellis-based view synthesis based on dense depth images and sparse depth features according to embodiments of the invention.
FIG. 12 is a block diagram of trellis-based view synthesis using a depth enhancement method according to embodiments of the invention.
FIG. 13 is a diagram of the pixels used in the color difference cost calculation.

Depth images can have errors caused by the estimation or acquisition process. In addition, a per-pixel depth image representation is not always accurate at depth discontinuities.

Embodiments of the invention therefore provide a trellis-based view synthesis method that overcomes limitations in the representation and estimation of depth images. The depth images can be acquired by a range camera or estimated from stereo disparity correspondences between left and right texture images. The method is applied during the warping process of depth image based rendering (DIBR).

FIG. 2 shows an example of a trellis 201 constructed for view synthesis according to embodiments of the invention. The trellis 201 is constructed for a predetermined number of pixels. In one embodiment, one line of image pixels is arranged in the trellis, and the warping process is performed line by line. That is, each column of the trellis represents one image pixel with its different depths A to D. The nodes in each column of the trellis represent the candidate depth mappings for that pixel in the virtual image.

In a first step, a set of candidate depths (A, B, C, D) 202 is identified for each pixel. The set includes the estimated depth from the input depth image and several other candidate depths based on neighboring depths. The number of candidate depths corresponds to the number of rows in the trellis. In FIG. 2, each pixel has four depths A to D, corresponding to the four rows of the trellis.

In a second step, a cost function is used to estimate the synthesis quality. The synthesis quality is the criterion for selecting the optimal candidate depth.

Determining a set of candidate depths
In the first step, a set of candidate depths is identified that includes the estimated depth from the input depth image. In addition to this value, several other candidate depths are identified from neighboring depths. These candidates can be used when the estimated depth from the input depth image is incorrect, that is, when that depth would cause artifacts or inconsistencies with the input images. Several methods for determining the candidate depths are described below.

One way to determine the set of candidate depths is to apply predetermined increments and/or decrements to the estimated depth from the input depth image. For example, if the estimated depth is 50, the candidate set can include {49, 50, 51}. Step sizes other than 1 can also be considered. The number of depths can vary, and the set need not be symmetric about the estimated depth; for example, the set can be {46, 48, 50, 52, 54} or {48, 49, 50, 52, 54}. The candidate depths can also be obtained from a lookup table, in which the candidate depths may vary with the estimated depth.
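The increment/decrement scheme can be written as a short Python sketch. The function name `candidates_from_offsets` and the clamping range `lo`/`hi` (an assumed 8-bit depth map) are illustrative choices, not part of the source.

```python
def candidates_from_offsets(estimated_depth, offsets=(-1, 0, 1), lo=0, hi=255):
    """Return candidate depths: the estimate plus predetermined offsets.

    lo and hi are an assumed valid depth range (8-bit depth map); values
    are clamped to it and duplicates created by clamping are dropped.
    """
    candidates = []
    for off in offsets:
        d = min(max(estimated_depth + off, lo), hi)
        if d not in candidates:
            candidates.append(d)
    return candidates
```

Passing `offsets=(-2, -1, 0, 2, 4)` reproduces the asymmetric set from the text for an estimated depth of 50.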

A second way to determine the set of candidate depths uses a predicted value based on the depths of neighboring pixels. For example, the mean or median of the neighboring depths can be used. A predetermined window size can be used to fix the number of neighboring pixels considered in the prediction.

A preferred method includes previous pixels from the same line in the window. In FIG. 3, the four pixels 301 to the left in the same line are in the window. In FIG. 4, the four pixels 401 in the same column from the lines above are in the window. In FIG. 5, a 4×4 window of pixels 501 is specified. In another embodiment, the pixels in the window can form an arbitrary shape. Increasing the number of candidate depths increases the computational complexity, because every candidate is checked and compared.

In FIG. 2, the number of candidate depths is set to four for each pixel. In one example, depth A (the first row from the bottom) is the estimated depth from the input depth image. Depths B and C (the middle rows 2 and 3) are depth A increased and decreased by 1, respectively. Depth D (the top row) is the depth predicted as the median depth of neighboring pixels, as shown in FIG. 3.
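The four-candidate example of FIG. 2 can be sketched in Python for one row of a depth image. The function name `trellis_column`, the use of `statistics.median_low` (so that the prediction is always one of the observed depths), and the fallback to depth A when no left neighbor exists are assumptions made for this sketch.

```python
from statistics import median_low

def trellis_column(depth_row, x, window=4):
    """Candidate depths A-D for pixel x of one depth-image row."""
    a = depth_row[x]                       # depth A: the estimated depth
    b = a + 1                              # depth B: A increased by 1
    c = a - 1                              # depth C: A decreased by 1
    left = depth_row[max(0, x - window):x]
    # Depth D: median of up to `window` left neighbors (FIG. 3 layout);
    # assumed fallback to A for the first pixel of the line.
    d = median_low(left) if left else a
    return {"A": a, "B": b, "C": c, "D": d}
```

At a depth edge, depth D carries the neighboring surface's depth into the candidate set, which gives the cost function a way to reject a noisy estimate.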

View synthesis using dynamic programming
After the sets of candidate depths are determined, each node in the trellis is assigned a metric according to a cost function that estimates the synthesis quality. The view synthesis problem is then solved by finding the optimal set of depths across the trellis. For example, dynamic programming can be used to solve the optimization problem.

To estimate the synthesis quality, an evaluation function is defined as the cost function. The cost function can depend on whether the warping process is forward or backward warping. Without loss of generality, the cost function is defined here assuming backward warping, as in the preferred embodiment of the invention. The definition can also be applied to forward warping.

In one implementation, the cost function evaluates the mean squared error (MSE) between two square blocks of pixels. These blocks lie to the upper left of the pixel locations. Let (x, y) denote the current pixel location, and let (x', y') denote the position obtained by warping with the candidate depth.

The first block spans (x-s, y-s) to (x, y) in the synthesized virtual image, and the second block spans (x'-s, y'-s) to (x', y') in the reference image, where s is the block size. If part of a block extends beyond the image area, cropping is applied.
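A minimal Python sketch of this block cost is given below. It rests on several assumptions that the source does not spell out: images are plain 2D lists of scalar intensities, both block corners are inclusive (so a block holds (s+1)×(s+1) pixels), the virtual image already has values in the block, and cropping means skipping offsets that fall outside either image. The name `block_mse` is hypothetical.

```python
def block_mse(virtual, reference, x, y, xr, yr, s):
    """MSE between the block (x-s, y-s)..(x, y) of `virtual` and the block
    (xr-s, yr-s)..(xr, yr) of `reference` (2D lists indexed [row][column]).

    Offsets that fall outside either image are skipped (assumed cropping).
    Returns None if no valid pixel pair remains.
    """
    total, count = 0.0, 0
    for dy in range(-s, 1):
        for dx in range(-s, 1):
            vx, vy = x + dx, y + dy
            rx, ry = xr + dx, yr + dy
            inside_v = 0 <= vy < len(virtual) and 0 <= vx < len(virtual[0])
            inside_r = 0 <= ry < len(reference) and 0 <= rx < len(reference[0])
            if inside_v and inside_r:
                diff = virtual[vy][vx] - reference[ry][rx]
                total += diff * diff
                count += 1
    return total / count if count else None
```

Here `(xr, yr)` plays the role of (x', y'), the position obtained by warping with the candidate depth, so the function is called once per candidate.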

Energy functions other than the MSE can also be used as the cost function. For example, the mean absolute error is a valid cost function for estimating the synthesis quality. Image features or a structural similarity measure can also be extracted from the blocks, and a matching process can be used to decide whether the blocks are geometrically consistent.

Because any artifact in a foreground object is more easily perceived by the human eye, foreground objects need to be synthesized in a consistent manner. For this reason, the method does not always use the upper-left block to determine the cost metric.

As shown in FIG. 6, pixels are classified into three types of areas: flat areas 601, depth-decreasing areas 602, and depth-increasing areas 603. For pixels at a depth-decreasing boundary (the right boundary in FIG. 6) or in a flat area, the upper-left block is used. For pixels at a depth-increasing boundary (the left boundary in FIG. 6), the upper-right block is used.
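The block selection rule can be sketched as follows. The source does not state how a pixel is assigned to an area, so the classification used here (comparing the depth step to the right-hand neighbor against a `threshold`) is purely a hypothetical rule, as are the function names.

```python
def classify_pixel(depth_row, x, threshold=5):
    """Label pixel x as 'flat', 'decreasing', or 'increasing'.

    Hypothetical rule: compare the depth step to the right-hand neighbor
    against an assumed threshold.
    """
    if x + 1 >= len(depth_row):
        return "flat"
    step = depth_row[x + 1] - depth_row[x]
    if step > threshold:
        return "increasing"
    if step < -threshold:
        return "decreasing"
    return "flat"

def cost_block_side(area):
    """Upper-right block at depth-increasing boundaries, otherwise upper-left."""
    return "upper-right" if area == "increasing" else "upper-left"
```

Only the mapping in `cost_block_side` follows the text directly; the classifier would be replaced by whatever area detection an implementation actually uses.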

In some applications, a confidence map can be used as input to the synthesis process in addition to the estimated depth image. When the depth estimator indicates high confidence, the cost function for the depth from the depth image can be weighted by a factor.

System embodiments
Three embodiments of trellis-based image synthesis, shown in FIGS. 7 to 9, are described below. The embodiments are ordered by increasing complexity. In these figures, "samples" are pixels in the various images.

In the first embodiment, shown in FIG. 7, a local optimization is performed with limited complexity. In this embodiment, the candidate depth selection does not depend on the optimal depth candidates selected for previous pixels. The candidate depth assignment and evaluation can therefore be performed for the pixels in parallel. A step-by-step description of this implementation follows.

The steps shown in the various figures can be performed in a processor connected to memory and input/output interfaces as known in the art. The virtual image can be rendered and output to a display device. Alternatively, the steps can be implemented in a system using means comprising discrete electronic components in a video encoder or decoder (codec). More specifically, in the context of a video encoding/decoding system (codec), the method for generating virtual images described in this invention can also be used to predict images of other views. See, for example, U.S. Patent No. 7,728,877, "Method and system for synthesizing multiview videos," incorporated herein by reference.

Step 701: Identify the candidate depths for all pixels in the trellis. In this step, the following candidates are determined.
a. Depth A: Select the depth signaled in the depth image for the current pixel. If the pixel is not the first pixel in its line, two further depth candidates are selected as follows.
b. Depth B: From the set of depths signaled in the depth image for the previous pixels in the same line, select the depth that differs most from depth A. The previous pixels are those shown in FIG. 3. Four previous pixels are preferred.
c. Depth C: Unlike depth B, which is selected from the same line, depth C is selected from among the depths in the same column of the lines above, as shown in FIG. 4, as the depth that differs most from depth A.
d. Depth D: There is no such candidate depth in this embodiment.

Step 702: Evaluate the cost of each candidate depth of each pixel.

Step 703: For each pixel, compare the costs of all candidate depths and determine the candidate depth with the minimum cost. Select the corresponding depth for each pixel.
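Steps 701 to 703 can be sketched in Python as follows. The cost function is passed in as a callable `cost(x, depth)` because the sketch does not fix a particular metric; that interface, the function names, and the handling of the first line (no depth C when no line exists above) are assumptions.

```python
def candidates_701(depth_rows, y, x, window=4):
    """Step 701: candidate depths A, B, C for pixel (x, y)."""
    a = depth_rows[y][x]                                   # depth A
    cands = [a]
    if x > 0:
        left = depth_rows[y][max(0, x - window):x]
        cands.append(max(left, key=lambda d: abs(d - a)))  # depth B
        above = [depth_rows[r][x] for r in range(max(0, y - window), y)]
        if above:                                          # assumed: skip C on line 0
            cands.append(max(above, key=lambda d: abs(d - a)))  # depth C
    return cands

def synthesize_local(depth_rows, y, cost):
    """Steps 702-703: choose the minimum-cost candidate for each pixel."""
    selected = []
    for x in range(len(depth_rows[y])):
        cands = candidates_701(depth_rows, y, x)
        costs = [cost(x, d) for d in cands]                # step 702
        selected.append(cands[costs.index(min(costs))])    # step 703
    return selected
```

Because every candidate is drawn from the signaled depth image only, the loop body has no dependency between pixels and could be run in parallel, which matches the remark made for this embodiment.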

FIG. 8 shows a second embodiment, which is also a local optimization with limited complexity. In this implementation, the candidate depth assignment in a column of the trellis depends on the optimal depth selected for the immediately preceding pixel, or column, in the trellis. A step-by-step description of this implementation follows.

Step 801: Initialize the index i.

Step 802: Identify the candidate depths for pixel i. This step includes three depth candidates selected in a manner similar to the embodiment shown in FIG. 7. However, when deriving depths B and C, the optimal depths of the previous pixels are used, which can differ from the depths signaled in the depth image.

Step 803: Evaluate the cost of each depth candidate for pixel i.

Step 804: Compare the costs of all depth candidates and determine the minimum cost for pixel i.

Step 805: If unprocessed pixels remain in the trellis, increment i by 1 (806) and repeat.
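The sequential variant can be sketched in Python for one line. As before, the `cost(i, depth)` callable, the function name, and the argument `above_selected` (the already selected depth rows above the current line, possibly empty) are assumptions made for illustration.

```python
def synthesize_sequential(depth_row, above_selected, cost, window=4):
    """Steps 801-806 for one line; returns the selected depth per pixel.

    above_selected: list of previously selected depth rows above this line.
    """
    selected = []
    i = 0                                                  # step 801
    while i < len(depth_row):                              # step 805
        a = depth_row[i]                                   # step 802: depth A
        cands = [a]
        if i > 0:
            # Depths B and C use the optimal depths chosen so far,
            # not the depths signaled in the depth image.
            left = selected[max(0, i - window):i]
            cands.append(max(left, key=lambda d: abs(d - a)))
            above = [row[i] for row in above_selected[-window:]]
            if above:
                cands.append(max(above, key=lambda d: abs(d - a)))
        costs = [cost(i, d) for d in cands]                # step 803
        selected.append(cands[costs.index(min(costs))])    # step 804
        i += 1                                             # step 806
    return selected
```

The dependency on `selected` is what prevents the per-pixel parallelism available in the first embodiment.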

In the first two embodiments, the optimal depth candidate is selected column by column in the trellis by evaluating a local cost function. In the third embodiment, an optimal path across the trellis is determined, where a path is a combination of depth candidates from the columns. The path cost is defined as the sum of the node costs along the path.

A node can have different costs in different paths. This embodiment is shown in FIG. 9. The procedure has two loops, iterating over i and p. The outer loop runs over all possible paths, while the inner loop runs over all nodes in one possible path.

For each potential path, the candidate depths of the nodes are identified (901) and evaluated (902) sequentially along the path, and it is determined whether more pixels remain in the path (903). The depth candidates are assigned as follows.

If the next node is in row "depth A," the node is set to the depth signaled in the depth image. If the node is in row "depth B," the depth is selected as the median of the given depths of the previous pixels in the same line, where the given depths of the previous pixels are those specified by the current path. If the node is in row "depth C," the depth is selected as the median of the depths in the same column of the lines above in the image.

If different paths pass through the same node, different depths can be assigned to depth B for that node. Depths A and C remain the same across paths.

After all nodes in a path have been evaluated, the path cost is determined as the sum of the node costs (904). When no more paths remain (905), the path with the minimum cost is used for the final synthesis result (906).
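The two-loop procedure of FIG. 9 can be sketched as an exhaustive search in Python. The sketch enumerates every path with `itertools.product`, so its cost grows as 3^n and it is only practical for short rows; it is not a dynamic programming implementation. The function name, the `node_cost(i, depth)` callable, the use of `median_low`, and the fallback to depth A when no previous pixel or upper line exists are assumptions.

```python
from itertools import product
from statistics import median_low

def best_path(depth_row, above_rows, node_cost, window=4):
    """Search all trellis paths; return (depths, cost) of the cheapest one."""
    n = len(depth_row)
    best_depths, best_cost = None, float("inf")
    for path in product("ABC", repeat=n):           # outer loop: paths p
        depths, total = [], 0.0
        for i, row in enumerate(path):              # inner loop: nodes i
            prev = depths[max(0, i - window):i]     # depths given by this path
            above = [r[i] for r in above_rows[-window:]]
            if row == "B" and prev:
                d = median_low(prev)                # path-dependent depth B
            elif row == "C" and above:
                d = median_low(above)               # depth C from lines above
            else:
                d = depth_row[i]                    # depth A, or assumed fallback
            depths.append(d)
            total += node_cost(i, d)                # 902; summed for 904
        if total < best_cost:                       # 905-906: keep the minimum
            best_depths, best_cost = depths, total
    return best_depths, best_cost
```

Because depth B is computed from `depths`, the same trellis node can take different depths, and therefore different costs, in different paths, as the text notes.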

View synthesis using sparse depth
U.S. Patent Application No. 13/026,750, a related application, uses depth images as input to the view synthesis process, where the estimated depth is considered as one of several candidate depths in the trellis-based view synthesis process. There, each pixel in the input image is associated with a corresponding depth, which forms a depth image. Such depth images are called dense depth images.

In contrast, sparse depth features refer to a collection of depths associated with a small subset of the pixels in the input texture images. Several known techniques can be used to determine sparse depth features, including the well-known Kanade-Lucas-Tomasi (KLT) feature tracker. The KLT feature tracker first detects corner points or salient features in one image, e.g., the left view, and then finds the corresponding features in another image, e.g., the right view.

As shown in FIG. 10, dense depth estimation 1010 is performed on the input stereo video images (video) 1001 to produce dense depth images 1011 corresponding to the left and right views of the stereo pair. Similarly, sparse depth estimation 1020 is performed on the input stereo video to produce a set of sparse depth features 1021, based on correspondences between the left and right views of the stereo pair.

Trellis-based view synthesis 1030 is then performed, as described above with reference to FIGS. 7 to 9, using the dense depth images, the sparse depth features, and the input stereo video, to produce the virtual image 1002.

For convenience of description, the sparse depth features are said to form a so-called sparse depth image.

As shown in FIG. 11, the dense depth images 1011 undergo dense depth warping 1110, which produces warped dense depth images corresponding to the position of the virtual view. The warping is accomplished by mapping each depth to the corresponding depth in the virtual view according to the virtual view position and the parameters of the scene geometry.

In the preferred embodiment of the invention, there are two warped dense depth images. One corresponds to warping the dense depth image of the left view, and the other to warping the dense depth image of the right view. The depths of the warped dense depth images are candidate depths for the trellis-based view synthesis.

In addition, the sparse depth features 1021 undergo sparse depth warping 1120. Sparse depth warping 1120 first produces warped sparse depth features in the virtual view. Warping the sparse depth features is similar to warping the dense depth images, but it is performed on a smaller subset of features rather than on the complete set of pixel positions in the input image. A dense set of depths is then determined from the set of warped sparse features using known prior art techniques such as nearest-neighbor assignment, linear interpolation, or bicubic interpolation.
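The step from warped sparse features to a dense set of depths can be sketched for a single row using linear interpolation. The representation of the features as a dictionary from pixel position to depth, the function name `densify_linear`, and the choice to hold the nearest feature's depth outside the outermost features are assumptions; a real system would work in 2D.

```python
def densify_linear(sparse, width):
    """Fill a row of `width` depths from sparse {position: depth} features.

    Between two features the depth is linearly interpolated; outside the
    outermost features the nearest feature's depth is used.
    """
    xs = sorted(sparse)
    if not xs:
        raise ValueError("at least one sparse feature is required")
    dense = []
    for x in range(width):
        if x <= xs[0]:
            dense.append(float(sparse[xs[0]]))
        elif x >= xs[-1]:
            dense.append(float(sparse[xs[-1]]))
        else:
            # Find the pair of features that brackets x.
            right = next(i for i, p in enumerate(xs) if p >= x)
            x0, x1 = xs[right - 1], xs[right]
            t = (x - x0) / (x1 - x0)
            dense.append((1 - t) * sparse[x0] + t * sparse[x1])
    return dense
```

The resulting dense row can then supply one additional candidate depth per pixel to the trellis.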

代替的に、まず、疎な奥行き特徴に対して補間を実行し、次に仮想ビューにマッピングすることができる。疎な奥行きマッピングプロセスの出力は、トレリスに基づくビュー合成のための追加の候補奥行きを生成する。 Alternatively, interpolation can first be performed on sparse depth features and then mapped to a virtual view. The output of the sparse depth mapping process generates additional candidate depths for trellis-based view synthesis.
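The step from sparse features to a dense set of depths can be sketched for a single row using nearest-neighbor assignment, one of the techniques named above. The representation (a list with `None` at positions without a feature) and the tie-breaking rule are assumptions made for illustration only.

```python
def densify_nearest(sparse_row):
    """Fill a row of sparse depths by nearest-neighbor assignment.

    `sparse_row` holds a depth at feature positions and None elsewhere.
    Ties between equally distant features go to the left one.
    """
    known = [(x, d) for x, d in enumerate(sparse_row) if d is not None]
    if not known:
        return list(sparse_row)
    dense = []
    for x in range(len(sparse_row)):
        _, depth = min(known, key=lambda item: abs(item[0] - x))
        dense.append(depth)
    return dense
```

The same function can be applied either after warping the features or, as in the alternative order, before mapping the result to the virtual view.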

図２に示すように、仮想ビュー内の各ピクセルのビュー合成について複数の候補奥行きを評価することができる。本発明の好ましい実施形態において、密な奥行き画像及び疎な奥行き特徴から候補奥行きが求められる。 As shown in FIG. 2, multiple candidate depths can be evaluated for view synthesis of each pixel in the virtual view. In a preferred embodiment of the present invention, candidate depths are determined from dense depth images and sparse depth features.

図１１内のトレリス構造１１３０は、図２に示すようなトレリスを生成し、ここで各列は仮想ビュー内の１つのピクセル位置に対応し、１つの列内の各ノードは合成に用いられる１つの候補奥行きに対応する。 The trellis structure 1130 in FIG. 11 generates a trellis as shown in FIG. 2, where each column corresponds to one pixel location in the virtual view and each node in a column corresponds to one candidate depth used for synthesis.

トレリスは、仮想ビュー画像の１つの行について構築される。各ノードは、視差候補を用いて、１つの候補奥行き及び推定された合成品質メトリックと関連付けられる。上記で説明した、候補奥行きを生成する全ての方法を用いることができる。加えて、疎な奥行き特徴から求められた候補奥行きを、トレリスを作成する際に用いることができる。 The trellis is constructed for one row of the virtual view image. Each node is associated with one candidate depth and with a synthesis quality metric estimated using the corresponding candidate disparity. All of the methods for generating candidate depths described above can be used. In addition, candidate depths determined from the sparse depth features can be used when creating the trellis.
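A minimal sketch of how the candidate sources might be collected into one trellis row follows. It assumes each source (for example the warped left dense depths, the warped right dense depths, and the densified sparse depths) is given as a row with `None` marking holes; the function name and the removal of duplicate candidates are illustrative choices, not details stated in the patent.

```python
def build_trellis_row(*candidate_rows):
    """One trellis column per pixel; nodes are the distinct valid candidates.

    Each argument is one row of candidate depths for the virtual view,
    with None marking a hole.
    """
    trellis = []
    for depths in zip(*candidate_rows):
        column = []
        for d in depths:
            if d is not None and d not in column:
                column.append(d)
        trellis.append(column)
    return trellis
```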

トレリスが構築された後、図７〜図９に関して説明した実施形態に従って、トレリスを通る最小コスト経路が求められる（１１４０）。結果として得られた奥行きのセットを用いて、入力画像を仮想ビュー位置にワーピングする（１１５０）。このプロセスは、左の入力ビュー及び右の入力ビューの双方について行われる。 After the trellis is built, a minimum cost path through the trellis is determined (1140) according to the embodiments described with respect to FIGS. 7 to 9. The resulting set of depths is used to warp the input image to the virtual view position (1150). This process is performed for both the left and right input views.
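The minimum cost path search can be illustrated with a generic dynamic-programming (Viterbi-style) sketch. The transition penalty used here, `lam * |d - d_prev|` between adjacent columns, is an assumption made only so the example is complete; the actual path rules of FIGS. 7 to 9 are not reproduced. All names are hypothetical.

```python
def min_cost_path(candidates, node_cost, lam=1.0):
    """Pick one depth per column so that the total cost is minimal.

    candidates[x]   -- list of candidate depths for pixel x in the row
    node_cost[x][k] -- synthesis-quality cost of candidates[x][k]
    lam             -- weight of the assumed transition penalty |d - d_prev|
    """
    best = [list(node_cost[0])]   # best[x][k]: cheapest cost ending at node k
    back = [[None] * len(candidates[0])]
    for x in range(1, len(candidates)):
        row_best, row_back = [], []
        for k, d in enumerate(candidates[x]):
            costs = [best[x - 1][j] + lam * abs(d - dp)
                     for j, dp in enumerate(candidates[x - 1])]
            j_min = min(range(len(costs)), key=costs.__getitem__)
            row_best.append(costs[j_min] + node_cost[x][k])
            row_back.append(j_min)
        best.append(row_best)
        back.append(row_back)
    # Trace back from the cheapest node in the last column.
    k = min(range(len(best[-1])), key=best[-1].__getitem__)
    path = []
    for x in range(len(candidates) - 1, -1, -1):
        path.append(candidates[x][k])
        k = back[x][k]
    return path[::-1]
```

With `lam=0` each pixel simply takes its cheapest candidate; a positive `lam` trades per-pixel cost against smoothness along the row.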

最後に、ブレンディングするステップ１１６０が、基準ビューからの左のビュー及び右のビューの距離によって決まる重み付け係数によって、左のビュー及び右のビューを平均する。仮想ビュー位置が左のビューにより近い場合、左のビューからワーピングされたビューは、右のビューからワーピングされたビューよりも大きな重み付け係数を有する。一方のワーピングされたビュー内の穴ピクセルは、他方のワーピングされたビュー内で穴ではない場合、その他方のワーピングされたビューを用いて埋められる。ブレンディングした後、最終的な仮想ビュー画像が表示される。 Finally, blending step 1160 averages the left and right views by a weighting factor that depends on the distance of the left and right views from the reference view. If the virtual view position is closer to the left view, the view warped from the left view has a greater weighting factor than the view warped from the right view. Hole pixels in one warped view are filled with the other warped view if they are not holes in the other warped view. After blending, the final virtual view image is displayed.
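The blending rule above can be sketched for one row as follows. The parameter `alpha` (0 at the left view, 1 at the right view) and the use of linear weights are assumptions; the text only states that the nearer view receives the larger weight and that holes are filled from the other view.

```python
def blend_rows(left_row, right_row, alpha):
    """Blend two warped rows; None marks a hole.

    alpha is the assumed virtual view position between the left view
    (alpha = 0) and the right view (alpha = 1).
    """
    w_left, w_right = 1.0 - alpha, alpha   # nearer view gets larger weight
    out = []
    for l, r in zip(left_row, right_row):
        if l is None and r is None:
            out.append(None)               # hole in both views stays a hole
        elif l is None:
            out.append(r)                  # fill from the right view
        elif r is None:
            out.append(l)                  # fill from the left view
        else:
            out.append(w_left * l + w_right * r)
    return out
```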

トレリス及び疎な奥行きを用いた奥行き強調 Depth Enhancement Using Trellis and Sparse Depth
明確にするために、図１２に関して左のビューの処理のみを示して説明する。右のビューの処理は同様である。オプションの項目は破線で示す。 For clarity, only the processing of the left view is shown and described with respect to FIG. 12. The processing for the right view is similar. Optional items are indicated by broken lines.

本発明の上記の実施形態は、密な奥行き画像１０１１及び疎な奥行き特徴１０２１を利用するビュー合成方法を説明した。この方法では、推定奥行き及び疎な奥行き特徴を用いて、トレリスに基づくビュー合成プロセスの一部として、幾つかの候補奥行きが求められる。 The above embodiments of the present invention have described a view synthesis method that utilizes dense depth images 1011 and sparse depth features 1021. In this method, using the estimated depth and sparse depth features, several candidate depths are determined as part of the trellis-based view synthesis process.

本発明のこの実施形態では、密な奥行き画像及び疎な奥行き特徴に基づいて密な奥行き画像の品質を強調する方法が用いられる。この文脈では、強調とは、誤りの訂正、又は密な奥行き画像の推定若しくは取得に起因したノイズの抑制を指す。強調された奥行き画像は、後続のビュー合成において用いられる。 In this embodiment of the invention, a method is used to enhance the quality of the dense depth image based on the dense depth image and sparse depth features. In this context, enhancement refers to error correction or suppression of noise due to dense depth image estimation or acquisition. The enhanced depth image is used in subsequent view synthesis.

図１２に示すように、奥行き強調１２０１は、取得された左のビュー及び右のビューに対応する、密な奥行き画像１０１１及び疎な奥行き特徴１０２１に適用することができる。第２の奥行き強調１２０３は、ビュー合成１２３０中の仮想ビューに対応する。第１の奥行き強調が左のビューに適用され、第１の強調された密な奥行き画像が生成される（１２０２）。第１の奥行き強調は、右のビューにも独立して適用することができる。 As shown in FIG. 12, a first depth enhancement 1201 can be applied to the dense depth image 1011 and the sparse depth features 1021 corresponding to the acquired left view and right view. The second depth enhancement 1203 corresponds to the virtual view during view synthesis 1230. The first depth enhancement is applied to the left view to generate a first enhanced dense depth image (1202). The first depth enhancement can also be applied independently to the right view.

この実施形態において、奥行き候補のセット２０２は、第１の強調された奥行き画像から選択される。トレリスを通る最小コストを有する経路が選択される前に、各奥行き候補のコストが求められる。 In this embodiment, the set of depth candidates 202 is selected from the first enhanced depth image. Before the path with the lowest cost through the trellis is selected, the cost of each depth candidate is determined.

奥行き強調の後に、オクルージョンハンドリング１３０、すなわち、単一のビューが用いられる場合、穴埋め、及び、ワーピング１１０、又は、外挿が、続く。これは第１の強調された奥行き画像（複数の場合もあり）にのみ適用される。このとき、テクスチャ画像は用いられない。 Depth enhancement is followed by occlusion handling 130, i.e., hole filling if a single view is used, and by warping 110 or extrapolation. This applies only to the first enhanced depth image(s). The texture image is not used at this point.

ビュー合成１２３０の間、仮想ビューに対するオクルージョンハンドリング・ワーピング・外挿の後、第１の強調された奥行き画像に第２の奥行き強調１２０３が適用され、仮想ビュー位置において第２の強調された奥行き画像が生成される。次に、第２の強調された奥行き画像は、第２のワーピング・外挿の間、テクスチャ画像（複数の場合もあり）とともに用いられる。 During view synthesis 1230, after occlusion handling, warping, and extrapolation to the virtual view, a second depth enhancement 1203 is applied to the first enhanced depth image, and a second enhanced depth image is generated at the virtual view position. The second enhanced depth image is then used together with the texture image(s) during the second warping and extrapolation.

第１の奥行き強調及び第２の奥行き強調の双方の間、密な奥行き画像内で特定された入力奥行きは、常に第１の候補として選択される。この奥行きは不正確である可能性があり、このため、アーティファクト又は入力画像に対する不一致につながる可能性がある。したがって、追加の候補が検討される。 During both the first depth enhancement and the second depth enhancement, the input depth identified in the dense depth image is always selected as the first candidate. This depth can be inaccurate, which can lead to artifacts or mismatches to the input image. Therefore, additional candidates are considered.

第１の奥行き強調の間、テクスチャ画像内の同じ行の前のピクセルの最小連結（色）輝度差（minimal collocated (color) intensity difference）に基づいて代替的な奥行き候補が選択される。また、最も近い疎な奥行き特徴の奥行きも、代替的な候補として選択される。 During the first depth enhancement, alternative depth candidates are selected based on the minimal collocated (color) intensity difference of the previous pixel in the same row in the texture image. The depth of the nearest sparse depth feature is also selected as an alternative candidate.

第２の奥行き強調の間、同じ行及び同じ列の前の５つの奥行きからの奥行きの中央値が代替的な候補として選択される。 During the second depth enhancement, the median depth from the previous five depths in the same row and column is selected as an alternative candidate.
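One possible reading of this median candidate is sketched below: the median is taken over up to five preceding depths in the same row together with up to five preceding depths in the same column. That interpretation, the use of `statistics.median_low` (so that the result is always one of the existing depths), and the function name are assumptions.

```python
from statistics import median_low

def median_candidate(depth, y, x, count=5):
    """Median of up to `count` previous depths in row y and in column x."""
    previous = [depth[y][i] for i in range(max(0, x - count), x)]
    previous += [depth[i][x] for i in range(max(0, y - count), y)]
    previous = [d for d in previous if d is not None]
    return median_low(previous) if previous else None
```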

第２の奥行き強調に続いて、テクスチャ画像１２２１及び１２２２をワーピングし（１２４０）、ブレンディング・外挿を行い（１２５０）、仮想画像１００２を生成することができる。 Following the second depth enhancement, texture images 1221 and 1222 can be warped (1240), blended and extrapolated (1250) to generate virtual image 1002.

第１の奥行き強調の間、奥行き候補のコスト関数のために、３つの測定値、すなわち、ステレオコスト、すなわち２つのビュー間の色輝度整合性と、色差コスト、すなわち現在のロケーションと候補ピクセルロケーションとの間の色輝度差と、奥行き差コスト、すなわち現在のピクセルと代替的な候補との間の奥行き差とが検討される。 During the first depth enhancement, three measurements are considered for the cost function of a depth candidate: a stereo cost, i.e., the color intensity consistency between the two views; a color difference cost, i.e., the color intensity difference between the current location and the candidate pixel location; and a depth difference cost, i.e., the depth difference between the current pixel and the alternative candidate.

図１３に示すように、ステレオコストは、現在のピクセルロケーション及び奥行きにロケーションが依存する、左の色（テクスチャ）画像及び右の色（テクスチャ）画像内の２つの窓（Ａ、Ｂ）間の平均値分離平均絶対差（ｍｒＭＡＤ）によって定義される。 As shown in FIG. 13, the stereo cost is defined by the mean-removed mean absolute difference (mrMAD) between two windows (A, B) in the left color (texture) image and the right color (texture) image, whose locations depend on the current pixel location and depth.
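A minimal sketch of the mrMAD measure follows, assuming each window is passed as a flat list of intensities of equal length. The function name is illustrative.

```python
def mr_mad(window_a, window_b):
    """Mean-removed mean absolute difference of two equal-size windows.

    Subtracting each window's own mean makes the measure insensitive
    to a constant brightness offset between the two views.
    """
    n = len(window_a)
    mean_a = sum(window_a) / n
    mean_b = sum(window_b) / n
    return sum(abs((a - mean_a) - (b - mean_b))
               for a, b in zip(window_a, window_b)) / n
```

For example, two windows that differ only by a uniform offset of 5 have an mrMAD of zero, whereas a plain MAD would report 5.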

図１３は、色差コストの例を示している。左の色画像１３０１は上部に示され、奥行き（視差）画像は下部に示されている。色差コストは、現在のピクセル位置１３１１及び候補ピクセル位置１３１２の２つの周囲ウィンドウ１３０３のＭＡＤとして定義される。右の画像も同様に処理される。 FIG. 13 shows an example of the color difference cost. The left color image 1301 is shown at the top and the depth (disparity) image is shown at the bottom. The color difference cost is defined as the MAD between the two surrounding windows 1303 at the current pixel location 1311 and the candidate pixel location 1312. The right image is processed in the same way.
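The color difference cost can be sketched as a MAD between two windows of the same image. The window radius, the edge clamping, and the single-channel image (a list of rows of intensities) are assumptions made for illustration.

```python
def window(image, y, x, radius=1):
    """Flat list of the (2*radius+1)^2 pixels around (y, x), edge-clamped."""
    h, w = len(image), len(image[0])
    return [image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]

def color_difference_cost(image, current, candidate, radius=1):
    """MAD between the windows around the current and candidate pixels."""
    a = window(image, *current, radius)
    b = window(image, *candidate, radius)
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)
```

A low cost suggests the candidate pixel lies on a similarly colored surface, so its depth is a plausible alternative for the current pixel.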

奥行き強調Ｂにおいて、ステレオコスト及び奥行き差コストのみが検討される。ステレオコストを求めるために、基準ビュー内の２つのウィンドウのロケーションが求められる。これは、現在のピクセルロケーションと、候補奥行きと、基準ビューに対する仮想ビューの位置とに依拠して求められる。 In depth enhancement B, only the stereo cost and the depth difference cost are considered. In order to determine the stereo cost, the location of two windows in the reference view is determined. This is determined depending on the current pixel location, the candidate depth, and the position of the virtual view relative to the reference view.

候補視差のコストが求められた後、このコストは、現在の視差のコストと比較される。候補視差のコストが現在の視差のコストよりも低い場合、現在の視差は候補視差に更新される。全てのピクセルが処理された後、後続のステップにおいて強調された奥行きマップが出力される。 After the cost of a candidate disparity is determined, it is compared with the cost of the current disparity. If the cost of the candidate disparity is lower, the current disparity is updated to the candidate disparity. After all pixels have been processed, the enhanced depth map is output for use in subsequent steps.
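The compare-and-update rule described above can be sketched as follows. The callable `cost(i, depth)` stands in for whichever combination of stereo, color difference, and depth difference costs is used; its signature, like the other names here, is hypothetical.

```python
def enhance_depths(current, candidates, cost):
    """Replace each depth by a cheaper candidate when one exists.

    current    -- list of input depths, one per pixel
    candidates -- candidates[i] lists the alternative depths for pixel i
    cost       -- hypothetical callable cost(i, depth) returning a number
    """
    enhanced = list(current)
    for i, alternatives in enumerate(candidates):
        best_cost = cost(i, enhanced[i])
        for d in alternatives:
            c = cost(i, d)
            if c < best_cost:          # strictly lower cost wins
                enhanced[i], best_cost = d, c
    return enhanced
```

Because the input depth is always the first candidate, a pixel keeps its original value unless an alternative is strictly cheaper.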

Claims

A method for generating a virtual image of a virtual view of a 3D scene based on a texture image and a depth image, wherein the depth image includes an associated dense depth image and associated sparse depth features, the method comprising:
applying a first depth enhancement to the dense depth image and the sparse depth features to obtain a first enhanced depth image;
determining a plurality of candidate depths for each pixel of the first enhanced depth image;
obtaining, for each candidate depth, a cost that estimates the synthesis quality of the virtual image;
selecting the candidate depth with the lowest cost to generate an enhanced depth of the pixel in the first enhanced depth image; and
applying view synthesis to the first enhanced depth image and the corresponding texture image to generate the virtual image,
wherein each of the steps is performed in a processor.

The method of claim 1, further comprising: applying a second depth enhancement to the first enhanced depth image for the virtual view to obtain a second enhanced depth image used during the view synthesis.

The method of claim 1, wherein each of the steps is performed similarly for a left view and a right view of the 3D scene.

The method of claim 1, wherein the first depth enhancement and the second depth enhancement correct errors and suppress noise due to estimation or acquisition of the dense depth image and the sparse depth features.

The method of claim 1, wherein after the first depth enhancement, occlusion handling is performed on the first enhanced depth image.

The method of claim 1, wherein the minimum cost is determined before the path with the minimum cost through the trellis is selected.

The method of claim 6, wherein the candidate depth is based on a minimal collocated intensity difference of previous pixels in the same row of the texture image.

The method of claim 1, wherein the candidate depth includes a nearest sparse depth feature.

The method of claim 1, wherein the minimum cost is based on a stereo cost, i.e., color intensity consistency between two views, and on a color difference cost.

The method of claim 9, wherein the stereo cost is defined by a mean-removed mean absolute difference between two windows in a left texture image and a right texture image.

The method of claim 9, further comprising determining a cost of a candidate disparity.