JP2017060111A

JP2017060111A - Image encoder, image decoder, image encoding method and image decoding method

Info

Publication number: JP2017060111A
Application number: JP2015185565A
Authority: JP
Inventors: 信哉志水; Shinya Shimizu; 志織杉本; Shiori Sugimoto; 広太竹内; Kota Takeuchi
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2015-09-18
Filing date: 2015-09-18
Publication date: 2017-03-23
Anticipated expiration: 2035-09-18
Also published as: JP6450288B2

Abstract

PROBLEM TO BE SOLVED: To allow the captured content of a light field image to be checked easily and to achieve highly efficient compression encoding. SOLUTION: An image encoder comprises: light field image parameter setting means for setting, on the basis of a first light field image, a light field image parameter that characterizes the first light field image; base image generating means for generating, by using the first light field image, a base image that is a normal image of the same scene as the first light field image; base image encoding means for encoding the base image; light field image generating means for generating, by using the base image in accordance with the light field image parameter, a second light field image for the light field image to be encoded; and light field image encoding means for encoding the first light field image to be encoded, using the second light field image as a prediction image, and outputting a bit stream. SELECTED DRAWING: Figure 1

Description

The present invention relates to an image encoding device, an image decoding device, an image encoding method, and an image decoding method for encoding and decoding images.

Spatial resolution is a major factor in the quality of digital images and moving images. For this reason, research and development of high-definition moving image / image systems capable of handling higher-resolution moving images is ongoing. Using high-resolution moving images / images makes it possible to render subjects and backgrounds sharply down to fine detail. On the other hand, elements that could not be perceived at low resolution, such as whether each subject is in focus, also become visible. In general, a moving image / image in which the subject being viewed is out of focus is perceived as blurred, and its image quality is judged to be low. It is therefore considered very important to control focus accurately when shooting high-resolution moving images / images.

Note that in this specification, an image refers to a still image or to a single frame of a moving image.

However, focus control during shooting is known to be a very difficult task. When shooting low-resolution moving images / images, the focus state can be checked on a viewfinder or a small confirmation monitor while shooting, but when shooting high-resolution images / video, a small monitor cannot show the fine details of the focus state.

To address this, imaging devices have been developed that allow the focus to be adjusted after shooting, on the premise that image processing is performed after shooting. Non-Patent Document 1 describes an imaging device called a light field camera, in which a microlens array is inserted between the main lens and the projection plane of a conventional camera. This configuration makes it possible to record the light rays entering the camera separately for each angle of incidence, and moving images / images focused at different distances can be generated from that record. An image captured by a light field camera (hereinafter referred to as a light field image) represents the intensity of the light rays at each pixel position for each direction of travel of the rays.

R. Ng, "Digital light field photography," Ph.D. dissertation, Stanford University, July 2006.

With the method described in Non-Patent Document 1, the focus can be adjusted by processing performed after shooting. However, unlike a normal image, an image captured according to Non-Patent Document 1 records light rays separated by angle by the microlenses, so efficient encoding cannot be achieved with ordinary image encoding methods. A further problem is that, without image processing, the content of an image captured according to Non-Patent Document 1 cannot be checked in the way the content of an ordinary image can.

The present invention has been made in view of these circumstances, and its object is to provide an image encoding device, an image decoding device, an image encoding method, and an image decoding method that allow the captured content of a light field image captured by a light field camera to be checked easily and that achieve highly efficient compression encoding.

One aspect of the present invention is an image encoding device that encodes a first light field image obtained by a light field camera, the device comprising: light field image parameter setting means for setting, based on the first light field image, light field image parameters that characterize the first light field image; base image generation means for generating, using the first light field image, a base image that is a normal image of the same scene as the first light field image; base image encoding means for encoding the base image; light field image generation means for generating, using the base image in accordance with the light field image parameters, a second light field image for the encoding target light field image; and light field image encoding means for encoding the first light field image to be encoded, using the second light field image as a predicted image, and outputting a bit stream.

One aspect of the present invention is the image encoding device described above, wherein the base image generation means generates, from the first light field image, an omnifocal image that is in focus over the entire image, as the base image.

One aspect of the present invention is the image encoding device described above, wherein the base image generation means generates, as the base image, a sub-aperture image that is the image of the microlens located at the center of the main lens of the light field camera in the first light field image.

One aspect of the present invention is an image decoding device that decodes a first light field image, obtained by a light field camera, from a bit stream of the first light field image, the device comprising: light field image parameter setting means for decoding parameters that characterize the first light field image and setting the decoded parameters as light field image parameters; base image decoding means for decoding a base image that is a normal image of the same scene as the first light field image; light field image generation means for generating, using the base image in accordance with the light field image parameters, a second light field image for the encoding target light field image; and light field image decoding means for decoding the first light field image using the second light field image as a predicted image.

One aspect of the present invention is the image decoding device described above, wherein the base image decoding means decodes, as the base image, an omnifocal image of the same scene as the first light field image that is in focus over the entire image.

One aspect of the present invention is the image decoding device described above, wherein the base image decoding means decodes, as the base image, a sub-aperture image that is the image of the microlens located at the center of the main lens of the light field camera in the first light field image.

One aspect of the present invention is an image encoding method performed by an image encoding device that encodes a first light field image obtained by a light field camera, the method comprising: a light field image parameter setting step of setting, based on the first light field image, light field image parameters that characterize the first light field image; a base image generation step of generating, using the first light field image, a base image that is a normal image of the same scene as the first light field image; a base image encoding step of encoding the base image; a light field image generation step of generating, using the base image in accordance with the light field image parameters, a second light field image for the encoding target light field image; and a light field image encoding step of encoding the first light field image to be encoded, using the second light field image as a predicted image, and outputting a bit stream.

One aspect of the present invention is an image decoding method performed by an image decoding device that decodes a first light field image, obtained by a light field camera, from a bit stream of the first light field image, the method comprising: a light field image parameter setting step of decoding parameters that characterize the first light field image and setting the decoded parameters as light field image parameters; a base image decoding step of decoding a base image that is a normal image of the same scene as the first light field image; a light field image generation step of generating, using the base image in accordance with the light field image parameters, a second light field image for the encoding target light field image; and a light field image decoding step of decoding the first light field image using the second light field image as a predicted image.

According to the present invention, the captured content can be checked easily and, at the same time, highly efficient compression encoding of light field images can be achieved.

FIG. 1 is a block diagram showing the configuration of an image encoding device according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the image encoding device 100 shown in FIG. 1.
FIG. 3 is a block diagram showing the configuration of an image decoding device according to an embodiment of the present invention.
FIG. 4 is a flowchart showing the operation of the image decoding device 200 shown in FIG. 3.
FIG. 5 is a block diagram showing the hardware configuration when the image encoding device 100 shown in FIG. 1 is implemented by a computer and a software program.
FIG. 6 is a block diagram showing the hardware configuration when the image decoding device 200 shown in FIG. 3 is implemented by a computer and a software program.

Hereinafter, an image encoding device and an image decoding device according to an embodiment of the present invention will be described with reference to the drawings. Processing of a single image is described here, but a moving image (video) can be encoded and decoded by repeating the processing for a plurality of consecutive images. The processing of this technique need not be applied to every frame of a moving image; it may be applied to some frames while different processing is applied to the others.

FIG. 1 is a block diagram showing the configuration of the image encoding device according to this embodiment. As shown in FIG. 1, the image encoding device 100 includes a light field image input unit 101, a light field image parameter setting unit 102, a light field image parameter encoding unit 103, a base image generation unit 104, a base image encoding unit 105, a base image decoding unit 106, a light field image synthesis unit 107, a light field image encoding unit 108, and a multiplexing unit 109.

The light field image input unit 101 inputs the light field image to be encoded. Hereinafter, this light field image is referred to as the encoding target light field image. A light field image of any format may be input. For example, it may be a light field image obtained, as in Non-Patent Document 1, by capturing through a plurality of microlenses the optical image of the subject formed by the main lens, or it may be a light field image obtained by another method. Here, a light field image as in Non-Patent Document 1 is assumed to be input.

The light field image parameter setting unit 102 sets parameters that characterize the light field image. Any parameters may be set, provided that they include the parameters the light field image synthesis unit 107, described later, needs in order to generate the encoding target light field image. Examples include the arrangement of the microlenses in the microlens array, the rotational and translational offset between the microlens array and the image sensor, the focus parameter of each microlens, the number of image sensor elements corresponding to each microlens, the RGB array of the image sensor, and a depth map for the encoding target light field image. Hereinafter, these parameters are collectively referred to as the encoding target light field image parameters. The light field image parameter encoding unit 103 encodes the encoding target light field image parameters.
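As a purely illustrative sketch, the parameters listed above could be grouped in a single container such as the following. The class name, every field name, and the default values are assumptions made for this example; the specification does not prescribe any particular data layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LightFieldParams:
    """Illustrative container; every field name here is an assumption."""
    lens_grid: str = "square"              # microlens arrangement, e.g. "square" or "hexagonal"
    rotation_deg: float = 0.0              # rotational offset between lens array and sensor
    translation_px: Tuple[float, float] = (0.0, 0.0)  # translational offset (x, y) in pixels
    pixels_per_lens: int = 1               # sensor elements along one side under each microlens
    lens_focus: List[float] = field(default_factory=list)  # focus parameter per microlens
    color_pattern: str = "RGGB"            # RGB array of the image sensor
    depth_map: Optional[List[List[float]]] = None  # depends on the captured scene

    def image_independent(self) -> dict:
        """Return every parameter except the scene-dependent depth map."""
        return {k: v for k, v in self.__dict__.items() if k != "depth_map"}
```

Separating out the depth map reflects the fact that it is the only listed parameter that depends on the captured scene; the remaining parameters describe the camera.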

The base image generation unit 104 generates, from the encoding target light field image, an image equivalent to one captured by an ordinary camera. Hereinafter, this image is referred to as the base image. The base image encoding unit 105 encodes the base image. The base image decoding unit 106 decodes the bit stream of the base image to generate a decoded base image. The light field image synthesis unit 107 generates, from the decoded base image and in accordance with the light field image parameters, a synthesized light field image (the second light field image) for the encoding target light field image. Here, generating an image (the synthesized light field image) from the decoded base image in accordance with the light field image parameters is called synthesis. The light field image encoding unit 108 encodes the encoding target light field image while using the synthesized light field image as a predicted image. The multiplexing unit 109 multiplexes and outputs the bit stream of the base image, the bit stream of the light field image parameters, and the bit stream of the encoding target light field image.

Next, the operation of the image encoding device 100 shown in FIG. 1 will be described with reference to FIG. 2. FIG. 2 is a flowchart showing the operation of the image encoding device 100 shown in FIG. 1. First, the light field image input unit 101 inputs the encoding target light field image (step S101).

Once the encoding target light field image has been input, the light field image parameter setting unit 102 sets the encoding target light field image parameters for it (step S102). The encoding target light field image parameters may be input to the image encoding device 100 or may be estimated by analyzing the encoding target light field image. For example, the arrangement of the microlenses in the microlens array, the rotational and translational offset between the microlens array and the image sensor, the focus parameter of each microlens, the number of image sensor elements corresponding to each microlens, and the RGB array of the image sensor do not depend on the image, so they may be input from outside.

On the other hand, the depth map for the encoding target light field image depends on the image, so it may be generated by analyzing the encoding target light field image. Any method may be used to estimate a depth map from a light field image. For example, the method described in Reference 1 (S. Wanner and B. Goldluecke, "Globally consistent depth labeling of 4D light fields," 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 41-48, June 2012) may be used.

Next, the light field image parameter encoding unit 103 encodes the encoding target light field image parameters that have been set (step S103). Any encoding method may be used. The depth map for the encoding target light field image may be encoded with any image encoding method or moving image encoding method.

Once the light field image parameters have been encoded, the base image generation unit 104 generates a base image from the encoding target light field image in accordance with the encoding target light field image parameters (step S104). The base image generated here may be any normal image captured at the same position, orientation, angle of view, and so on as the encoding target light field image. For example, it may be an omnifocal image or an image with a specific depth of field. Any method may be used to generate the base image. For example, an image with a specific depth of field may be generated by processing in the Fourier transform domain using the Fourier slice method (described in Reference 2: R. Ng, "Fourier slice photography," ACM SIGGRAPH 2005 Pap. - SIGGRAPH '05, p. 735, 2005).

Alternatively, the base image may be generated with the shift-and-add method (described in Reference 3: R. Ng, M. Levoy, G. Duval, M. Horowitz, and P. Hanrahan, "Light Field Photography with a Hand-held Plenoptic Camera," Stanford Tech Rep. CTSR, pp. 1-11, 2005), in which the sub-aperture images obtained from the light field image are shifted according to their angular components and then averaged. An omnifocal image may be generated by producing multiple images with different depths of field by the above methods and combining their in-focus regions. Furthermore, the central sub-aperture image that can be generated from the encoding target light field image may be used as the base image.
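The shift-and-add idea can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the procedure of Reference 3: the views are plain nested lists, shifts are whole pixels, borders are clamped, and the function name and the `slope` parameter (pixel shift per unit of angular offset, which selects the focal plane) are hypothetical.

```python
def shift_and_add(sub_apertures, slope):
    """Refocus by shifting each sub-aperture view and averaging.

    sub_apertures: dict mapping an angular offset (u, v) to a 2D list of
                   numbers; all views have the same size and (0, 0) is
                   the central view.
    slope: integer pixel shift per unit of angular offset (assumed
           parameter that selects the plane brought into focus).
    """
    views = list(sub_apertures.items())
    height = len(views[0][1])
    width = len(views[0][1][0])
    out = [[0.0] * width for _ in range(height)]
    for (u, v), img in views:
        dy, dx = slope * v, slope * u
        for y in range(height):
            for x in range(width):
                # Clamp at the border so every output pixel gets a sample.
                sy = min(max(y + dy, 0), height - 1)
                sx = min(max(x + dx, 0), width - 1)
                out[y][x] += img[sy][sx]
    n = len(views)
    return [[value / n for value in row] for row in out]
```

With a slope that matches the disparity of a scene point, the shifted views line up and the point stays sharp; with a different slope the same point is spread over several pixels, which is the blur of an out-of-focus region.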

Any method may be used to generate sub-aperture images from a light field image. For example, the method described in Reference 4 (D. G. Dansereau, O. Pizarro, and S. B. Williams, "Decoding, calibration and rectification for lenselet-based plenoptic cameras," in Computer Vision and Pattern Recognition (CVPR), IEEE Conference on, IEEE, Jun 2013) may be used.
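For intuition only, the following sketch extracts a sub-aperture image under a strongly idealized assumption: the microlenses form a square grid perfectly aligned with the sensor, and each microlens covers exactly n x n pixels. Real data needs the calibration and rectification that Reference 4 addresses; the function name and arguments here are hypothetical.

```python
def extract_sub_aperture(lenslet, n, u, v):
    """Collect pixel (u, v) from under every microlens.

    lenslet: 2D list holding the raw light field image (rows of pixels).
    n: pixels per microlens along each axis (assumed exact and aligned).
    u, v: column and row index inside each microlens, 0 <= u, v < n.
    """
    # Every n-th row starting at v, and within it every n-th column starting at u.
    return [row[u::n] for row in lenslet[v::n]]
```

Under this assumption the central sub-aperture image corresponds to `u = v = n // 2`.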

Next, the base image encoding unit 105 encodes the generated base image (step S105). Any encoding method may be used. For example, an image encoding method such as JPEG or JPEG 2000, or a moving image encoding method such as AVC or HEVC, may be used. The base image decoding unit 106 decodes the bit stream obtained by the encoding, using a method corresponding to the encoding method, to generate a decoded base image (step S106). If the base image was losslessly encoded in step S105, step S106 may be omitted and the generated base image may be used as the decoded base image.

Once the decoded base image has been obtained, the light field image synthesis unit 107 generates a synthesized light field image for the encoding target light field image using the decoded base image and the encoding target light field image parameters (step S107). If the encoding target light field image parameters are encoded lossily, the parameters obtained by decoding the result of step S103 are used. Any method may be used for the synthesis.

For example, the synthesized image may be generated on the assumption that all subjects are perfectly diffuse reflectors, or on the assumption that all subjects are in focus on the microlens array plane. As concrete examples, the image may be created by assigning the value of the corresponding pixel of the decoded base image to the group of pixels under each microlens. Alternatively, a multi-view image called a sub-aperture image group may be generated from the decoded base image and the encoding target light field image parameters, and the light field image may be synthesized by converting that multi-view image. As a further alternative, the pixel on the decoded base image that corresponds to each microlens may be identified, and the group of pixels centered on that pixel may be used as the group of pixels under that microlens. Instead of using the pixels of the decoded base image directly, the decoded base image may first be filtered according to the focus parameter of each microlens.
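The first concrete method above (assigning one base image pixel value to the whole pixel group under a microlens) can be sketched as follows. The sketch assumes an aligned square microlens grid, one base image pixel per microlens, and n x n sensor pixels per microlens; these assumptions and the function name are illustrative and not taken from the specification.

```python
def synthesize_light_field(base, n):
    """Build a predicted light field by repeating each base pixel n x n times.

    base: 2D list with one pixel per microlens (assumed correspondence).
    n: sensor pixels per microlens along each axis.
    """
    out = []
    for row in base:
        # Repeat each pixel n times horizontally ...
        wide = [pixel for pixel in row for _ in range(n)]
        # ... and emit n independent copies of that row vertically.
        out.extend(list(wide) for _ in range(n))
    return out
```

This corresponds to assuming that every direction through a microlens sees the same intensity, so whatever angular variation the real light field contains is left for the subsequent encoding step to represent.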

Once the synthesized light field image has been generated, the light field image encoding unit 108 encodes the encoding target light field image while using the synthesized light field image as a predicted image (step S108). Any method may be used. For example, the difference between the encoding target light field image and the synthesized light field image may be encoded directly, or that difference may be further predicted spatially or temporally before encoding. The encoding may be performed block by block on blocks obtained by dividing the image, or on the image as a whole. The prediction residual may be encoded as it is, or encoded using a transform such as the DCT, quantization, and the like. As yet another method, the synthesized light field image may be regarded as the encoding target light field image with errors introduced by a communication channel or the like, and the encoding may be performed by generating an error correction code that corrects those errors.
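The simplest option mentioned above, encoding the difference from the predicted image, can be sketched as below. Uniform scalar quantization with a step size stands in for a real transform and entropy coding stage; the function names and the `step` parameter are assumptions for illustration only.

```python
def encode_residual(target, prediction, step=1):
    """Quantize the difference between the target and predicted images."""
    return [
        [round((t - p) / step) for t, p in zip(target_row, pred_row)]
        for target_row, pred_row in zip(target, prediction)
    ]


def decode_residual(levels, prediction, step=1):
    """Rebuild the image by adding the dequantized residual to the prediction."""
    return [
        [p + q * step for q, p in zip(level_row, pred_row)]
        for level_row, pred_row in zip(levels, prediction)
    ]
```

With `step=1` and integer pixel values the round trip is lossless. A larger step gives smaller residual levels at the cost of a reconstruction error of at most half a step per pixel.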

Finally, the multiplexing unit 109 multiplexes the bit stream of the base image output by the base image encoding unit 105, the bit stream of the encoding target light field image parameters output by the light field image parameter encoding unit 103, and the bit stream of the encoding target light field image output by the light field image encoding unit 108, and outputs the result as the output of the image encoding device 100 (step S109).
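The specification does not define a container format for the multiplexed output. As one hypothetical possibility, the three bit streams could be concatenated with a 4-byte big-endian length prefix each, which also makes the inverse operation straightforward. The function names and the framing are assumptions made for this sketch.

```python
import struct


def multiplex(base_bits, param_bits, light_field_bits):
    """Concatenate three byte strings, each preceded by its 4-byte length."""
    out = bytearray()
    for chunk in (base_bits, param_bits, light_field_bits):
        out += struct.pack(">I", len(chunk))  # unsigned 32-bit, big-endian
        out += chunk
    return bytes(out)


def demultiplex(stream):
    """Split a stream produced by multiplex() back into its chunks."""
    chunks, pos = [], 0
    while pos < len(stream):
        (length,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        chunks.append(stream[pos:pos + length])
        pos += length
    return tuple(chunks)
```

Placing the base image first in the stream would let a receiver extract and display it without parsing the other two parts, which fits the stated goal of easy confirmation of the captured content.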

Next, the image decoding device according to this embodiment will be described. FIG. 3 is a block diagram showing the configuration of the image decoding device according to this embodiment. As shown in FIG. 3, the image decoding device 200 includes a bit stream input unit 201, a separation unit 202, a light field image parameter decoding unit 203, a base image decoding unit 204, a light field image synthesis unit 205, and a light field image decoding unit 206.

The bit stream input unit 201 inputs the bit stream of the light field image to be decoded. Hereinafter, this light field image is referred to as the decoding target light field image. The separation unit 202 separates the input bit stream into a bit stream of the base image (described later), a bit stream of the light field image parameters (described later), and a bit stream of the decoding target light field image.

The light field image parameter decoding unit 203 decodes, from one of the separated bit streams, the parameters that characterize the light field image and sets them. Any parameters may be set, provided that they include the parameters the light field image synthesis unit 205, described later, needs in order to synthesize the decoding target light field image. Examples include the arrangement of the microlenses in the microlens array, the rotational and translational offset between the microlens array and the image sensor, the focus parameter of each microlens, the number of image sensor elements corresponding to each microlens, the RGB array of the image sensor, and a depth map for the decoding target light field image. Hereinafter, these parameters are collectively referred to as the decoding target light field image parameters.

The base image decoding unit 204 decodes, from one of the separated bitstreams, a normal image captured at the same position, orientation, angle of view, and so on as the decoding target light field image. Hereinafter, this image is referred to as the base image. The light field image synthesis unit 205 generates a synthesized light field image for the decoding target light field image from the decoded base image according to the light field image parameters. The light field image decoding unit 206 decodes the decoding target light field image from one of the separated bitstreams while using the synthesized light field image as a predicted image.

Next, the operation of the image decoding device 200 shown in FIG. 3 will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the operation of the image decoding device 200 shown in FIG. 3. First, the bitstream input unit 201 receives a bitstream in which the decoding target light field image is encoded (step S201).

When the bitstream has been input, the separation unit 202 separates it into a bitstream of the base image, a bitstream of the light field image parameters, and a bitstream of the decoding target light field image (step S202).
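The specification does not define how the three sub-streams are packed together. As an illustration of step S202 only, the sketch below assumes a hypothetical container in which each sub-stream is preceded by a 1-byte identifier and a 4-byte big-endian length; the identifiers `BASE`, `PARAMS`, and `LIGHT_FIELD` and the functions `mux` and `demux` are assumptions made for this example.

```python
import struct

# Hypothetical sub-stream identifiers (not defined by the specification).
BASE, PARAMS, LIGHT_FIELD = 0, 1, 2
HEADER = ">BI"                       # 1-byte id + 4-byte big-endian payload length
HEADER_SIZE = struct.calcsize(HEADER)


def mux(substreams: dict) -> bytes:
    """Encoder side: concatenate sub-streams, each with an id/length header."""
    out = bytearray()
    for stream_id, payload in substreams.items():
        out += struct.pack(HEADER, stream_id, len(payload)) + payload
    return bytes(out)


def demux(bitstream: bytes) -> dict:
    """Decoder side (step S202): split the input back into its sub-streams."""
    substreams = {}
    pos = 0
    while pos < len(bitstream):
        stream_id, length = struct.unpack_from(HEADER, bitstream, pos)
        pos += HEADER_SIZE
        substreams[stream_id] = bitstream[pos:pos + length]
        pos += length
    return substreams
```

With such a layout, a decoder that only needs to preview the scene could read the `BASE` sub-stream and skip the other two.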

When the bitstream has been separated, the base image decoding unit 204 decodes the base image for the decoding target light field image from one of the separated bitstreams (step S203). This processing is the same as step S106 described above; any method may be used as long as it corresponds to the method used at encoding time and decodes the bitstream correctly. The base image obtained here may also be output as one of the outputs of the image decoding device 200.

Next, the light field image parameter decoding unit 203 decodes the decoding target light field image parameters from one of the separated bitstreams (step S204). Some or all of the decoding target light field image parameters may instead be input to the image decoding device 200 separately, or may be estimated from the base image. Any decoding method may be used as long as it corresponds to the method used at encoding time and decodes the parameters correctly.

When the decoding target light field image parameters and the base image have been decoded, the light field image synthesis unit 205 generates a synthesized light field image for the decoding target light field image (step S205). This processing is the same as step S107 described above; any method may be used as long as it is the same as the one used at encoding time. A method different from the one used at encoding time may be used if coding distortion such as drift is acceptable, or if the decoding target light field image is encoded with an error correction code.
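The reason the synthesis method must match can be seen in a small numerical example. The sketch below assumes, purely for illustration, that the light field bitstream carries a lossless residual (target minus prediction); the pixel values and the function name `reconstruct` are made up for this example.

```python
def reconstruct(prediction, residual):
    """Decoder-side reconstruction: prediction plus decoded residual."""
    return [p + r for p, r in zip(prediction, residual)]


target = [12, 15, 9, 7]               # pixels of the light field image being coded
encoder_pred = [10, 14, 10, 8]        # synthesized light field used at the encoder
residual = [t - p for t, p in zip(target, encoder_pred)]

same_pred = list(encoder_pred)        # decoder uses the same synthesis method
other_pred = [11, 14, 10, 6]          # decoder uses a different synthesis method

exact = reconstruct(same_pred, residual)
drifted = reconstruct(other_pred, residual)
drift = [d - t for d, t in zip(drifted, target)]
```

Under these assumptions, `exact` equals `target`, while `drift` equals the difference between the two predictions, which is the distortion the text says may be tolerated in some applications.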

When the synthesized light field image has been generated, the light field image decoding unit 206 decodes the decoding target light field image while using the synthesized light field image as a predicted image, and outputs it from the image decoding device 200 (step S206). Any method may be used as long as it corresponds to the method used at encoding time and decodes the bitstream correctly.
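Steps S205 and S206 can be sketched end to end with a deliberately simple synthesis rule. This is an illustration under stated assumptions, not the synthesis method of the embodiment: each base image pixel is assumed to fill the n x n block of sensor pixels under one microlens (an aligned square lens grid with no parallax), and the light field bitstream is assumed to carry a plain residual. The function names are hypothetical.

```python
def synthesize(base, n):
    """Step S205 (toy version): each base pixel fills the n x n block under one microlens."""
    height, width = len(base) * n, len(base[0]) * n
    return [[base[y // n][x // n] for x in range(width)] for y in range(height)]


def encode_residual(target, prediction):
    """Encoder side: what remains after subtracting the predicted image."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(target, prediction)]


def decode_light_field(base, n, residual):
    """Step S206 (toy version): rebuild the light field from prediction plus residual."""
    prediction = synthesize(base, n)
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]


base = [[10, 20]]                                   # decoded base image (1 x 2)
target = [[10, 11, 20, 19],
          [9, 10, 21, 20]]                          # light field image (2 x 4), n = 2
residual = encode_residual(target, synthesize(base, 2))
decoded = decode_light_field(base, 2, residual)
```

Because the prediction already resembles the target, the residual values are small, which is what makes the predictive coding efficient.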

The embodiment above is described as processing that encodes or decodes the entire image, but it can also be applied to only part of an image. In that case, whether to apply the processing may be determined and a flag indicating the decision may be encoded and decoded, or the decision may be specified by some other means. For example, the processing may be represented as one of the modes that indicate how the predicted image is generated for each region.
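A per-region flag of this kind could work as sketched below. The selection rule (sum of absolute differences), the alternative prediction `other_preds`, and all function names are assumptions made for illustration; the specification leaves the decision criterion open.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized regions."""
    return sum(abs(x - y) for x, y in zip(a, b))


def choose_flags(targets, synth_preds, other_preds):
    """Encoder side: flag a region True when the synthesized prediction fits at least as well."""
    return [sad(t, s) <= sad(t, o)
            for t, s, o in zip(targets, synth_preds, other_preds)]


def decode_regions(flags, synth_preds, other_preds, residuals):
    """Decoder side: add each residual to the prediction selected by its flag."""
    decoded = []
    for flag, s, o, r in zip(flags, synth_preds, other_preds, residuals):
        pred = s if flag else o
        decoded.append([p + d for p, d in zip(pred, r)])
    return decoded


targets = [[10, 10], [50, 52]]        # two regions of the light field image
synth_preds = [[10, 11], [0, 0]]      # prediction from the synthesized light field
other_preds = [[0, 0], [50, 50]]      # prediction from some other mode
flags = choose_flags(targets, synth_preds, other_preds)
residuals = [[t - p for t, p in zip(tr, (s if f else o))]
             for tr, s, o, f in zip(targets, synth_preds, other_preds, flags)]
```

Here the first region uses the synthesized prediction and the second does not, and the decoder recovers both regions from the flags and residuals alone.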

In the embodiment described above, one base image is used for each light field image to be processed, but a plurality of base images may be used. For example, among the group of sub-aperture images corresponding to the light field image, all of the sub-aperture images at predetermined positions may be set as base images. In that case, information that can be estimated from the sub-aperture images set as base images, such as the depth map for the light field image to be processed, need not be encoded and may instead be estimated and used on both the encoding side and the decoding side. To encode the plurality of base images efficiently, a multi-view video coding method such as MV-HEVC may be used.
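Selecting several sub-aperture images as base images can be sketched as follows. The example assumes an idealized lenslet image in which every microlens covers an aligned n x n block of pixels with no rotation or offset; the function name `sub_aperture` and the chosen positions are hypothetical.

```python
def sub_aperture(lenslet, n, u, v):
    """Collect the pixel at offset (u, v) under every n x n microlens block into one view."""
    return [row[u::n] for row in lenslet[v::n]]


# 4 x 4 lenslet image with n = 2, i.e. a 2 x 2 grid of microlenses.
lenslet = [[0, 1, 2, 3],
           [4, 5, 6, 7],
           [8, 9, 10, 11],
           [12, 13, 14, 15]]

# Predetermined positions whose sub-aperture images serve as base images.
positions = [(0, 0), (1, 1)]
base_images = {pos: sub_aperture(lenslet, 2, *pos) for pos in positions}
```

Each entry of `base_images` is an ordinary image of the scene from a slightly different viewpoint, which is why a multi-view codec is a natural fit for coding them together.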

FIG. 5 is a block diagram showing a hardware configuration in which the image encoding device 100 described above is implemented by a computer and a software program. The system shown in FIG. 5 has the following components connected by a bus: a CPU 50 that executes the program; a memory 51, such as a RAM, that stores the program and data accessed by the CPU 50; an encoding target light field image input unit 52 that receives the image signal to be processed from a light field camera or the like (this may be a storage unit, such as a disk device, that stores the image signal); a program storage device 53 that stores an image encoding program 531, which is a software program that causes the CPU 50 to execute the image processing; and a bitstream output unit 54 that outputs the bitstream generated when the CPU 50 executes the image encoding program 531 loaded into the memory 51 (this may be a storage unit, such as a disk device, that stores the bitstream). When the encoding target light field image parameters are supplied from outside, an input unit for them is also connected to the bus.

FIG. 6 is a block diagram showing a hardware configuration in which the image decoding device 200 described above is implemented by a computer and a software program. The system shown in FIG. 6 has the following components connected by a bus: a CPU 60 that executes the program; a memory 61, such as a RAM, that stores the program and data accessed by the CPU 60; a bitstream input unit 62 that receives a bitstream encoded by the image encoding device using the present technique (this may be a storage unit, such as a disk device, that stores the bitstream); a program storage device 63 that stores an image decoding program 631, which is a software program that causes the CPU 60 to execute the image decoding processing; and a decoding target light field image output unit 64 that outputs, to a playback device or the like, the decoding target light field image obtained when the CPU 60 executes the image decoding program 631 loaded into the memory 61 and decodes the bitstream (this may be a storage unit, such as a disk device, that stores the image signal). When the decoding target light field image parameters are supplied from outside, an input unit for them is also connected to the bus. When the base image needs to be output, an output unit for it is also connected to the bus.

As described above, the image encoding device and the image decoding device according to this embodiment enable efficient encoding and easy decoding of images captured by a light field camera (images in which rays separated by angle are recorded), which ordinary image encoding and decoding methods cannot handle efficiently. Specifically, a representative image generated from the light field image serves as a base layer, and scalable coding that predictively encodes the light field image from that base layer makes the captured content easy to check while also achieving highly efficient compression coding of the light field image. With this configuration, for a light field image captured by a light field camera in which a microlens array is inserted between the main lens and the projection plane, the captured content can be checked easily and highly efficient compression coding can be achieved.

All or part of the image encoding device 100 and the image decoding device 200 in the embodiment described above may be implemented by a computer. In that case, a program for implementing these functions may be recorded on a computer-readable recording medium, and the functions may be realized by having a computer system read and execute the program recorded on that medium. The term "computer system" here includes an OS and hardware such as peripheral devices. The term "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The term may further include media that hold a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a certain period, such as volatile memory inside a computer system acting as the server or client in that case. The program may implement only part of the functions described above, may implement them in combination with a program already recorded in the computer system, or may be realized using hardware such as a PLD (Programmable Logic Device) or an FPGA (Field Programmable Gate Array).

Embodiments of the present invention have been described above with reference to the drawings, but these embodiments are merely examples of the present invention, and it is clear that the present invention is not limited to them. Accordingly, components may be added, omitted, replaced, or otherwise modified without departing from the technical idea and scope of the present invention.

The present invention is applicable to uses in which, when a light field image is encoded and decoded, it is essential both to check the content of the scene in the light field image easily and to achieve efficient compression coding.

DESCRIPTION OF SYMBOLS 100 ... image encoding device, 101 ... light field image input unit, 103 ... light field image parameter encoding unit, 104 ... base image generation unit, 105 ... base image encoding unit, 106 ... base image decoding unit, 107 ... light field image synthesis unit, 108 ... light field image encoding unit, 109 ... multiplexing unit, 200 ... image decoding device, 201 ... bitstream input unit, 202 ... separation unit, 203 ... light field image parameter decoding unit, 204 ... base image decoding unit, 205 ... light field image synthesis unit, 206 ... light field image decoding unit, 50, 60 ... CPU, 51, 61 ... memory, 52 ... encoding target light field image input unit (storage unit), 53, 63 ... program storage device, 531 ... image encoding program, 54 ... bitstream output unit (storage unit), 62 ... bitstream input unit (storage unit), 631 ... image decoding program, 64 ... decoding target light field image output unit (storage unit)

Claims

An image encoding device for encoding a first light field image obtained by a light field camera, the image encoding device comprising:
light field image parameter setting means for setting, based on the first light field image, a light field image parameter characterizing the first light field image;
base image generation means for generating, using the first light field image, a base image that is a normal image of the same scene as the first light field image;
base image encoding means for encoding the base image;
light field image generation means for generating, using the base image and according to the light field image parameter, a second light field image for the light field image to be encoded; and
light field image encoding means for encoding the first light field image to be encoded while using the second light field image as a predicted image, and outputting a bitstream.

The image encoding device according to claim 1, wherein the base image generation means generates, as the base image, an omnifocal image in which the entire first light field image is in focus.

The image encoding device according to claim 1, wherein the base image generation means generates, as the base image, a sub-aperture image that is an image of a microlens located in the center of a main lens of the light field camera in the first light field image.

An image decoding device for decoding a first light field image from a bitstream of the first light field image obtained by a light field camera, the image decoding device comprising:
light field image parameter setting means for decoding a parameter characterizing the first light field image and setting the decoded parameter as a light field image parameter;
base image decoding means for decoding a base image that is a normal image of the same scene as the first light field image;
light field image generation means for generating, using the base image and according to the light field image parameter, a second light field image for the light field image to be encoded; and
light field image decoding means for decoding the first light field image while using the second light field image as a predicted image.

The image decoding device according to claim 4, wherein the base image decoding means decodes, as the base image, an omnifocal image of the same scene as the first light field image in which the entire image is in focus.

The image decoding device according to claim 4, wherein the base image decoding means decodes, as the base image, a sub-aperture image that is an image of a microlens located in the center of a main lens of the light field camera in the first light field image.

An image encoding method performed by an image encoding device that encodes a first light field image obtained by a light field camera, the image encoding method comprising:
a light field image parameter setting step of setting, based on the first light field image, a light field image parameter characterizing the first light field image;
a base image generation step of generating, using the first light field image, a base image that is a normal image of the same scene as the first light field image;
a base image encoding step of encoding the base image;
a light field image generation step of generating, using the base image and according to the light field image parameter, a second light field image for the light field image to be encoded; and
a light field image encoding step of encoding the first light field image to be encoded while using the second light field image as a predicted image, and outputting a bitstream.

An image decoding method performed by an image decoding device that decodes a first light field image from a bitstream of the first light field image obtained by a light field camera, the image decoding method comprising:
a light field image parameter setting step of decoding a parameter characterizing the first light field image and setting the decoded parameter as a light field image parameter;
a base image decoding step of decoding a base image that is a normal image of the same scene as the first light field image;
a light field image generation step of generating, using the base image and according to the light field image parameter, a second light field image for the light field image to be encoded; and
a light field image decoding step of decoding the first light field image while using the second light field image as a predicted image.