JP4937161B2

JP4937161B2 - Distance information encoding method, decoding method, encoding device, decoding device, encoding program, decoding program, and computer-readable recording medium

Info

Publication number: JP4937161B2
Application number: JP2008051752A
Authority: JP
Inventors: 信哉志水; 英明木全; 一人上倉; 由幸八島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2008-03-03
Filing date: 2008-03-03
Publication date: 2012-05-23
Anticipated expiration: 2028-03-03
Also published as: JP2009212648A

Description

The present invention relates to techniques for encoding and decoding multi-view distance information.

A multi-view image is a set of images obtained by photographing the same subject and background with a plurality of cameras, and a multi-view moving image (multi-view video) is its moving-image counterpart. The distance information referred to here is information that represents, for each region of a given image, the distance from the camera to the subject. Multi-view distance information is distance information for a multi-view image, and is a set consisting of a plurality of ordinary pieces of distance information. Since the distance from the camera to the subject can also be regarded as the depth of the scene, distance information is sometimes called depth information.

In general, such distance information is given for the two-dimensional plane captured by the camera, and is therefore represented as a distance image by mapping each distance to a pixel value of an image. Because the information for a given point on the two-dimensional plane is a single distance, a distance image can be represented as a grayscale image. A distance image is also sometimes called a depth image or a depth map.

One use of distance information is stereoscopic imaging. A stereoscopic image is commonly represented as a stereo image consisting of an image for the observer's right eye and an image for the left eye, but it can also be represented using the image from one camera together with its distance information (see Non-Patent Document 1 for details).

MPEG-C Part 3 (ISO/IEC 23002-3) can be used as a method for encoding stereoscopic video represented by the video at a single viewpoint and its distance information (see Non-Patent Document 2 for details).

Multi-view distance information is used to represent stereoscopic video with a larger parallax than can be represented using single-view distance information (see Non-Patent Document 3 for details).

Besides representing such stereoscopic video, multi-view distance information is also used as one of the data for generating free-viewpoint video, in which the viewer can move the viewpoint freely without regard to the arrangement of the capturing cameras. A synthesized image of the scene as it would be seen from a camera other than the capturing cameras is sometimes called a virtual viewpoint image, and methods for generating such images are actively studied in the field of image-based rendering. A representative method for generating virtual viewpoint video from multi-view video and multi-view distance information is described in Non-Patent Document 4.

As described above, distance information can be regarded as a grayscale moving image. Because a subject exists continuously in real space and cannot move instantaneously, distance information has spatial and temporal correlation just as an image signal does. Distance information can therefore be encoded efficiently, with spatial and temporal redundancy removed, by the image and video coding schemes used for ordinary video signals. In fact, MPEG-C Part 3 performs encoding using an existing video coding scheme.

Here, conventional general methods for encoding a video signal are described. Because a subject generally has spatial and temporal continuity in real space, its appearance is highly correlated both spatially and temporally. Video coding achieves high coding efficiency by exploiting this correlation.

Specifically, the video signal of the block to be encoded is predicted from already-encoded video signals and only the prediction residual is encoded, which reduces the information that must be encoded and achieves high coding efficiency. Representative prediction methods for single-view video are intra prediction, which generates a prediction signal spatially from adjacent blocks, and motion-compensated prediction, which estimates the motion of the subject from encoded frames captured at nearby times and generates a prediction signal temporally. For multi-view video there is, in addition, disparity-compensated prediction, which estimates the disparity of the subject from an encoded frame captured by another camera and generates a prediction signal between cameras. Details of each method are described in Non-Patent Documents 5 and 6, among others.

Non-Patent Document 1: C. Fehn, P. Kauff, M. Op de Beeck, F. Ernst, W. IJsselsteijn, M. Pollefeys, L. Van Gool, E. Ofek and I. Sexton, "An Evolutionary and Optimised Approach on 3D-TV", Proceedings of International Broadcast Conference, pp. 357-365, Amsterdam, The Netherlands, September 2002.
Non-Patent Document 2: W. H. A. Bruls, C. Varekamp, R. Klein Gunnewiek, B. Barenbrug and A. Bourge, "Enabling Introduction of Stereoscopic (3D) Video: Formats and Compression Standards", Proceedings of IEEE International Conference on Image Processing, pp. I-89-I-92, San Antonio, USA, September 2007.
Non-Patent Document 3: A. Smolic, K. Mueller, P. Merkle, N. Atzpadin, C. Fehn, M. Mueller, O. Schreer, R. Tanger, P. Kauff and T. Wiegand, "Multi-view video plus depth (MVD) format for advanced 3D video systems", Joint Video Team of ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, Doc. JVT-W100, San Jose, USA, April 2007.
Non-Patent Document 4: C. L. Zitnick, S. B. Kang, M. Uyttendaele, S. A. J. Winder and R. Szeliski, "High-quality Video View Interpolation Using a Layered Representation", ACM Transactions on Graphics, vol. 23, no. 3, pp. 600-608, August 2004.
Non-Patent Document 5: ITU-T Rec. H.264 / ISO/IEC 14496-10, "Advanced Video Coding", Final Committee Draft, Document JVT-E022, September 2002.
Non-Patent Document 6: H. Kimata and M. Kitahara, "Preliminary results on multiple view video coding (3DAV)", document M10976, MPEG Redmond Meeting, July 2004.

A subject is continuous in real space and therefore has high spatial correlation, and it cannot move instantaneously and therefore has high temporal correlation. Consequently, distance information represented as a grayscale image can be encoded efficiently using an existing video coding scheme that exploits spatial and temporal correlation.

However, when the positions and orientations of the cameras differ as shown in FIG. 5, the distance from the camera to the subject differs even for the same subject. In that case the distance differs from frame to frame even for the same subject, so the distance cannot be predicted accurately either temporally or spatially.

To cope with changes such as a change in the brightness of the whole scene, there are techniques that use a look-up table, H.264/AVC weighted prediction, or the like to alter the value referred to relative to the original value at the reference, and thereby efficiently predict values that change from frame to frame.

Such techniques assign exactly one converted value to each given value, so the conversion is many-to-one. However, as shown in FIG. 6, subjects at the same distance from one camera may be at different distances from another camera. In that case a one-to-many conversion must be handled, and conventional processing cannot predict efficiently.
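The one-to-many situation can be illustrated numerically. The following is a minimal sketch, not part of the disclosed apparatus: the two camera poses, the sample points, and the helper `depth_in_camera` are hypothetical, and the pose convention (world point M = R Xc + t) is borrowed from the camera model assumed in Embodiment 1.

```python
import numpy as np

def depth_in_camera(point_world, R, t):
    """Depth (z coordinate in the camera frame) of a world point.

    Assumes the pose convention M = R @ Xc + t, so Xc = R.T @ (M - t).
    """
    return float((R.T @ (point_world - t))[2])

# Hypothetical camera A: at the world origin, looking along +z.
R_a, t_a = np.eye(3), np.zeros(3)

# Hypothetical camera B: shifted 4 units along +x and rotated 30 degrees about y.
theta = np.deg2rad(30.0)
R_b = np.array([[ np.cos(theta), 0.0, np.sin(theta)],
                [ 0.0,           1.0, 0.0          ],
                [-np.sin(theta), 0.0, np.cos(theta)]])
t_b = np.array([4.0, 0.0, 0.0])

# Two subject points that camera A sees at the same depth (z = 5).
points = [np.array([-1.0, 0.0, 5.0]), np.array([2.0, 0.0, 5.0])]

depths_a = [depth_in_camera(pt, R_a, t_a) for pt in points]
depths_b = [depth_in_camera(pt, R_b, t_b) for pt in points]

print("camera A:", depths_a)  # the two depths coincide
print("camera B:", depths_b)  # the two depths differ
```

A single depth value of camera A thus corresponds to more than one depth value of camera B, which no per-value look-up table can express.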

The present invention has been made in view of these circumstances, and its object is to encode distance information more efficiently than before by applying an appropriate conversion to the distance information.

To solve the above problem, the present invention converts the distance from the camera to the subject into a value representing a three-dimensional position that does not depend on the position or orientation of the camera serving as the reference for the distance to be encoded. It then converts that three-dimensional position into the value of the foot of the perpendicular dropped from the point given by those coordinates onto a predetermined number line in three-dimensional space, and encodes that value. On the decoding side, decoding the encoded data yields a value indicating a point on the number line, and the subject indicated by that value is known to lie on the plane that passes through that point and is orthogonal to the number line. The three-dimensional position of the subject can therefore be identified using the position of the camera being decoded and the position on the image plane at which the value was sampled. Once the three-dimensional position of the subject is known, the distance from the camera to the subject can be restored from the position and orientation of the camera.

With this pre-processing, what is encoded is not the distance from the camera to the subject, which depends on the position and orientation of the camera, but a position on a preset common number line, which does not. As a result, even when the position or orientation of the camera varies temporally or spatially, the information to be encoded and the information referred to share a single representation, and efficient predictive coding becomes possible.
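A small numerical sketch of this property follows. It assumes the camera model M* = R A^-1 m* + t used in Embodiment 1 and a pinhole intrinsic matrix whose last row is (0, 0, 1); the camera poses, the test point, the choice of the world z axis as the number line, and all function names are illustrative and are not taken from the specification.

```python
import numpy as np

# Hypothetical intrinsic matrix shared by both cameras.
A = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

def project(point_world, A, R, t):
    """Pixel position (u, v) and depth d at which a camera observes a world point."""
    x = A @ (R.T @ (point_world - t))
    return x[0] / x[2], x[1] / x[2], x[2]

def back_project(u, v, d, A, R, t):
    """World point seen at pixel (u, v) with depth d: R A^-1 (u d, v d, d)^T + t."""
    return R @ np.linalg.inv(A) @ np.array([u * d, v * d, d]) + t

def line_value(point_world, p, m):
    """Signed position of the foot of the perpendicular on the line (p, unit m)."""
    return float((point_world - p) @ m)

theta = np.deg2rad(30.0)
cameras = [
    (np.eye(3), np.zeros(3)),                                # camera A
    (np.array([[ np.cos(theta), 0.0, np.sin(theta)],
               [ 0.0,           1.0, 0.0          ],
               [-np.sin(theta), 0.0, np.cos(theta)]]),
     np.array([4.0, 0.0, 0.0])),                             # camera B
]

subject = np.array([0.5, 0.2, 6.0])                    # one fixed subject point
p, m = np.zeros(3), np.array([0.0, 0.0, 1.0])          # common number line

results = []
for R, t in cameras:
    u, v, d = project(subject, A, R, t)                # what the camera measures
    g = back_project(u, v, d, A, R, t)                 # restored 3D position
    results.append((d, line_value(g, p, m)))           # (depth, value on the line)
```

The first entry of each tuple (the per-camera depth) differs between the two cameras, while the second entry (the value on the common number line) is the same for both.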

Merely converting the distance into three-dimensional coordinates would turn a one-dimensional quantity into a three-dimensional one and increase the number of samples to be encoded, so coding efficiency would not necessarily improve. In the present invention, however, the distance is converted into a one-dimensional quantity, namely a value on a number line, so the number of samples to be encoded does not increase.

Furthermore, because this conversion follows the physical process by which a camera captures objects in real space, it can be performed with very high accuracy provided the camera's projective transformation can be modeled adequately.

In general, a plane and a straight line in three-dimensional space do not necessarily intersect. That is, there is no guarantee at decoding time that the plane orthogonal to the number line intersects the ray sampled by the camera. Without an intersection, the value on the number line to be encoded cannot be determined, and the three-dimensional position of the subject cannot be restored from a given value; in other words, the distance information cannot be encoded or decoded as described above. However, by choosing as the number line a straight line that is not contained in any plane orthogonal to a straight line passing through the focal point of the camera and the projection plane, a number line for which encoding and decoding are always possible can be set.
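This condition can be checked mechanically. Under the camera model M* = R A^-1 m* + t assumed in Embodiment 1, the ray through pixel (u, v) has direction R A^-1 (u, v, 1)^T, and the condition amounts to requiring that the direction vector of the number line has a nonzero inner product with every such ray. Because that inner product is linear in (u, v), it is enough to check that it has the same strict sign at the four corner pixels. The sketch below is one possible check; the function name, the image-size arguments, and the tolerance are assumptions made here.

```python
import numpy as np

def is_admissible_direction(m, A, R, width, height, eps=1e-9):
    """True if no viewing ray of the camera is orthogonal to the direction m.

    ray(u, v) . m = (R A^-1 (u, v, 1)^T) . m = c . (u, v, 1)  with  c = A^-T R^T m,
    which is linear in (u, v); a linear function keeps one strict sign over the
    whole image rectangle exactly when it does so at the four corners.
    """
    c = np.linalg.inv(A).T @ (R.T @ m)
    corners = np.array([[0.0, 0.0, 1.0],
                        [width - 1.0, 0.0, 1.0],
                        [0.0, height - 1.0, 1.0],
                        [width - 1.0, height - 1.0, 1.0]])
    s = corners @ c
    return bool(np.all(s > eps) or np.all(s < -eps))
```

With several cameras, the same direction would have to pass this check for every camera that shares the number line.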

Any number line satisfying the above condition permits an appropriate conversion: even when the position or orientation of the camera varies temporally or spatially, the information to be encoded and the information referred to share a single representation, and efficient predictive coding can be realized. The range and distribution of the values to be encoded nevertheless depend on how the number line is chosen. Coding efficiency can therefore be improved further by using some samples to select a number line for which the variance of the converted values is small. Conversely, selecting a number line that increases the variance of the converted values yields values that are less affected by quantization distortion. When the number line is changed per sequence in this way, the decoding side is notified of which number line was used.

Alternatively, instead of the variance, the axis may be determined so that the number of distinct converted values is minimized. The distance information can then be losslessly encoded using a small number of representative values, which makes the encoding efficient.

When the value of the foot of the perpendicular on the number line in three-dimensional space is quantized for encoding, it is also preferable to select the quantization method that gives the smallest quantization error for the samples used in selecting the number line, and to quantize and encode the values to be encoded with that method.

According to the present invention, even when the camera positions or orientations on which the reference distance information and the distance information to be encoded are based differ, distance information can be encoded more efficiently than before by appropriately converting the distance information to be encoded or the reference distance information.

The present invention is described in detail below by way of embodiments. First, a distance information encoding apparatus is described as a first embodiment (hereinafter, Embodiment 1). FIG. 1 shows the configuration of the distance information encoding apparatus according to Embodiment 1.

As shown in FIG. 1, the distance information encoding apparatus 100 includes a distance information input unit 101 that inputs the distance information to be encoded, a distance information pre-conversion unit 102 that converts the input distance information into a predetermined representation format, and a distance information encoding unit 103 that actually encodes the pre-converted distance information.

The distance information pre-conversion unit 102 includes a three-dimensional point restoration unit 1021 that uses camera parameters, such as the position and orientation of the capturing camera, and the distance information to be encoded to calculate the three-dimensional coordinates of the subject indicated by that distance information; a converted distance information calculation unit 1022 that calculates the foot of the perpendicular dropped from the point represented by the calculated three-dimensional coordinates onto a predetermined number line; and a converted distance information quantization unit 1023 that quantizes the value indicated by the calculated foot of the perpendicular by a predetermined method.

FIG. 2 shows the processing flow executed by the distance information encoding apparatus 100 configured in this way. The processing executed by the distance information encoding apparatus 100 is described in detail below according to this flow.

First, the distance information to be encoded is input from the distance information input unit 101 [step A1]. In Embodiment 1, the distance information for each time and each camera is assumed to be given as a grayscale image. The distance from the camera to the subject is assumed to be appropriately quantized and represented as a pixel value, and the mapping from distance values to pixel values is denoted S. In the following, distance information is denoted D, and the distance information of a specific region is denoted by appending position information on the image enclosed in brackets [].
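The specification does not fix a particular form for S. As one illustrative possibility only, the sketch below assumes an 8-bit grayscale image and a uniform mapping between hypothetical nearest and farthest distances `NEAR` and `FAR`; the names `S` and `S_inv` mirror the notation S and S^-1 used in this description.

```python
import numpy as np

# Hypothetical working range of the scene, in arbitrary length units.
NEAR, FAR = 1.0, 10.0

def S(d):
    """Map metric distance to an 8-bit pixel value (uniform quantization, assumed)."""
    d = np.asarray(d, dtype=np.float64)
    return np.clip(np.rint(255.0 * (d - NEAR) / (FAR - NEAR)), 0, 255).astype(np.uint8)

def S_inv(value):
    """Map an 8-bit pixel value back to the metric distance it represents."""
    return NEAR + np.asarray(value, dtype=np.float64) / 255.0 * (FAR - NEAR)
```

With such a mapping, D[u, v] holds S(d) for the distance d observed at pixel (u, v), and S_inv(D[u, v]) recovers d up to half a quantization step.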

The input distance information D is converted into D′ by the distance information pre-conversion unit 102 [steps A2 to A6]. The conversion is performed sample by sample (pixel by pixel). That is, with pix denoting the sample index and numPixs the number of input samples, pix is initialized to 0 [step A2], and the processing of steps A3 and A4 below is repeated, adding 1 to pix each time [step A5], until pix reaches numPixs [step A6].

In the per-sample processing, the three-dimensional point restoration unit 1021 first uses the camera parameters to obtain the three-dimensional position g_pix, in the unified coordinate system, of the subject observed at position pix on the image [step A3]. A subject in three-dimensional space is projected onto the two-dimensional image plane according to the camera's projection model, so the three-dimensional position can be obtained by applying the inverse transformation. Specifically, it can be calculated using the following formula.
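One form of this inverse projection that is consistent with the camera parameter convention of this embodiment (M* = R A^-1 m* + t) and with the symbols S^-1 and (u_pix, v_pix) is given below. It is a reconstruction from those definitions rather than a quotation, and the symbol d for the de-quantized distance is introduced here for illustration.

```latex
g_{pix} \;=\; R\,A^{-1}
\begin{pmatrix} u_{pix}\, d \\ v_{pix}\, d \\ d \end{pmatrix} + t,
\qquad
d = S^{-1}\!\bigl(D[u_{pix}, v_{pix}]\bigr)
\tag{1}
```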

Camera parameters can be expressed in various ways, so the formula must follow the definition in use. This embodiment assumes a camera parameter representation in which the correspondence between image coordinates m and world coordinates M is given by M* = R A^-1 m* + t. Here A, R, and t are the camera's intrinsic parameter matrix, rotation matrix, and translation vector, respectively, and M* and m* are homogeneous coordinates of M and m, defined up to an arbitrary scalar multiple. A and R are 3×3 matrices and t is a three-dimensional vector. (u_pix, v_pix) is the position of pix on the image plane, and S^-1 is the inverse mapping of S.

Once the three-dimensional coordinates g_pix of pix have been obtained, the converted distance information calculation unit 1022 projects that point onto the predetermined number line V to obtain a scalar value. The converted distance information quantization unit 1023 then quantizes that value by a predetermined method to give the converted value D′[u_pix, v_pix] [step A4].
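Steps A2 to A6 can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the apparatus itself: samples are assumed to be stored in row-major order, the number line V is given by a zero point `p` and a unit direction `m`, and `S_inv` and `quantize` are caller-supplied callables standing in for S^-1 and the predetermined quantization method. The function name `pre_convert` is hypothetical.

```python
import numpy as np

def pre_convert(D, S_inv, quantize, A, R, t, p, m):
    """Convert a distance image D into D' (values on the number line V)."""
    height, width = D.shape
    A_inv = np.linalg.inv(A)
    D_conv = np.empty((height, width), dtype=np.int64)
    num_pixs = height * width

    pix = 0                                    # step A2: initialize the index
    while pix < num_pixs:                      # step A6: stop at numPixs
        v, u = divmod(pix, width)              # row-major pixel position (assumed)
        d = S_inv(D[v, u])                     # de-quantized distance
        # Step A3: restore the 3D point in the unified coordinate system.
        g = R @ A_inv @ np.array([u * d, v * d, d]) + t
        # Step A4: project onto V, then quantize.
        L = float((g - p) @ m)
        D_conv[v, u] = quantize(L)
        pix += 1                               # step A5: next sample
    return D_conv
```

When V is chosen as the camera's own optical axis (p = t, m = third column of R) and the quantizer matches S, this loop returns D unchanged, which gives a convenient sanity check.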

Any number line may be used as V provided it is not parallel to any plane orthogonal to a straight line connecting the focal point of the camera and an arbitrary pixel. The same V and the same quantization method are used for all distance information.

For example, if V is the axis perpendicular to the camera's projection plane, with its origin at the point onto which the camera's focal point projects and its direction the same as the camera's, then the value obtained by projecting g_pix is S^-1(D[u_pix, v_pix]); quantizing it with S gives back the value of D, so no conversion takes place.

A computationally simple example is to take one axis of the unified coordinate system as V and perform the conversion by quantizing one component of g_pix: the first component if V is the x axis, the second if V is the y axis, and the third if V is the z axis. In this case the distance value can be restored by solving the following simultaneous equations.
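For the case where the z axis is taken as V, a system consistent with the camera model of this embodiment (M* = R A^-1 m* + t) can be written as follows. It is a reconstruction under that assumption; x, y, and d are the unknowns and z_pix is the known converted value.

```latex
\begin{pmatrix} x \\ y \\ z_{pix} \end{pmatrix}
\;=\; R\,A^{-1}
\begin{pmatrix} u_{pix}\, d \\ v_{pix}\, d \\ d \end{pmatrix} + t
\tag{2}
```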

This is the restoration formula for the case where the z axis is taken as V in the conversion and the value of the third component of g_pix is used as the converted value without quantization, so that D′[u_pix, v_pix] = z_pix. In Formula 2 above there are three unknowns, x, y, and d, and three equations are given, so the unknowns can be solved for. If only the distance value is to be restored, solving the equation in the third row for d restores it without unnecessary computation.

When V is the straight line with point p as its zero point and unit direction vector m, the value L obtained by projecting g_pix onto V is calculated by the following formula.

L = (g_pix − p) · m (Formula 3)

Here · denotes the inner product of vectors. The distance can be restored from L by solving the following simultaneous equations.
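Under the camera model of this embodiment (M* = R A^-1 m* + t), a system with unknowns x, y, z, and d can be written as below. This is a reconstruction from the surrounding definitions, offered as an assumption rather than as the original wording.

```latex
\begin{cases}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
= R\,A^{-1}
\begin{pmatrix} u_{pix}\, d \\ v_{pix}\, d \\ d \end{pmatrix} + t \\[2ex]
L = \left( \begin{pmatrix} x \\ y \\ z \end{pmatrix} - p \right) \cdot m
\end{cases}
\tag{4}
```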

Here there are four unknowns, x, y, z, and d, and four equations are given, so the system can always be solved. Specifically, d can be obtained by using the first formula to express each of x, y, and z in terms of d and substituting these into the second formula.
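As a worked illustration of that substitution, assume the camera model M* = R A^-1 m* + t of this embodiment, so that (x, y, z)^T = d R A^-1 (u_pix, v_pix, 1)^T + t. Substituting into L = ((x, y, z)^T − p) · m gives a single linear equation in d:

```latex
L = \Bigl( d\, R A^{-1} (u_{pix}, v_{pix}, 1)^{\mathsf T} + t - p \Bigr) \cdot m
\quad\Longrightarrow\quad
d = \frac{L - (t - p)\cdot m}
         {\bigl( R A^{-1} (u_{pix}, v_{pix}, 1)^{\mathsf T} \bigr)\cdot m}
```

The denominator is the inner product of the viewing ray direction with m, which is nonzero exactly when V satisfies the condition placed on the number line.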

The axis may also be determined from some of the distance information, taking the variance of the converted value L into account. This optimization with respect to variance can be solved by performing principal component analysis on a sample set of distance information. In this setting, maximizing the variance gives a representation that is less affected by quantization, while minimizing it gives a compact representation. Principal component analysis may also be performed on the sample set of distance information to determine the axis for which the number of distinct principal component scores is smallest. Determining the axis so that the number of distinct converted values is minimized allows the distance information to be losslessly encoded using a small number of representative values.
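The variance-based selection can be sketched with a standard principal component analysis. The code below assumes that the sample distance information has already been restored to three-dimensional points (one row per sample) and uses an eigen-decomposition of their covariance matrix; the function name and return layout are choices made here. It covers only the minimum-variance and maximum-variance cases, not the variant that counts distinct values, and a selected direction would still have to satisfy the condition placed on the number line.

```python
import numpy as np

def principal_axes(points):
    """Candidate number-line directions from sample 3D points (N x 3 array).

    Returns (mean, m_min, m_max, eigvals):
      m_min   unit direction along which the projected values have minimum variance,
      m_max   unit direction along which they have maximum variance,
      eigvals variances along the principal directions, in ascending order.
    """
    mean = points.mean(axis=0)
    cov = np.cov(points - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    return mean, eigvecs[:, 0], eigvecs[:, -1], eigvals
```

Using `mean` as the zero point p and `m_min` or `m_max` as the direction m gives the compact or the quantization-robust representation, respectively.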

Either linear or nonlinear quantization may be used. Quantization may be applied to the converted value L or to the reciprocal of L. If the distance value itself is what matters, L is best quantized directly; if the amount of parallax produced by the subject distance is what matters, the reciprocal of L is best quantized. This is because L is proportional to the subject distance for a given viewpoint, whereas the amount of parallax is proportional to the reciprocal of the subject distance.

For quantization, a quantization method setting means may also be provided that sets, for example for the above sample set of distance information, the quantization method giving the smallest quantization error; information indicating the set quantization method is then encoded, and the values to be encoded are quantized using that method.
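A possible shape for such a quantization method setting means is sketched below. It is only an illustration: the two candidates (uniform in L and uniform in 1/L), the 256 levels, the assumption that L is positive over the working range, and the use of mean squared error on L as the error measure are all choices made here, since the specification does not fix them.

```python
import numpy as np

def make_linear_quantizer(lo, hi, levels=256, inverse=False):
    """Return (quantize, dequantize), uniform in L or, if inverse=True, in 1/L.

    Assumes 0 < lo < hi so that the reciprocal is well defined.
    """
    f = (lambda x: 1.0 / x) if inverse else (lambda x: x)   # f is its own inverse
    a, b = f(lo), f(hi)

    def quantize(L):
        L = np.asarray(L, dtype=np.float64)
        return np.rint((f(L) - a) / (b - a) * (levels - 1)).astype(int)

    def dequantize(q):
        q = np.asarray(q, dtype=np.float64)
        return f(a + q / (levels - 1) * (b - a))

    return quantize, dequantize

def select_quantizer(samples, candidates):
    """Name of the candidate with the smallest mean squared error on the samples."""
    def mse(name):
        quantize, dequantize = candidates[name]
        return float(np.mean((dequantize(quantize(samples)) - samples) ** 2))
    return min(candidates, key=mse)
```

The selected name (or an index for it) is what would be signaled to the decoding side so that it can apply the matching inverse quantization.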

Once D′ has been obtained for every pix, the converted distance information D′ is encoded by the distance information encoding unit 103 [step A7]. The result of this encoding is the output of the distance information encoding apparatus 100.

The axis V of the number line and the quantization method must be the same on the encoding side and the decoding side. That is, either values fixed in advance for both sides are always used, or the values used are encoded and signaled to the decoding side.

With this pre-conversion, a subject is represented by the same value regardless of the position and orientation of the camera, as long as the position of the subject does not change. Changes in the value also track changes in the position of the subject, so the inter-frame correlation of the information to be encoded increases and efficient predictive coding can be realized.

In Embodiment 1 the coordinates in the unified coordinate system are obtained and D′ is calculated sample by sample, but D′ may instead be calculated after the coordinates in the unified coordinate system have been obtained for all samples.

Next, a distance information decoding apparatus is described as a second embodiment (hereinafter, Embodiment 2). FIG. 3 shows the configuration of the distance information decoding apparatus according to Embodiment 2.

As shown in FIG. 3, the distance information decoding apparatus 200 includes a distance information encoded data input unit 201 that inputs the encoded data of the distance information to be decoded, a distance information decoding unit 202 that actually decodes the input encoded data, and a distance information post-conversion unit 203 that converts the decoding result into information representing the actual distance from the camera to the subject.

The distance information post-conversion unit 203 includes: a decoded distance information inverse quantization unit 2031 that inversely quantizes the decoded distance information by a predetermined method; a subject plane setting unit 2032 that sets a plane passing through the point on a predetermined number line corresponding to the inversely quantized value and orthogonal to that number line; a subject ray setting unit 2033 that sets, as the subject ray, the straight line connecting the focal point of the camera that captured the distance information to be decoded and the position of that distance information on the camera's projection plane; and a subject distance calculation unit 2034 that takes the distance from the camera to the intersection of the subject plane and the subject ray as the distance from the camera to the subject for the decoding target.

FIG. 4 shows the processing flow executed by the distance information decoding device 200 configured in this way. The processing executed by the distance information decoding device 200 is described in detail below, following this flow.

First, the encoded data of the distance information to be decoded is input through the distance information encoded data input unit 201 [Step B1]. The input encoded data is decoded by the distance information decoding unit 202 [Step B2]. In Embodiment 2, the distance information for each time and each camera is assumed to be represented as a grayscale image. That is, the decoding process here yields pseudo-distance information, namely the distance information to be recovered with some transformation applied to it. In the following, the decoded pseudo-distance information is denoted Dec′, and the pseudo-distance information of a specific region is denoted by appending position information enclosed in brackets [ ].

The decoded pseudo-distance information Dec′ is converted by the distance information post-conversion unit 203 into distance information Dec representing the actual distance from the camera to the subject [Steps B3-B7]. The conversion is performed pixel by pixel. That is, with pix denoting the pixel index and numPixs the number of pixels, pix is initialized to 0 [Step B3], and the processing of Steps B4-B5 below is then repeated, adding 1 to pix each time [Step B6], until pix reaches numPixs [Step B7].
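The loop of Steps B3-B7 can be sketched as follows. The function name `post_convert` and the callback `pixel_to_distance`, which stands in for the per-pixel processing of Steps B4-B5, are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Sequence


def post_convert(
    dec_prime: Sequence[int],
    pixel_to_distance: Callable[[int, int], float],
) -> List[float]:
    """Convert pseudo-distance values Dec' into distance values Dec, pixel by pixel."""
    num_pixs = len(dec_prime)
    dec = [0.0] * num_pixs
    pix = 0                                                 # Step B3: initialize pix
    while pix < num_pixs:                                   # Step B7: stop at numPixs
        dec[pix] = pixel_to_distance(pix, dec_prime[pix])   # Steps B4-B5
        pix += 1                                            # Step B6: advance
    return dec


# Placeholder conversion used only to exercise the loop;
# it is NOT the plane/ray computation of Steps B4-B5.
demo = post_convert([10, 20, 30], lambda pix, value: value * 0.5)
```

Because each pixel is converted independently of the others, the same loop could equally be run in any order or in parallel.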

In the per-pixel processing, first, the plane P in the unified coordinate system that contains the three-dimensional point of the subject observed at position pix on the image is obtained from the pseudo-distance information Dec′[pix] [Step B4]. The decoded pseudo-distance information Dec′[pix] is the quantized scalar value obtained when the three-dimensional point of the subject is projected onto a number line V. The plane P is therefore the plane orthogonal to the number line V at the position indicated by Dec′[pix]. The number line and the quantization method used are those set in advance on the encoding side. The same fixed ones may always be used, or a number line and quantization method suited to each sequence may be designed and used. In the latter case, information indicating them is encoded and signalled to this decoding device.

Here, let the number line V be the straight line whose zero point is the point p0 and whose direction is the unit vector n, and let Q be the quantization function. The plane P is then expressed by the following equation, where g denotes an arbitrary point on the plane.

$$n \cdot \bigl(p_0 + Q^{-1}(\mathrm{Dec}'[\mathrm{pix}])\,n\bigr) = n \cdot g \qquad \text{(Equation 5)}$$

Next, from this plane P and the camera parameters, the distance information of the subject observed at position pix is calculated, and the result is set as Dec[pix] [Step B5]. Specifically, the straight line on which the subject lies is obtained from the camera parameters and the position of pix, and the distance is obtained on the assumption that the subject lies at the intersection of that line and the plane. The straight line on which the subject lies is expressed by Equation 6.

The camera parameters are expressed in the same way as in Embodiment 1: A, R, and t denote the camera's intrinsic parameter matrix, rotation matrix, and translation vector, respectively. d is an arbitrary scalar variable, and g denotes a point on the line. (u_pix, v_pix) denotes the position of pix on the image plane.

Since d in Equation 6 represents the distance from the camera to the subject, the distance is obtained by solving Equations 5 and 6 simultaneously for d. Although g represents the three-dimensional coordinates of the subject, it need not be computed here. As described in Embodiment 1, if the number line V is chosen so that it is not parallel to any plane orthogonal to a straight line connecting the camera's focal point and any pixel, the simultaneous equations formed by Equations 5 and 6 always have exactly one solution.
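The solve for d can be sketched as follows. The sketch assumes the subject ray has already been written in the form g = o + d·r; how the origin o and direction r follow from A, R, t and (u_pix, v_pix) depends on the camera-parameter convention of Embodiment 1 and is deliberately left outside the code. Substituting the ray into Equation 5 gives d = n · (p0 + s·n − o) / (n · r) with s = Q⁻¹(Dec′[pix]). The function name `subject_distance`, its parameters, and the sample values are hypothetical.

```python
from typing import Callable, Sequence

Vec3 = Sequence[float]


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def subject_distance(
    dec_prime_value: int,
    inverse_q: Callable[[int], float],   # Q^-1, shared with the encoder
    p0: Vec3,                            # zero point of the number line V
    n: Vec3,                             # unit direction vector of V
    ray_origin: Vec3,                    # o in g = o + d * r (assumed ray form)
    ray_dir: Vec3,                       # r in g = o + d * r (assumed ray form)
    eps: float = 1e-12,
) -> float:
    """Return d at which the ray meets the plane orthogonal to V at Q^-1(Dec')."""
    s = inverse_q(dec_prime_value)
    denom = dot(n, ray_dir)
    if abs(denom) < eps:
        # V is parallel to a plane orthogonal to this ray: no unique intersection.
        raise ValueError("ray is parallel to the subject plane")
    plane_point = [p + s * c for p, c in zip(p0, n)]            # p0 + s * n
    return dot(n, [q - o for q, o in zip(plane_point, ray_origin)]) / denom


# Hypothetical values: axis along z, camera one unit up the axis looking along +z.
d = subject_distance(
    50, lambda level: 0.1 * level,
    p0=(0.0, 0.0, 0.0), n=(0.0, 0.0, 1.0),
    ray_origin=(0.0, 0.0, 1.0), ray_dir=(0.0, 0.0, 1.0),
)
```

The zero-denominator check corresponds to the uniqueness condition on V stated above: when n · r is non-zero for every pixel ray, the division is always defined. Whether the returned d is a distance along the ray or a depth depends on how r is scaled in the ray equation.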

In this embodiment, the distance from the camera to the subject is itself used as the output value Dec[pix]. If a value obtained by quantizing that distance by a predetermined method must be output instead, the value of d obtained above is quantized by the given method and then output.

The processing described above can also be implemented by a computer and a software program. The program can be provided recorded on a computer-readable recording medium or provided over a network.

The embodiments above have been described mainly in terms of the distance information encoding device and the distance information decoding device. The distance information encoding method and the distance information decoding method of the present invention can be realized by steps corresponding to the operation of each unit of these devices.

Embodiments of the present invention have been described above with reference to the drawings. These embodiments are merely examples of the present invention, and the present invention is clearly not limited to them. Accordingly, components may be added, omitted, replaced, or otherwise modified without departing from the spirit and scope of the present invention.

FIG. 1 is a diagram showing the configuration of the distance information encoding device of Embodiment 1.
FIG. 2 is a flowchart of distance information encoding in Embodiment 1.
FIG. 3 is a diagram showing the configuration of the distance information decoding device of Embodiment 2.
FIG. 4 is a flowchart of distance information decoding in Embodiment 2.
FIG. 5 is a diagram showing that the distance to the same subject differs depending on the position and orientation of the camera.
FIG. 6 is a diagram showing that the difference in distance between cameras differs depending on the subject.

Explanation of symbols

100 Distance information encoding device
101 Distance information input unit
102 Distance information pre-conversion unit
1021 Three-dimensional point restoration unit
1022 Transformed distance information calculation unit
1023 Transformed distance information quantization unit
103 Distance information encoding unit
200 Distance information decoding device
201 Distance information encoded data input unit
202 Distance information decoding unit
203 Distance information post-conversion unit
2031 Decoded distance information inverse quantization unit
2032 Subject plane setting unit
2033 Subject ray setting unit
2034 Subject distance calculation unit

Claims

A distance information encoding method for encoding distance information representing the distance from a camera to a subject for an image captured by the camera, the method comprising:
a three-dimensional point restoration step of calculating the three-dimensional coordinates of the subject indicated by the distance information, using the parameters of the camera that captured the image and the distance information to be encoded;
a transformed distance information calculation step of calculating the foot of the perpendicular dropped from the point represented by the calculated three-dimensional coordinates onto a predetermined number line;
a transformed distance information quantization step of quantizing the value indicating the calculated foot of the perpendicular by a predetermined method; and
a transformed distance information encoding step of encoding the quantized value.

The distance information encoding method according to claim 1,
wherein a straight line that does not lie in any plane orthogonal to a straight line passing through the focal point of the camera and the projection plane is used as the number line in the transformed distance information calculation step.

The distance information encoding method according to claim 1, further comprising:
an axis setting step of performing principal component analysis on a subset of the three-dimensional point set obtained in the three-dimensional point restoration step and setting the axis that maximizes the variance of the principal component scores; and
an axis information encoding step of encoding information indicating the axis set in the axis setting step,
wherein the axis set in the axis setting step is used as the number line in the transformed distance information calculation step.

The distance information encoding method according to claim 1, further comprising:
an axis setting step of performing principal component analysis on a subset of the three-dimensional point set obtained in the three-dimensional point restoration step and setting the axis that minimizes the variance of the principal component scores; and
an axis information encoding step of encoding information indicating the axis set in the axis setting step,
wherein the axis set in the axis setting step is used as the number line in the transformed distance information calculation step.

The distance information encoding method according to claim 1, further comprising:
an axis setting step of performing principal component analysis on a subset of the three-dimensional point set obtained in the three-dimensional point restoration step and setting the axis that minimizes the number of values that occur as principal component scores; and
an axis information encoding step of encoding information indicating the axis set in the axis setting step,
wherein the axis set in the axis setting step is used as the number line in the transformed distance information calculation step.

The distance information encoding method according to any one of claims 3 to 5, further comprising:
a sample transformed distance information calculation step of calculating, for each point included in the subset of the three-dimensional point set used in the axis setting step, the foot of the perpendicular dropped from that point onto the axis obtained in the axis setting step;
a quantization method setting step of setting a quantization method that minimizes the quantization error when the values indicating the calculated feet of the perpendiculars are quantized into a predetermined number of values; and
a quantization method encoding step of encoding information representing the set quantization method,
wherein, in the transformed distance information quantization step, the value indicating the foot of the perpendicular calculated in the transformed distance information calculation step is quantized using the quantization method set in the quantization method setting step.

A distance information decoding method for decoding encoded data of distance information representing the distance from a camera to a subject for an image captured by the camera, the method comprising:
an input data decoding step of decoding the input encoded data to obtain decoded distance information;
a decoded distance information inverse quantization step of inversely quantizing the decoded distance information by a predetermined method;
a subject plane setting step of setting a plane that passes through the point on a predetermined number line corresponding to the inversely quantized value and is orthogonal to that number line;
a subject ray setting step of setting, as the subject ray, the straight line connecting the focal point of the camera that captured the distance information to be decoded and the position of that distance information on the projection plane of the camera; and
a subject distance calculation step of calculating the distance from the camera to the intersection of the subject plane and the subject ray and taking it as the distance from the camera to the subject for the decoding target.

The distance information decoding method according to claim 7, further comprising:
an axis information decoding step of decoding encoded data of information defining an axis in three-dimensional space,
wherein, in the subject plane setting step, the axis decoded in the axis information decoding step is used as the number line.

The distance information decoding method according to claim 7 or 8, further comprising:
a quantization method decoding step of decoding encoded data of information indicating the method used to quantize values on the number line,
wherein, in the decoded distance information inverse quantization step, the inverse of the quantization method decoded in the quantization method decoding step is used as the inverse quantization method.

A distance information encoding device that encodes distance information representing the distance from a camera to a subject for an image captured by the camera, the device comprising:
three-dimensional point restoration means for calculating the three-dimensional coordinates of the subject indicated by the distance information, using the parameters of the camera that captured the image and the distance information to be encoded;
transformed distance information calculation means for calculating the foot of the perpendicular dropped from the point represented by the calculated three-dimensional coordinates onto a predetermined number line;
transformed distance information quantization means for quantizing the value indicating the calculated foot of the perpendicular by a predetermined method; and
transformed distance information encoding means for encoding the quantized value.

A distance information decoding device that decodes encoded data of distance information representing the distance from a camera to a subject for an image captured by the camera, the device comprising:
input data decoding means for decoding the input encoded data to obtain decoded distance information;
decoded distance information inverse quantization means for inversely quantizing the decoded distance information by a predetermined method;
subject plane setting means for setting a plane that passes through the point on a predetermined number line corresponding to the inversely quantized value and is orthogonal to that number line;
subject ray setting means for setting, as the subject ray, the straight line connecting the focal point of the camera that captured the distance information to be decoded and the position of that distance information on the projection plane of the camera; and
subject distance calculation means for calculating the distance from the camera to the intersection of the subject plane and the subject ray and taking it as the distance from the camera to the subject for the decoding target.

A distance information encoding program for causing a computer to execute the distance information encoding method according to any one of claims 1 to 6.

A computer-readable recording medium on which the distance information encoding program according to claim 12 is recorded.

A distance information decoding program for causing a computer to execute the distance information decoding method according to any one of claims 7 to 9.

A computer-readable recording medium on which the distance information decoding program according to claim 14 is recorded.