JP5355606B2

JP5355606B2 - Stereo video encoding method, apparatus, and program

Info

Publication number: JP5355606B2
Application number: JP2011046557A
Authority: JP
Inventors: 裕江岩崎; 卓佐野; 隆之大西; 淳嵯峨田; 一人上倉
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2011-03-03
Filing date: 2011-03-03
Publication date: 2013-11-27
Anticipated expiration: 2031-03-03
Also published as: JP2012186544A

Abstract

PROBLEM TO BE SOLVED: To provide a stereo video encoding method that achieves high compression of stereo video through inter-view prediction even when the base view and the non-base view are encoded independently of each other with existing encoders.

SOLUTION: The stereo video encoding method accepts data consisting of a base view and a non-base view as input and outputs one bit stream. An inter-view prediction vector is generated by inter-view prediction between the base view and the non-base view, and the vector is superimposed, as data, on the non-base view to generate a superimposed non-base view. The superimposed non-base view and the base view are combined to generate a new non-base view. The base view is encoded with a specific encoding method, the new non-base view is encoded with a specific encoding method, and the encoded data of the base view is removed from the encoded data of the new non-base view. The encoded base view and the encoded data from which the base-view data has been removed are multiplexed and output as one bit stream.

COPYRIGHT: (C)2012, JPO&INPIT

Description

The present invention relates to a stereo video encoding method, apparatus, and program for encoding the digital signal of a stereo video in real time.

H.264 High Profile, a technique for compressing and encoding the digital signal of a moving image, is used for Blu-ray, One Seg, and the like. When such an encoding technique is used to compress a stereo video, that is, a two-viewpoint video, encoding the base view and the non-base view as independent videos does not achieve sufficient compression efficiency. To increase the compression efficiency of stereo video, it is essential to realize inter-view prediction.

For example, H.264 Stereo High Profile, which is also adopted in Blu-ray and the like, is attracting attention as a technique for compressing and encoding the digital signal of a stereo video using inter-view prediction. In H.264 Stereo High Profile, the left-eye video and the right-eye video can be encoded as one base view and one non-base view, respectively. The base view remains compatible with H.264 High Profile and can be decoded as H.264 High Profile. The non-base view is encoded with reference to frames contained in the other view (inter-view prediction). Using inter-view prediction makes high compression of stereo video possible (Non-Patent Document 1).

Non-Patent Document 1: Anthony Vetro, Thomas Wiegand, Gary J. Sullivan: "Overview of the Stereo and Multiview Video Coding Extensions of the H.264/MPEG-4 AVC Standard," Proceedings of the IEEE, Vol. 99, No. 4, April 2011 (to be issued)

However, H.264 Stereo High Profile does not specify how it is to be implemented. Unless the base view and the non-base view are encoded independently, the base view must be shared with the encoding of the non-base view, so existing encoders, such as those implementing the existing H.264 High Profile encoding method, cannot be used.

The present invention therefore performs inter-view prediction outside the H.264 High Profile encoding apparatus, superimposes the inter-view prediction vector data of the non-base view on the image data of the non-base view as additional information, and inputs the result to the H.264 High Profile encoding apparatus that encodes the non-base view. It thereby provides a stereo video encoding method, apparatus, and program that achieve high compression of stereo video through inter-view prediction even when the base view and the non-base view are encoded independently by existing H.264 High Profile encoding apparatuses.

To solve the problem described above, the present invention is a video encoding method in a video encoding apparatus that receives two kinds of video data, a base view that is one image of a stereo image and a non-base view that is the other image, and outputs one bit stream. The video encoding apparatus executes: a first step of generating an inter-view prediction vector by inter-view prediction between the base view and the non-base view; a second step of superimposing the inter-view prediction vector as data on the non-base view and outputting a superimposed non-base view; a third step of combining the superimposed non-base view with the base view used in the first step and outputting the result as a new non-base view; a fourth step of encoding the base view used in the first step with a specific encoding method; a fifth step of encoding the non-base view output by the third step with a specific encoding method; a sixth step of deleting the encoded data of the base view used in the first step from the encoded data produced by the fifth step; and a seventh step of multiplexing the encoded data from the fourth step and the sixth step and outputting one bit stream.
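
The seven steps can be sketched as a small data-flow function. This is only an illustration of the order of operations under stated assumptions: `encode_stereo`, `toy_predict`, and `toy_encode` are hypothetical names, and the toy encoder merely labels its input instead of producing H.264 data.

```python
from typing import Callable, List, Tuple

Frame = List[List[int]]   # toy grayscale picture (rows of samples)
Unit = Tuple[str, bytes]  # (view label, payload): stand-in for encoded data


def encode_stereo(base: Frame, non_base: Frame,
                  predict: Callable[[Frame, Frame], bytes],
                  encode: Callable[[str, object], Unit]) -> List[Unit]:
    # Step 1: inter-view prediction between the two views.
    vectors = predict(base, non_base)
    # Step 2: attach the vectors to the non-base view as data.
    superimposed = (non_base, vectors)
    # Step 3: pair it with the same base view to form the new non-base view.
    new_non_base = [("base", base), ("non_base", superimposed)]
    # Step 4: the first encoder handles the base view alone.
    base_stream = [encode("base", base)]
    # Step 5: the second encoder handles everything in the new non-base view.
    second_stream = [encode(label, data) for label, data in new_non_base]
    # Step 6: drop the base-view data from the second encoder's output.
    non_base_stream = [u for u in second_stream if u[0] != "base"]
    # Step 7: multiplex both results into one stream.
    return base_stream + non_base_stream


def toy_predict(base: Frame, non_base: Frame) -> bytes:
    return b"v"  # placeholder vector data (assumption)


def toy_encode(label: str, data: object) -> Unit:
    return (label, repr(data).encode())  # placeholder "encoder" (assumption)


stream = encode_stereo([[1]], [[2]], toy_predict, toy_encode)
```

Because the two encoders are passed in as plain callables, the sketch reflects the point of the method: neither encoder needs to know about the other view.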

Further, in the video encoding method described above, the first step performs inter-view prediction using images input at the timing that becomes an I picture in the GOP structure of the base view, and the video encoding apparatus further comprises a ninth step of using, in the encoding of the fifth step, the parameters used when encoding in the fourth step, and a tenth step of performing, in the fifth step, encoding with the parameters used in the ninth step.

Further, in the video encoding method described above, the time information for encoding in units of pictures and the encoding parameters used when the base-view image was encoded are used as the parameters in the ninth step and the tenth step.

Further, in the video encoding method described above, the encoding parameters include a quantization matrix and a parameter for determining the quantization value.

The present invention is also a video encoding apparatus that receives two kinds of video data, a base view that is one image of a stereo image and a non-base view that is the other image, and outputs one bit stream. The apparatus comprises: an inter-view prediction unit that generates an inter-view prediction vector by inter-view prediction between the base view and the non-base view; an inter-view prediction vector superimposing unit that superimposes the inter-view prediction vector as data on the non-base view to generate a superimposed non-base view, and combines the superimposed non-base view with the base view used by the inter-view prediction unit to generate a new non-base view; a first encoding unit that encodes the base view used by the inter-view prediction unit with a specific encoding method; a second encoding unit that encodes the new non-base view generated by the inter-view prediction vector superimposing unit with a specific encoding method; a base view stream deletion unit that deletes the encoded data of the base view used by the inter-view prediction unit from the encoded data obtained by the second encoding unit; and a multiplexing unit that multiplexes the encoded data obtained by the first encoding unit and the encoded data obtained by the base view stream deletion unit and outputs one bit stream.

The present invention is also a video encoding program for causing a computer to execute the processing used to realize the video encoding method described above.

As described above, according to the present invention, high compression of stereo video by inter-view prediction can be achieved even when the base view and the non-base view are encoded independently using existing encoding apparatuses.
In addition, according to the present invention, the parameters can be made identical between the two encoding apparatuses, and as a result coding degradation can be prevented.

FIG. 1 is a schematic block diagram showing the configuration of a stereo video encoding apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram explaining the operation of generating an inter-view prediction vector and superimposing it on the non-base view.
FIG. 3 is a diagram explaining a method of sending and receiving data between the H.264 High Profile encoding apparatus 7 and the H.264 High Profile encoding apparatus 8.
FIG. 4 is a diagram explaining the parameters 33 and 34.
FIG. 5 is a diagram specifically explaining the time information and encoding parameters 43 and the time information and encoding parameters 44.

A stereo video encoding apparatus according to an embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a schematic block diagram showing the configuration of the stereo video encoding apparatus according to an embodiment of the present invention.

Reference numeral 1 denotes a base-view image frame and reference numeral 2 denotes a non-base-view image frame; together they are the stereo image frames input to the inter-view prediction unit 3. Here, the base view corresponds to, for example, the left-eye image, and the non-base view to the right-eye image. The inter-view prediction unit 3 calculates prediction vectors from the non-base view to the base view.
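
The description does not say which search the inter-view prediction unit 3 uses. As a minimal sketch, assuming a horizontal block-matching search with a sum-of-absolute-differences cost (a common choice for stereo pairs, not one confirmed by this document), a prediction vector for one non-base-view block could be found as follows. The names `sad` and `block_vector` and the block and search sizes are hypothetical.

```python
from typing import List, Optional

Frame = List[List[int]]  # rows of luma samples


def sad(base: Frame, non_base: Frame, y: int, x: int, dx: int, size: int) -> int:
    """Sum of absolute differences between a non-base block and the
    base-view block displaced horizontally by dx."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(non_base[y + j][x + i] - base[y + j][x + i + dx])
    return total


def block_vector(base: Frame, non_base: Frame, y: int, x: int,
                 size: int = 4, search: int = 3) -> int:
    """Return the horizontal offset into the base view that best matches
    the non-base block whose top-left corner is (x, y)."""
    width = len(base[0])
    best_dx = 0
    best_cost: Optional[int] = None
    for dx in range(-search, search + 1):
        # Skip candidates that would read outside the base picture.
        if x + dx < 0 or x + dx + size > width:
            continue
        cost = sad(base, non_base, y, x, dx, size)
        if best_cost is None or cost < best_cost:
            best_dx, best_cost = dx, cost
    return best_dx
```

Restricting the search to horizontal offsets assumes the two cameras are aligned; a general implementation would also search vertically.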

Reference numeral 4 denotes an inter-view prediction vector superimposing unit that superimposes the prediction vectors on the non-base view. Reference numeral 5 denotes a base-view image frame, identical to the base-view image frame 1 in FIG. 1. Reference numeral 6 denotes a new non-base view that combines the superimposed non-base view processed by the inter-view prediction vector superimposing unit 4 with the base-view image used by the inter-view prediction unit 3. Reference numeral 7 denotes an H.264 High Profile encoding apparatus that encodes the base view with H.264 High Profile.
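
The description does not specify how the vector data is carried in the image data. Purely as an illustration, assuming the vectors are packed into extra sample rows appended below the picture (an assumption of this sketch, not the patent's stated layout), superimposition and recovery could look like this. `superimpose` and `extract` are hypothetical names.

```python
from typing import List

Frame = List[List[int]]  # rows of 8-bit samples


def superimpose(frame: Frame, vectors: List[int]) -> Frame:
    """Append rows holding a vector count followed by the vectors.

    Assumes at most 255 vectors, each in -128..127, so every value
    fits in one 8-bit sample after adding an offset of 128.
    """
    width = len(frame[0])
    payload = [len(vectors)] + [v + 128 for v in vectors]
    while len(payload) % width:  # pad to a whole number of rows
        payload.append(0)
    extra = [payload[i:i + width] for i in range(0, len(payload), width)]
    return [row[:] for row in frame] + extra


def extract(frame: Frame, height: int) -> List[int]:
    """Recover the vectors from the rows below the original picture."""
    flat = [s for row in frame[height:] for s in row]
    count = flat[0]
    return [s - 128 for s in flat[1:1 + count]]
```

The original picture height must be known to the reader of the data, which is why `extract` takes it as a parameter.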

Reference numeral 8 denotes an H.264 High Profile encoding apparatus that encodes the superimposed non-base view with H.264 High Profile. Reference numeral 9 denotes a data path over which data is sent and received between the H.264 High Profile encoding apparatus 7 and the H.264 High Profile encoding apparatus 8 when the base view and the non-base view are encoded. Reference numeral 10 denotes a syntax modification unit that replaces the syntax with H.264 Stereo High Profile headers. Syntax such as headers can normally be changed by replacing a program. The syntax modification unit 10 rewrites the relevant parts of the H.264 High Profile bit stream of the superimposed non-base view into H.264 Stereo High Profile headers.
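
As a narrow illustration of header rewriting, the sketch below patches only the `profile_idc` byte of a sequence parameter set from High Profile (100) to Stereo High Profile (128). It assumes the NAL unit is given without its start code, and `rewrite_profile` is a hypothetical name. A conforming Stereo High stream needs more than this one field, so this is not a description of what the syntax modification unit 10 actually does in full.

```python
HIGH_PROFILE_IDC = 100         # profile_idc of H.264 High Profile
STEREO_HIGH_PROFILE_IDC = 128  # profile_idc of H.264 Stereo High Profile
NAL_TYPE_SPS = 7               # nal_unit_type of a sequence parameter set


def rewrite_profile(nal: bytes) -> bytes:
    """Return the NAL unit with profile_idc switched from High to Stereo High.

    `nal` excludes the start code: byte 0 is the NAL header and, for a
    sequence parameter set, byte 1 is profile_idc.
    """
    if len(nal) < 2:
        return nal
    nal_type = nal[0] & 0x1F  # low five bits of the NAL header
    if nal_type == NAL_TYPE_SPS and nal[1] == HIGH_PROFILE_IDC:
        return nal[:1] + bytes([STEREO_HIGH_PROFILE_IDC]) + nal[2:]
    return nal  # other NAL units pass through unchanged
```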

Reference numeral 11 denotes a base view stream deletion unit that removes the stream of the base-view portion that was additionally input. As a result, only the non-base-view bit stream (13) is output. Reference numeral 12 denotes the base-view bit stream, and reference numeral 13 denotes the non-base-view bit stream. Reference numeral 14 denotes a multiplexing unit that multiplexes the base-view bit stream 12 and the non-base-view bit stream 13. Reference numeral 15 denotes the H.264 Stereo High Profile bit stream multiplexed by the multiplexing unit 14.
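
The deletion and multiplexing stages can be sketched on labeled units. The `Unit` record, its `pts` field, and the choice to order the output by time with the base view first at each instant are assumptions made for this illustration; the description itself only states that the two bit streams are multiplexed into one.

```python
from typing import List, NamedTuple


class Unit(NamedTuple):
    view: str       # "base" or "non_base"
    pts: int        # picture time (hypothetical field)
    payload: bytes  # stand-in for encoded data


def delete_base_view(units: List[Unit]) -> List[Unit]:
    """Keep only the non-base-view units of the second encoder's output."""
    return [u for u in units if u.view != "base"]


def multiplex(base_units: List[Unit], non_base_units: List[Unit]) -> List[Unit]:
    """Merge both streams by time, base view first at each instant."""
    order = {"base": 0, "non_base": 1}
    return sorted(base_units + non_base_units,
                  key=lambda u: (u.pts, order[u.view]))
```

For example, if the second encoder emitted base and non-base units for times 0 and 1, `delete_base_view` leaves the two non-base units, and `multiplex` interleaves them with the first encoder's base units.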

FIG. 2 specifically illustrates a method of generating an inter-view prediction vector from the base view and the non-base view and superimposing that vector on the non-base view.
In this figure, reference numeral 21 denotes a sequence of base-view pictures and reference numeral 22 denotes a sequence of non-base-view pictures. Reference numerals 23 and 25 denote pictures at the timing that becomes an I picture in the GOP (Group Of Pictures) of the base-view encoding, and they belong to the base-view sequence 21. Reference numeral 24 denotes the non-base-view picture input at the same time as picture 23, and it belongs to the non-base-view sequence 22. Reference numeral 26 denotes the non-base-view picture input at the same time as picture 25, and it also belongs to the non-base-view sequence 22.

Reference numeral 27 denotes an inter-view prediction unit, which performs inter-view prediction on pictures at the timing encoded as I pictures, such as pictures 23 and 24. Spacing out the inter-view predictions in this way reduces the amount of computation per unit time for inter-view prediction and allows a smaller circuit. Reference numeral 28 denotes an inter-view prediction vector superimposing unit that superimposes the inter-view prediction vectors obtained by the inter-view prediction unit 27 on the image data of the non-base view for encoding with the non-base view's H.264 High Profile.
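
The saving from predicting only at I-picture timing can be illustrated with a short sketch. It assumes a fixed-length GOP whose first picture is the I picture; `prediction_indices` and the GOP length used below are hypothetical.

```python
from typing import List


def prediction_indices(num_pictures: int, gop_size: int) -> List[int]:
    """Indices of the pictures on which inter-view prediction runs,
    assuming every GOP has gop_size pictures and starts with an I picture."""
    if gop_size <= 0:
        raise ValueError("gop_size must be positive")
    return [i for i in range(num_pictures) if i % gop_size == 0]


# With 30 pictures and a GOP of 15, prediction runs on 2 pictures
# instead of all 30.
selected = prediction_indices(30, 15)
```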

The inter-view prediction unit 27 corresponds to the inter-view prediction unit 3 of FIG. 1, and the inter-view prediction vector superimposing unit 28 corresponds to the inter-view prediction vector superimposing unit 4 of FIG. 1.

FIG. 3 specifically illustrates a method of sending and receiving data between the H.264 High Profile encoding apparatus 7 of FIG. 1 and the H.264 High Profile encoding apparatus 8 of FIG. 1.
Reference numeral 31 denotes an H.264 High Profile encoding apparatus that encodes the base view. Reference numeral 32 denotes an H.264 High Profile encoding apparatus that encodes the non-base view. Reference numeral 33 denotes the parameters used when encoding pictures 23 and 25 of FIG. 2; they are stored in the H.264 High Profile encoding apparatus 31. Reference numeral 34 denotes the parameters used when encoding pictures 24 and 26 of FIG. 2; they are stored in the H.264 High Profile encoding apparatus 32.

Reference numeral 35 denotes a data path over which data is sent and received between the H.264 High Profile encoding apparatus 31 and the H.264 High Profile encoding apparatus 32 when the base view and the non-base view are encoded. Specifically, the data path 35 provides data exchange between the hosts of the H.264 High Profile encoding apparatus 31 and the H.264 High Profile encoding apparatus 32. The data path 35 sends and receives data, for example, picture by picture. To make encoding with this data exchange possible, the input of the series of pictures to the non-base view is delayed.
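
The picture-by-picture exchange and the delayed non-base-view input can be sketched with a queue standing in for the data path. The one-picture delay, the `qp` values, and the name `run_encoders` are assumptions for illustration only.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def run_encoders(num_pictures: int, delay: int = 1) -> List[Tuple[str, int, int]]:
    """Simulate two encoders linked by a data path.

    The base-view encoder pushes its per-picture parameters onto the
    path; the non-base-view encoder starts `delay` pictures later and
    pops the parameters for the matching picture.
    """
    path: Deque[Dict[str, int]] = deque()  # stand-in for the data path
    log: List[Tuple[str, int, int]] = []
    for t in range(num_pictures + delay):
        if t < num_pictures:
            params = {"time": t, "qp": 26 + t % 2}  # hypothetical values
            path.append(params)
            log.append(("base", params["time"], params["qp"]))
        if t >= delay:
            params = path.popleft()
            log.append(("non_base", params["time"], params["qp"]))
    return log
```

The delay guarantees that the parameters for a given picture are already on the path when the non-base-view encoder reaches that picture.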

The H.264 High Profile encoding apparatus 31 corresponds to the H.264 High Profile encoding apparatus 7 of FIG. 1, and the H.264 High Profile encoding apparatus 32 corresponds to the H.264 High Profile encoding apparatus 8 of FIG. 1.

FIG. 4 is a diagram explaining what the parameters 33 and 34 of FIG. 3 specifically contain.
Reference numeral 41 denotes an H.264 High Profile encoding apparatus that encodes the base view. Reference numeral 42 denotes an H.264 High Profile encoding apparatus that encodes the non-base view. Reference numeral 43 denotes the time information and encoding parameters used when encoding pictures 23 and 25 of FIG. 2. Reference numeral 44 denotes the time information and encoding parameters used when encoding pictures 24 and 26 of FIG. 2.

Reference numeral 45 denotes a data path over which data is sent and received when the base view and the non-base view are encoded. Specifically, the data path 45 provides data exchange between the hosts of the H.264 High Profile encoding apparatus 41 and the H.264 High Profile encoding apparatus 42. The data path 45 sends and receives data, for example, picture by picture. To make encoding with this data exchange possible, the input of the series of pictures to the non-base view is delayed.

The H.264 High Profile encoding apparatus 41 corresponds to the H.264 High Profile encoding apparatus 31 of FIG. 3, and the H.264 High Profile encoding apparatus 42 corresponds to the H.264 High Profile encoding apparatus 32 of FIG. 3.

FIG. 5 is a diagram specifically explaining the time information and encoding parameters 43 and the time information and encoding parameters 44 of FIG. 4.
Reference numeral 51 denotes an H.264 High Profile encoding apparatus that encodes the base view. Reference numeral 52 denotes an H.264 High Profile encoding apparatus that encodes the non-base view. Reference numeral 53 denotes the time information, the quantization matrix, and the parameter for determining quantization used when encoding pictures 23 and 25 of FIG. 2. Reference numeral 54 denotes the time information, the quantization matrix, and the parameter for determining quantization used when encoding pictures 24 and 26 of FIG. 2.

Reference numeral 55 denotes a data path over which data is sent and received between the H.264 High Profile encoding apparatus 51 and the H.264 High Profile encoding apparatus 52 when the base view and the non-base view are encoded. Specifically, the data path 55 provides data exchange between the hosts of the H.264 High Profile encoding apparatus 51 and the H.264 High Profile encoding apparatus 52. The data path 55 sends and receives data, for example, picture by picture. To make encoding with this data exchange possible, the input of the series of pictures to the non-base view is delayed.

The H.264 High Profile encoding apparatus 51 corresponds to the H.264 High Profile encoding apparatus 41 of FIG. 4, and the H.264 High Profile encoding apparatus 52 corresponds to the H.264 High Profile encoding apparatus 42 of FIG. 4.

In the embodiment described above, the base view and the non-base view are encoded by independent encoding apparatuses. The case of encoding with H.264 High Profile is described here. The encoding that takes the base view as its input image and encodes it into H.264 High Profile retains the information used to encode the base view, which serves as the reference image for inter-view prediction in the non-base view. The prediction vectors of the inter-view prediction between the base view and the non-base view are input as additional information in the image data of the non-base view. The encoding that takes as its input images the non-base view with the inter-view prediction vectors added and the base view serving as the reference image of the non-base view encodes them with H.264 High Profile, using the inter-view prediction vectors and the information from encoding the base view that serves as the reference image.

The base view serving as the reference image for inter-view prediction is encoded identically, using the data retained when the base view was encoded. From the bit stream obtained by encoding, with H.264 High Profile, the non-base view with the inter-view prediction vectors added and the base view serving as its reference image, the bit stream of that base view is deleted, leaving a bit stream of the non-base view only. The base-view and non-base-view bit streams are then multiplexed into one bit stream. In this way, two H.264 High Profile encoding methods and apparatuses are used to enable high compression of stereo video.

The encoding process may also be performed by recording a program for realizing the functions of the stereo video encoding apparatus of FIG. 1 on a computer-readable recording medium and having a computer system read and execute the program recorded on that medium. The "computer system" here includes an OS and hardware such as peripheral devices.

The "computer system" also includes a homepage providing environment (or display environment) if a WWW system is used.
The "computer-readable recording medium" refers to portable media such as flexible disks, magneto-optical disks, ROMs, and CD-ROMs, and to storage devices such as hard disks built into a computer system. The "computer-readable recording medium" further includes media that hold a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and media that hold a program for a certain period, such as the volatile memory inside the computer system that serves as the server or client in that case. The program may implement only part of the functions described above, or may implement them in combination with a program already recorded in the computer system.

The embodiment of the present invention has been described in detail above with reference to the drawings. The specific configuration is not limited to this embodiment, however, and also covers designs and the like that do not depart from the gist of the present invention.

1, 5 Base-view image frame
2 Non-base-view image frame
3, 27 Inter-view prediction unit
4, 28 Inter-view prediction vector superimposing unit
6 New non-base view
7, 8, 31, 32, 41, 42, 51, 52 H.264 High Profile encoding apparatus
9, 35, 45, 55 Data path
10 Syntax modification unit
11 Base view stream deletion unit
12 Base-view bit stream
13 Non-base-view bit stream
14 Multiplexing unit
15 Bit stream
21 Base-view sequence
22 Non-base-view sequence
23, 25, 26 Picture
33, 34 Parameters
43, 44 Time information and encoding parameters
53, 54 Time information, quantization matrix, and parameter for determining quantization

Claims

A video encoding method in a video encoding apparatus that receives two kinds of video data, a base view that is one image of a stereo image and a non-base view that is the other image, and outputs one bit stream,
wherein the video encoding apparatus executes:
a first step of generating an inter-view prediction vector by inter-view prediction between the base view and the non-base view;
a second step of superimposing the inter-view prediction vector as data on the non-base view and outputting a superimposed non-base view;
a third step of combining the superimposed non-base view with the base view used in the first step and outputting the result as a new non-base view;
a fourth step of encoding the base view used in the first step with a specific encoding method;
a fifth step of encoding the non-base view output in the third step with a specific encoding method;
a sixth step of deleting the encoded data of the base view used in the first step from the encoded data produced by the fifth step; and
a seventh step of multiplexing the encoded data from the fourth step and the sixth step and outputting one bit stream.

The video encoding method according to claim 1, wherein, in the first step, inter-view prediction is performed using images input at the timing that becomes an I picture in the GOP structure of the base view, and
the video encoding apparatus further executes:
a ninth step of using, in the encoding of the fifth step, the parameters used when encoding in the fourth step; and
a tenth step of performing, in the fifth step, encoding with the parameters used in the ninth step.

The video encoding method wherein the time information for encoding in units of pictures and the encoding parameters used when the base-view image was encoded are used as the parameters in the ninth step and the tenth step.

The video encoding method according to claim 3, wherein the encoding parameter includes a parameter for determining a quantization matrix and a quantization value.

A video encoding apparatus that receives two kinds of video data, a base view that is one image of a stereo image and a non-base view that is the other image, and outputs one bit stream, the apparatus comprising:
an inter-view prediction unit that generates an inter-view prediction vector by inter-view prediction between the base view and the non-base view;
an inter-view prediction vector superimposing unit that superimposes the inter-view prediction vector as data on the non-base view to generate a superimposed non-base view, and combines the superimposed non-base view with the base view used by the inter-view prediction unit to generate a new non-base view;
a first encoding unit that encodes the base view used by the inter-view prediction unit with a specific encoding method;
a second encoding unit that encodes the new non-base view generated by the inter-view prediction vector superimposing unit with a specific encoding method;
a base view stream deletion unit that deletes the encoded data of the base view used by the inter-view prediction unit from the encoded data obtained by the second encoding unit; and
a multiplexing unit that multiplexes the encoded data obtained by the first encoding unit and the encoded data obtained by the base view stream deletion unit and outputs one bit stream.

A video encoding program for causing a computer to execute processing used to realize the video encoding method according to any one of claims 1 to 4.