JP2015186052A - Stereoscopic video encoding device and stereoscopic video encoding method

Info

Publication number: JP2015186052A
Application number: JP2014061078A
Authority: JP
Inventors: Hiroo Endo (遠藤 寛朗)
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2014-03-25
Filing date: 2014-03-25
Publication date: 2015-10-22

Abstract

PROBLEM TO BE SOLVED: To provide a stereoscopic video encoding device and method capable of reducing impairment of the correct stereoscopic effect even when the encoding bit rate is insufficient. SOLUTION: A device that encodes stereoscopic video including right-eye images and left-eye images includes: an encoding part that performs intra-screen encoding, in which a predictive image is generated from already-encoded pixel values adjoining the block to be encoded and the differences between the pixel values of the block to be encoded and those of the predictive image are encoded; and an intra-screen prediction mode selection part that selects one of a plurality of prediction modes available for intra-screen encoding. The intra-screen prediction mode selection part selects a prediction mode on the basis of prediction mode coefficients set for the respective prediction modes, and the coefficients are weighted such that a prediction mode that generates a predictive image from pixels horizontally adjacent to the block to be encoded is less likely to be selected than the other prediction modes.

Description

The present invention relates to an encoding apparatus and method for encoding video, and in particular to an encoding technique for encoding stereoscopic video using a plurality of videos captured by a plurality of imaging devices.

In recent years, with the spread of displays capable of stereoscopic display, imaging apparatuses capable of capturing stereoscopic video have also attracted attention. Stereoscopic video is usually created from a right-eye video and a left-eye video captured by two cameras having parallax. In stereoscopic video encoding, in addition to motion prediction, which predicts the motion vectors of objects in the time direction and is used in conventional compression encoding/decoding technologies such as the MPEG methods, disparity prediction, which predicts disparity vectors between a plurality of imaging devices, can be used (see Patent Document 1).

Patent Document 1: JP 2000-165909 A

However, in the above prior art, particularly when the encoding bit rate is insufficient and encoding degradation becomes noticeable, the degradation disturbs the parallax between the right-eye image and the left-eye image, and the correct stereoscopic effect is impaired.

The present invention has been made in view of the above problem, and its object is to provide a stereoscopic video encoding apparatus and method that can reduce impairment of the correct stereoscopic effect even when the encoding bit rate is insufficient.

A stereoscopic video encoding apparatus according to the present invention encodes a stereoscopic video including a right-eye image and a left-eye image, and comprises: encoding means for performing intra-screen encoding, in which a predicted image is generated using already-encoded pixel values adjacent to an encoding target block and the difference between the pixel values of the encoding target block and the pixel values of the predicted image is encoded; and intra-screen prediction mode selection means for selecting one prediction mode from among a plurality of prediction modes of the intra-screen encoding. The intra-screen prediction mode selection means selects a prediction mode based on a prediction mode coefficient set for each prediction mode, and the prediction mode coefficient for a prediction mode that generates a predicted image using pixels horizontally adjacent to the encoding target block is weighted so that this prediction mode is less likely to be selected than the other prediction modes.

A stereoscopic video encoding method according to the present invention encodes a stereoscopic video including a right-eye image and a left-eye image, and comprises: an encoding step of performing intra-screen encoding, in which a predicted image is generated using already-encoded pixel values adjacent to an encoding target block and the difference between the pixel values of the encoding target block and the pixel values of the predicted image is encoded; and an intra-screen prediction mode selection step of selecting one prediction mode from among a plurality of prediction modes of the intra-screen encoding. In the intra-screen prediction mode selection step, a prediction mode is selected based on a prediction mode coefficient set for each prediction mode, and the prediction mode coefficient for a prediction mode that generates a predicted image using pixels horizontally adjacent to the encoding target block is weighted so that this prediction mode is less likely to be selected than the other prediction modes.

According to the present invention, it is possible to provide a stereoscopic video encoding apparatus that can reduce impairment of the correct stereoscopic effect even when the encoding bit rate is insufficient.

FIG. 1 is a block diagram showing a configuration example of the stereoscopic video encoding apparatus according to the first embodiment of the present invention.
FIG. 2 is a diagram for explaining how predicted images are generated in the intra prediction methods.
FIG. 3 is a block diagram of the intra prediction unit according to the first embodiment of the present invention.
FIG. 4 is a block diagram of the intra/inter determination unit according to the second embodiment of the present invention.
FIG. 5 is a block diagram of the intra prediction unit according to the third embodiment of the present invention.
FIG. 6 is a diagram showing the relationship between the prediction mode coefficient and the encoding bit rate.

Hereinafter, the present invention will be described in detail based on its preferred embodiments with reference to the drawings.

(First embodiment)
FIG. 1 is a block diagram of a stereoscopic video encoding apparatus according to the present invention. As shown in FIG. 1, the apparatus comprises a main image capturing unit 101, a sub-image capturing unit 102, a frame memory 103, a filtered reference frame memory 104, a motion/disparity prediction unit 105, a motion/disparity compensation unit 106, an intra prediction unit 107, an orthogonal transform unit 108, a quantization unit 109, an entropy encoding unit 110, an inverse quantization unit 111, an inverse orthogonal transform unit 112, an intra/inter determination unit 113, a subtractor 114, an adder 115, a pre-filter reference frame memory 116, and a loop filter 117.

In this configuration, the method for encoding an input image is described first. The main image capturing unit 101 and the sub-image capturing unit 102 each include a photographing lens, an image sensor that photoelectrically converts the formed image, an A/D conversion processing unit that converts the analog signal read from the image sensor into a digital signal, and the like. They further generate a luminance signal and color difference signals from the converted digital signal and write the data to the frame memory 103. In this embodiment, the main image is the left-eye video and the sub-image is the right-eye video. The sub-image can use the main image as a reference image, but the main image cannot use the sub-image as a reference image.

The frame memory 103 stores the main image and the sub-image in display order, and sequentially transmits encoding target blocks, in encoding order, to the motion/disparity prediction unit 105, the intra prediction unit 107, the subtractor 114, and the origin difference calculation unit 118.

The filtered reference frame memory 104 stores filtered encoded images as reference images, and sequentially transmits the reference image for the encoding target block, in encoding order, to the motion/disparity prediction unit 105 and the motion/disparity compensation unit 106.

The pre-filter reference frame memory 116 stores encoded images before filtering and sequentially transmits them to the intra prediction unit 107.

The motion/disparity prediction unit 105 changes its processing depending on whether the encoding target block transmitted from the frame memory 103 belongs to a main image or a sub-image.

For a main image, it detects a motion vector using, as the reference image, a previously encoded and decoded main image transmitted from the filtered reference frame memory 104, and transmits the motion vector together with the reference image data number to the motion/disparity compensation unit 106.

For a sub-image, it receives from the filtered reference frame memory 104, as the reference image, either the previously encoded and decoded main image at the same time in display order or a previously encoded and decoded sub-image, and detects a motion vector or a disparity vector. The detected motion vector or disparity vector is transmitted together with the reference image data number to the motion/disparity compensation unit 106.

For a sub-image, the method of selecting the final reference image from between the previously encoded and decoded main image at the same time in display order and the previously encoded and decoded sub-image is not particularly limited. In this embodiment, the candidate with the smaller difference value between the encoding target block and the reference image at the vector position is selected.
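The selection rule of this embodiment can be sketched as follows. This is a minimal illustration, assuming the difference value is a sum of absolute differences and that each candidate has already been reduced to its best-matching block at the detected vector position; the names `sad`, `choose_reference`, and the candidate labels are hypothetical and do not appear in the specification.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def choose_reference(target, candidates):
    """Return the label of the candidate block with the smallest SAD.

    candidates maps a label to the reference block found at the detected
    motion or disparity vector position (assumed to be given).
    """
    return min(candidates, key=lambda label: sad(target, candidates[label]))


# Hypothetical 2x2 example: one candidate from the main image at the same
# display time (disparity prediction) and one from an earlier sub-image
# (motion prediction).
target = [[10, 12], [11, 13]]
candidates = {
    "main_same_time": [[10, 12], [11, 14]],
    "sub_previous": [[8, 12], [11, 10]],
}
best = choose_reference(target, candidates)
```

Here the main-image candidate differs from the target by 1 in total and the sub-image candidate by 5, so the main image is chosen as the reference.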

The motion/disparity compensation unit 106 uses the motion vector or disparity vector sent from the motion/disparity prediction unit 105 and refers to the reference frame image in the filtered reference frame memory 104 indicated by the filtered reference frame image data number to generate predicted image data for each block, which it transmits, together with the motion vector or disparity vector, to the intra/inter determination unit 113.

Meanwhile, the intra prediction unit 107 generates an intra-predicted image for each of a plurality of intra prediction modes using decoded data around the encoding target block transmitted from the pre-filter reference frame memory 116. FIG. 2 schematically shows the types of intra prediction and how the predicted image is generated in each. The white squares indicate the block to be encoded, here a block of 4 horizontal by 4 vertical pixels. The black squares indicate already-encoded pixels adjacent to the block to be encoded. The predicted image is generated by applying the adjacent pixel values in the direction of the arrows. For example, the Vertical prediction in FIG. 2(a) generates the predicted image from the pixels adjacent above. When the arrows are diagonal, some modes generate the predicted image by computation using a plurality of adjacent pixel values.
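As a rough illustration of the two simplest directional modes, the sketch below builds a 4x4 predicted block from the neighbouring pixels. It assumes that Vertical prediction copies the row above the block downward and that Horizontal prediction copies the column to the left of the block across each row; the function names are hypothetical, and the diagonal modes, which combine several neighbours, are omitted.

```python
def predict_vertical(top, size=4):
    """Vertical prediction: repeat the pixels above the block in every row."""
    return [list(top[:size]) for _ in range(size)]


def predict_horizontal(left, size=4):
    """Horizontal prediction: repeat each left-neighbour pixel across its row."""
    return [[left[y]] * size for y in range(size)]


# Hypothetical neighbouring pixel values (already encoded and decoded).
top = [1, 2, 3, 4]
left = [5, 6, 7, 8]

vertical = predict_vertical(top)      # every row equals [1, 2, 3, 4]
horizontal = predict_horizontal(left)  # row y is filled with left[y]
```

Horizontal prediction stretches a single pixel value along an entire row, which is why errors in such modes spread in the horizontal direction.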

The intra prediction unit 107 selects an appropriate intra prediction mode using the encoding target block transmitted from the frame memory 103 and the generated predicted images, and transmits the mode together with the predicted image to the intra/inter determination unit 113.

The intra/inter determination unit 113 selects, from the predicted image data transmitted from the motion/disparity compensation unit 106 and the intra prediction unit 107, the predicted image data having the higher correlation with the encoding target block, and transmits it to the subtractor 114. It also transmits information to the entropy encoding unit 110: the intra prediction mode when intra prediction is selected, or the motion vector and the like when inter prediction is selected.

The subtractor 114 subtracts the predicted image block transmitted from the intra/inter determination unit 113 from the encoding target block transmitted from the frame memory 103, and outputs image residual data.

The orthogonal transform unit 108 applies an orthogonal transform to the image residual data output from the subtractor 114 and transmits the transform coefficients to the quantization unit 109.

The quantization unit 109 quantizes the transform coefficients from the orthogonal transform unit 108 using a predetermined quantization parameter, and transmits the quantized coefficients to the entropy encoding unit 110 and the inverse quantization unit 111.

The entropy encoding unit 110 receives the transform coefficients quantized by the quantization unit 109, applies entropy encoding such as CAVLC or CABAC, and outputs the result as encoded data.

Next, the method for generating reference image data from the transform coefficients quantized by the quantization unit 109 is described.

The inverse quantization unit 111 inversely quantizes the quantized transform coefficients transmitted from the quantization unit 109.

The inverse orthogonal transform unit 112 applies an inverse orthogonal transform to the transform coefficients inversely quantized by the inverse quantization unit 111, generates decoded residual data, and transmits it to the adder 115.

The adder 115 adds the decoded residual data and the predicted image data described later to generate reference image data, and stores it in the pre-filter reference frame memory 116. The reference image data is also transmitted to the loop filter 117.

The loop filter 117 filters the reference image data to remove noise and stores the filtered reference image data in the filtered reference frame memory 104.
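The residual path through units 114, 108, 109, 111, 112, and 115 can be sketched as below. This is only an illustration under stated assumptions: a 4x4 Hadamard matrix stands in for the orthogonal transform (the specification does not name one), quantization is a plain division by a hypothetical step size, and entropy encoding and loop filtering are left out. All function names are hypothetical.

```python
# 4x4 Hadamard matrix: symmetric, and H times H equals 4 times the identity.
# Used here as an assumed stand-in for the codec's orthogonal transform.
H = [[1, 1, 1, 1],
     [1, 1, -1, -1],
     [1, -1, -1, 1],
     [1, -1, 1, -1]]


def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def encode_block(block, prediction, step):
    """Subtract the prediction, transform the residual, and quantize."""
    residual = [[block[y][x] - prediction[y][x] for x in range(4)]
                for y in range(4)]
    coeffs = matmul(matmul(H, residual), H)
    return [[round(c / step) for c in row] for row in coeffs]


def reconstruct_block(levels, prediction, step):
    """Dequantize, inverse-transform, and add the prediction back."""
    coeffs = [[q * step for q in row] for row in levels]
    scaled = matmul(matmul(H, coeffs), H)  # equals 16 times the residual
    return [[prediction[y][x] + scaled[y][x] / 16 for x in range(4)]
            for y in range(4)]


# Hypothetical block and prediction values.
block = [[52, 55, 61, 66], [63, 59, 55, 90], [62, 59, 68, 113], [63, 58, 71, 122]]
prediction = [[50, 50, 60, 60]] * 4

levels = encode_block(block, prediction, step=1)
reconstructed = reconstruct_block(levels, prediction, step=1)
```

With a step size of 1 the reconstruction matches the input exactly; larger step sizes discard coefficient precision, which corresponds to the encoding degradation the specification is concerned with at low bit rates.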

Next, the method by which the intra prediction unit 107 determines the intra prediction mode, which is the feature of the present invention, is described in detail. FIG. 3 is a block diagram of the intra prediction unit 107. The prediction mode generation unit 304 selects the prediction modes shown in FIG. 2 one at a time in sequence and transmits the selected mode to the predicted image generation unit 301 and the prediction mode coefficient generation unit 305. The predicted image generation unit 301 reads from the pre-filter reference frame memory 116 the already-encoded adjacent pixels needed to generate the predicted image for the designated prediction mode, generates the predicted image, and transmits it to the evaluation value calculation unit 302.

The prediction mode coefficient generation unit 305 generates a prediction mode coefficient according to the designated prediction mode. For example, it generates 2.0 when the prediction mode is Mode 1 (Horizontal prediction) and 1.0 for every other prediction mode. As another example, it may generate 2.0 for Mode 1, 1.5 for Modes 4, 5, 6, and 8, and 1.0 for the other prediction modes. The coefficients are not limited to these examples; it suffices that the prediction mode coefficient is larger the stronger the horizontal component of the prediction direction.

The evaluation value calculation unit 302 reads the block image to be encoded from the frame memory 103 and calculates its difference from the predicted image generated by the predicted image generation unit 301. The difference value is the sum of absolute differences, obtained by summing the absolute values of the per-pixel differences over all pixels of the block. This sum of absolute differences is multiplied by the prediction mode coefficient generated by the prediction mode coefficient generation unit 305 to obtain the evaluation value for the prediction mode issued by the prediction mode generation unit 304, which is transmitted to the prediction mode determination unit 303. The prediction mode determination unit 303 determines the prediction mode with the smallest of the sequentially received evaluation values as the prediction mode for the block to be encoded, and transmits it together with the predicted image to the intra/inter determination unit 113.
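The weighted selection can be sketched as follows, using the first coefficient example from the text (2.0 for Mode 1, 1.0 otherwise). The sketch covers only Mode 0 (Vertical) and Mode 1 (Horizontal); the names `sad` and `select_mode` and the pixel values are hypothetical, and ties are resolved in favour of the mode evaluated first, which the specification does not address.

```python
def sad(block, prediction):
    """Sum of absolute differences over all pixels of the block."""
    return sum(abs(b - p)
               for row_b, row_p in zip(block, prediction)
               for b, p in zip(row_b, row_p))


def select_mode(block, predictions, coeffs):
    """Return (mode, evaluation value) with the smallest SAD x coefficient."""
    best_mode, best_cost = None, None
    for mode, prediction in predictions.items():
        cost = sad(block, prediction) * coeffs.get(mode, 1.0)
        if best_cost is None or cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Hypothetical 4x4 block with its upper and left neighbours.
block = [[10, 10, 10, 10],
         [10, 10, 10, 10],
         [10, 10, 10, 10],
         [10, 10, 12, 12]]
top = [10, 10, 10, 11]
left = [10, 10, 10, 11]
predictions = {
    0: [list(top) for _ in range(4)],      # Mode 0: Vertical, SAD = 6
    1: [[left[y]] * 4 for y in range(4)],  # Mode 1: Horizontal, SAD = 4
}

unweighted = select_mode(block, predictions, {0: 1.0, 1: 1.0})
weighted = select_mode(block, predictions, {0: 1.0, 1: 2.0})
```

Without weighting, Horizontal prediction wins with a SAD of 4 against 6. With the coefficient 2.0 its evaluation value becomes 8, so Vertical prediction is selected instead, even though its raw difference is larger.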

As a result, prediction modes in which the intra-predicted image is extended in the horizontal direction become less likely to be selected. Consequently, encoding degradation that extends in the horizontal direction is suppressed, and impairment of the correct stereoscopic effect can be reduced.

(Second embodiment)
Hereinafter, a stereoscopic video encoding apparatus according to the second embodiment of the present invention will be described. Since its configuration is the same as that of the first embodiment described above, its description is omitted.

FIG. 4 is a block diagram of the intra/inter determination unit 113. The intra/inter determination unit 113 receives an inter difference from the motion/disparity compensation unit 106 and an intra difference from the intra prediction unit 107. The inter difference and the intra difference are each the sum of absolute differences between the respective predicted image and the image being encoded. The intra difference is multiplied by a predetermined intra coefficient and input to the intra/inter decision unit 401 as the intra evaluation value. Here the intra coefficient is 2.0; any value greater than 1.0 may be used. The inter difference, on the other hand, is input unchanged to the intra/inter decision unit 401 as the inter evaluation value. The intra/inter decision unit 401 selects the prediction method with the smaller of the inter and intra evaluation values and transmits it to the entropy encoding unit 110.
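The decision in FIG. 4 reduces to a single comparison, sketched below. The intra coefficient 2.0 comes from the text; the function name, the string return values, and the choice of inter prediction on a tie are assumptions made for this illustration.

```python
INTRA_COEFF = 2.0  # value used in the text; any value above 1.0 penalizes intra


def decide_intra_inter(intra_sad, inter_sad, intra_coeff=INTRA_COEFF):
    """Choose the prediction method with the smaller evaluation value.

    The intra difference is scaled by the intra coefficient, while the inter
    difference is used unchanged. Ties go to inter (an assumption).
    """
    intra_cost = intra_sad * intra_coeff
    inter_cost = inter_sad
    return "intra" if intra_cost < inter_cost else "inter"


# Intra has the smaller raw difference but loses after weighting.
close_call = decide_intra_inter(intra_sad=100, inter_sad=150)
# Intra is still chosen when it is better by more than the coefficient.
clear_win = decide_intra_inter(intra_sad=100, inter_sad=250)
```

In the first call the intra evaluation value is 200 against an inter value of 150, so inter prediction is chosen; in the second call 200 is below 250 and intra prediction is kept.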

As a result, intra prediction becomes less likely to be selected, the horizontally extended encoding degradation that tends to occur with intra prediction is suppressed, and impairment of the correct stereoscopic effect can be reduced.

(Third embodiment)
Hereinafter, a stereoscopic video encoding apparatus according to the third embodiment of the present invention will be described. Since its configuration is substantially the same as that of the first embodiment described above, its description is omitted.

Replacing the intra prediction unit 107 in FIG. 1 with the intra prediction unit 500 in FIG. 5 yields the block diagram of the stereoscopic video encoding apparatus according to the third embodiment of the present invention. The intra prediction unit 500 has almost the same configuration as the intra prediction unit 107 and differs in its prediction mode coefficient generation unit 505. The prediction mode coefficient generation unit 505 receives the encoding bit rate set by a user or the like and generates prediction mode coefficients according to that bit rate. As described in the first embodiment, it suffices that the prediction mode coefficient is larger the stronger the horizontal component of the prediction direction. However, the coefficients are set so that the higher the encoding bit rate, the smaller the difference between the maximum and minimum prediction mode coefficients across the prediction modes. As an example, the change in the prediction mode coefficients of Mode 0 and Mode 1 according to the encoding bit rate is described with reference to FIG. 6.

FIG. 6 is a graph showing the prediction mode coefficient of Mode 0 and that of Mode 1, with the encoding bit rate on the horizontal axis. When the encoding bit rate is high, the prediction mode coefficient of Mode 0 is small and close to that of Mode 1. When the encoding bit rate is low, the prediction mode coefficient of Mode 1 is large and far from that of Mode 0. That is, when the encoding bit rate is low, prediction modes whose predicted image is extended in the horizontal direction are strongly suppressed.
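One possible shape for the relationship in FIG. 6 is sketched below. The specification only states that the gap between the Mode 1 and Mode 0 coefficients shrinks as the bit rate rises; the linear ramp, the bit-rate thresholds of 5 and 25 Mbps, the coefficient range of 1.0 to 2.0, the constant Mode 0 coefficient, and the function names are all assumptions made for this illustration.

```python
def mode1_coefficient(bitrate_mbps, low=5.0, high=25.0,
                      max_coeff=2.0, min_coeff=1.0):
    """Coefficient for Mode 1 (Horizontal) as a function of bit rate.

    Assumed shape: max_coeff at or below `low`, min_coeff at or above
    `high`, and a linear ramp in between.
    """
    if bitrate_mbps <= low:
        return max_coeff
    if bitrate_mbps >= high:
        return min_coeff
    t = (bitrate_mbps - low) / (high - low)
    return max_coeff + t * (min_coeff - max_coeff)


def prediction_mode_coefficients(bitrate_mbps):
    """Coefficients for Mode 0 (Vertical) and Mode 1 (Horizontal).

    Mode 0 is held at 1.0 here, which is an assumption of this sketch.
    """
    return {0: 1.0, 1: mode1_coefficient(bitrate_mbps)}


low_rate = prediction_mode_coefficients(5.0)    # strong penalty on Mode 1
mid_rate = prediction_mode_coefficients(15.0)   # intermediate penalty
high_rate = prediction_mode_coefficients(25.0)  # both modes weighted equally
```

Under these assumptions the gap between the two coefficients is 1.0 at the low bit rate, 0.5 in the middle, and 0 at the high bit rate, so the horizontal mode is suppressed most strongly where encoding degradation is most visible.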

As a result, the horizontally extended encoding degradation that becomes especially noticeable at low encoding bit rates is suppressed, and impairment of the correct stereoscopic effect can be reduced.

101 Main image capturing unit
102 Sub-image capturing unit
103 Frame memory
104 Filtered reference frame memory
105 Motion/disparity prediction unit
106 Disparity compensation unit
107 Intra prediction unit
108 Orthogonal transform unit
109 Quantization unit
110 Entropy encoding unit
111 Inverse quantization unit
112 Inverse orthogonal transform unit
113 Intra/inter determination unit
114 Subtractor
115 Adder
116 Pre-filter reference frame memory
117 Loop filter

Claims

1. A stereoscopic video encoding device that encodes a stereoscopic video including a right-eye image and a left-eye image, comprising:
encoding means for performing intra-screen encoding, in which a predicted image is generated using already-encoded pixel values adjacent to an encoding target block and the difference between the pixel values of the encoding target block and the pixel values of the predicted image is encoded; and
intra-screen prediction mode selection means for selecting one prediction mode from among a plurality of prediction modes of the intra-screen encoding,
wherein the intra-screen prediction mode selection means selects a prediction mode based on a prediction mode coefficient set for each prediction mode, and
the prediction mode coefficient for a prediction mode that generates a predicted image using pixels horizontally adjacent to the encoding target block is weighted so that this prediction mode is less likely to be selected than the other prediction modes.

2. The stereoscopic video encoding device according to claim 1, wherein each prediction mode coefficient is weighted such that the larger the ratio of the number of horizontally adjacent pixels of the encoding target block to the number of vertically adjacent pixels, the less likely the prediction mode is to be selected.

3. The stereoscopic video encoding device according to claim 1 or 2, wherein the encoding means can select an encoding bit rate, and
the weighting is performed such that the lower the encoding bit rate, the greater the difference between the prediction mode coefficient for the prediction mode that generates a predicted image using pixels horizontally adjacent to the encoding target block and the prediction mode coefficients for the other prediction modes.

4. The stereoscopic video encoding device according to any one of claims 1 to 3, wherein the encoding by the encoding means includes motion-compensated encoding that uses, as a reference image, an already-encoded image temporally different from the image to be encoded, and the intra-screen encoding or the motion-compensated encoding can be selected for each encoding block.

5. A stereoscopic video encoding method for encoding a stereoscopic video including a right-eye image and a left-eye image, comprising:
an encoding step of performing intra-screen encoding, in which a predicted image is generated using already-encoded pixel values adjacent to an encoding target block and the difference between the pixel values of the encoding target block and the pixel values of the predicted image is encoded; and
an intra-screen prediction mode selection step of selecting one prediction mode from among a plurality of prediction modes of the intra-screen encoding,
wherein, in the intra-screen prediction mode selection step, a prediction mode is selected based on a prediction mode coefficient set for each prediction mode, and
the prediction mode coefficient for a prediction mode that generates a predicted image using pixels horizontally adjacent to the encoding target block is weighted so that this prediction mode is less likely to be selected than the other prediction modes.