JP5038367B2

JP5038367B2 - Scalable video encoding method, scalable video encoding device, and scalable video encoding program

Info

Publication number: JP5038367B2
Application number: JP2009173956A
Authority: JP
Inventors: 和也早瀬; 幸浩坂東; 誠之高村; 一人上倉; 由幸八島
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2009-07-27
Filing date: 2009-07-27
Publication date: 2012-10-03
Anticipated expiration: 2029-07-27
Also published as: JP2011029962A

Description

The present invention relates to a scalable video encoding method and apparatus for encoding video in a scalable manner, and to a scalable video encoding program used to implement that method. In particular, it relates to a scalable video encoding method and apparatus that reduce encoding time, and to a scalable video encoding program used to implement that method.

In response to the recent diversity of display terminals and network environments, the JVT (Joint Video Team) has been studying SVC (Scalable Video Coding), an encoding scheme that adds spatial, temporal, and SNR scalability to AVC (Advanced Video Coding) (see, for example, Non-Patent Document 1).

SVC incorporates three prediction methods (Inter prediction, Intra prediction, and inter-layer prediction) to remove the redundancy inherent in the temporal, spatial, and inter-layer dimensions. The prediction modes available in SVC are listed below.

[Inter prediction]
・ Skip mode (Skip)
・ Direct mode (Direct)
・ 16×16 block size motion prediction mode (P16×16)
・ 16×8 block size motion prediction mode (P16×8)
・ 8×16 block size motion prediction mode (P8×16)
・ 8×8 block size motion prediction mode (P8×8)
[Intra prediction]
・ 16×16 block size Intra prediction mode (I16×16)
・ 8×8 block size Intra prediction mode (I8×8)
・ 4×4 block size Intra prediction mode (I4×4)
[Inter-layer prediction]
・ BLSkip mode (BLSkip)
・ IntraBL mode (IntraBL)

When P8×8 is used, each 8×8 block can be further divided into block sizes of 8×4, 4×8, and 4×4. In SVC, one of these prediction mode search candidates is selected as the optimal prediction mode for each macroblock.

An example of how the optimal prediction mode is determined is given below.

JSVM, which the JVT provides as the SVC reference encoder (see, for example, Non-Patent Document 2), calculates for each prediction mode a coding cost composed of the code amount and the coding distortion, and selects as optimal the prediction mode with the lowest coding cost among all the prediction modes listed above.
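The cost-based selection described above can be sketched as follows. This is only an illustration: the text says the cost combines code amount and distortion but does not give a formula, so the Lagrangian form J = D + λR, the function name `select_best_mode`, and the sample numbers are all assumptions.

```python
from typing import Dict, Tuple


def select_best_mode(costs: Dict[str, Tuple[float, float]], lam: float) -> str:
    """Return the mode with the smallest cost J = D + lam * R.

    costs maps a mode name to (distortion D, code amount R).
    The Lagrangian form and lam are assumptions made for illustration.
    """
    return min(costs, key=lambda mode: costs[mode][0] + lam * costs[mode][1])


# Hypothetical per-mode measurements for one macroblock.
example_costs = {
    "Skip": (120.0, 1.0),
    "P16x16": (80.0, 20.0),
    "I4x4": (60.0, 90.0),
}
best_mode = select_best_mode(example_costs, lam=1.0)
```

An exhaustive search of this kind evaluates every listed mode for every macroblock.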

In Patent Document 1, listed below, vectors are generated by extrapolating or interpolating the motion vectors of a reference frame onto the encoding target frame, the coordinates of each pixel of the macroblocks moved by those vectors are obtained, and the number of times pixels coincide is counted for each pixel. The prediction mode search candidates are then narrowed down according to the magnitude of a score calculated from the counts of the pixels in the encoding target macroblock. This narrowing method was proposed to speed up the prediction mode search in H.264/AVC, but it is also applicable to SVC, which uses the same prediction mode search mechanism as H.264/AVC.

In Patent Document 2, listed below, in order to perform intra-frame coding at high speed, intra prediction errors (for example, nine of them) are obtained for the block to be intra-coded using the pixel values of neighboring encoded blocks, and a prediction mode for the block is determined from them. Next, a prediction mode for the block is determined using the intra prediction modes of neighboring already-encoded blocks. If the two prediction modes match, that prediction mode is selected as is; if they do not match, the prediction mode with the lower coding cost is selected.

However, the method of determining the optimal prediction mode in JSVM of Non-Patent Document 2 does not narrow down the prediction mode search candidates, so although it achieves high coding performance, the prediction mode search takes an enormous amount of time. It also searches prediction modes that are clearly unlikely to be selected given the characteristics of the image in the macroblock (for example, Intra prediction modes in a still region), which is wasteful.

The narrowing of prediction mode search candidates in Patent Document 1 is a method for deciding whether or not to perform Intra prediction, so it does nothing to reduce the Inter prediction mode search, which requires more computation time than the Intra prediction mode search. In other words, the Inter prediction mode search is left with room for improvement.

The narrowing of prediction mode search candidates in Patent Document 2 applies only to Intra prediction, so, like the narrowing in Patent Document 1, it does nothing to reduce the Inter prediction mode search, which requires more computation time than the Intra prediction mode search. In other words, the Inter prediction mode search is left with room for improvement.

Against this background, the present inventors disclosed the following invention in Non-Patent Document 3. Based on information on the optimal prediction modes selected in scalable video encoding performed without restricting the use of prediction modes, the occurrence rate of each combination of optimal prediction modes selected in spatially corresponding blocks of the upper and lower layers is obtained, and a correspondence table describing the relationship between each combination and its occurrence rate is generated. When a block of the upper layer is encoded, information on the optimal prediction mode selected in encoding the spatially corresponding block of the lower layer is acquired, and the prediction mode search candidates for encoding the upper-layer block are narrowed down based on that information and the correspondence table.

Patent Document 1: JP 2006-033451 A
Patent Document 2: JP 2005-184241 A

Non-Patent Document 1: T. Wiegand, G. Sullivan, J. Reichel, H. Schwarz, and M. Wien: "Joint Draft ITU-T Rec. H.264 | ISO/IEC 14496-10 / Amd.3 Scalable video coding," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-X201, 2007. http://ftp3.itu.ch/av-arch/jvt-site/2007_06_Geneva/JVTX201.zip
Non-Patent Document 2: J. Reichel, H. Schwarz, and M. Wien: "Joint Scalable Video Model JSVM-11," ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, JVT-X202, 2007. http://ftp3.itu.ch/av-arch/jvt-site/2007_06_Geneva/JVTX202.zip
Non-Patent Document 3: K. Hayase, Y. Bandoh, S. Takamura, K. Kamikura, and Y. Yashima: "Acceleration of mode selection using inter-layer prediction mode correlation in SVC," Picture Coding Symposium of Japan, PCSJ2008.

The invention that the present inventors disclosed in Non-Patent Document 3 narrows down the prediction mode search candidates for macroblocks in the enhancement layer by exploiting the correlation of prediction modes between layers in scalable video encoding, and thereby makes scalable video encoding faster.

However, the invention disclosed in Non-Patent Document 3 assumes a layer structure in which upper-layer blocks and lower-layer blocks correspond spatially one-to-one, and the problem remains that it cannot be applied when the layer structure is not of that kind.

That is, the invention disclosed in Non-Patent Document 3 can be applied when power-of-two resolution scalability holds between the upper and lower layers. When it does not hold, multiple optimal prediction modes exist in the image region directly below a macroblock of the enhancement layer, so the invention cannot be applied.

Incidentally, when the lower-layer video is generated by cropping part of the upper-layer video, the layer structure is likewise one in which upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one, so the same problem arises in that case as well.

The present invention has been made in view of these circumstances. Its object is to realize a new scalable video encoding technique that, in scalable video encoding that achieves scalability through a layer structure, speeds up encoding by narrowing down the prediction mode search candidates of the upper layer using the correlation of optimal prediction modes between layers. A further object is to achieve that speed-up even when the layer structure is one in which upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one.

[1] Basic concept of the present invention
To achieve this object, the present invention speeds up the prediction mode search, in scalable video encoding that achieves scalability through a layer structure, by means of the following two processes:
(i) generation of a prediction mode correspondence rate table (a table describing the correlation of optimal prediction modes between layers)
(ii) narrowing down of prediction mode search candidates using the prediction mode correspondence rate table

In doing so, the prediction mode correspondence rate table is generated taking into account that the layer structure is one in which upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one, and the narrowing down of prediction mode search candidates using the table takes this into account as well.

The description below follows the example shown in FIG. 1. That is, both layer L and layer L-1 are assumed to be encoded with an IBBBP hierarchical B structure. The arrows in the figure indicate prediction references. The encoding target layer is L, the encoding target frame is B2b, and the frame from which the prediction mode correspondence rate table is generated is B2a. The layer L-1 frame at the same time as B2b is B'2b, and the layer L-1 frame at the same time as B2a is B'2a. Frames are encoded in ascending order of temporal level, and within the same temporal level in order of earliest time first. Layers are encoded in ascending order of level.

Next, the overall flow of the processing of the present invention is described following the flowchart shown in FIG. 2.

In the present invention, when video is scalably encoded, as shown in the flowchart of FIG. 2, the variable n is first set to 1 in step S101. In the following step S102, it is determined whether all frames have been encoded, and if so, the processing ends.

If it is determined in step S102 that not all frames have been encoded, the process proceeds to step S103, where one unprocessed frame is selected in order from the first frame. In the following step S104, the selected frame is encoded by performing prediction with no restriction on the use of the prediction modes defined as available.

Then, in step S105, the value of the variable n is incremented by one. In the following step S106, it is determined whether the value of n has exceeded a predetermined threshold N1 (N1 is an integer of 1 or more). If it has not, the process returns to step S102, and frames continue to be encoded with no restriction on the use of the prediction modes defined as available.

If it is determined in step S106 that the value of n has exceeded the threshold N1, the process proceeds to step S107, where a prediction mode correspondence rate table is generated. This table describes the correlation of optimal prediction modes between layers and has the data structure described later.

Then, in step S108, the variable n is set to 1. In the following step S109, it is determined whether all frames have been encoded, and if so, the processing ends.

If it is determined in step S109 that not all frames have been encoded, the process proceeds to step S110, where one unprocessed frame is selected in order from the first frame. In the following step S111, the selected frame is encoded by performing prediction while narrowing down the prediction mode search candidates using the prediction mode correspondence rate table.

Then, in step S112, the value of the variable n is incremented by one. In the following step S113, it is determined whether the value of n has exceeded a predetermined threshold N2 (N2 is an integer of 1 or more). If it has not, the process returns to step S109, and frames continue to be encoded while the prediction mode search candidates are narrowed down using the prediction mode correspondence rate table.

If it is determined in step S113 that the value of n has exceeded the threshold N2, it is judged that the prediction mode correspondence rate table needs to be updated, and the process returns to step S101, so that steps S101 to S113 continue while the table is updated.

In this way, when video is scalably encoded, the present invention encodes N1 frames and then, based on the encoding results, generates a prediction mode correspondence rate table describing the correlation of optimal prediction modes between layers. It then encodes the following N2 frames while narrowing down the prediction mode search candidates using the generated table, and repeats this cycle.
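The control flow of steps S101 to S113 can be sketched roughly as below. The function `encode_sequence` and its three callbacks are hypothetical stand-ins for the encoder, not part of the patent or of JSVM; the sketch only shows how N1 unrestricted frames alternate with N2 narrowed frames.

```python
from typing import Callable, List, Sequence

_DONE = object()  # sentinel marking the end of the frame sequence


def encode_sequence(
    frames: Sequence[object],
    n1: int,
    n2: int,
    encode_full: Callable[[object], None],
    build_table: Callable[[], object],
    encode_narrowed: Callable[[object, object], None],
) -> List[str]:
    """Alternate N1 unrestricted frames with N2 narrowed frames (S101-S113)."""
    if n1 < 1 or n2 < 1:
        raise ValueError("N1 and N2 must be integers of 1 or more")
    log: List[str] = []
    it = iter(frames)
    while True:
        # S101-S106: encode N1 frames with every available mode searched.
        for _ in range(n1):
            frame = next(it, _DONE)
            if frame is _DONE:  # S102: all frames encoded
                return log
            encode_full(frame)
            log.append("full")
        # S107: (re)build the correspondence rate table from those results.
        table = build_table()
        # S108-S113: encode N2 frames with the search narrowed by the table.
        for _ in range(n2):
            frame = next(it, _DONE)
            if frame is _DONE:  # S109: all frames encoded
                return log
            encode_narrowed(frame, table)
            log.append("narrowed")


# Usage with stub callbacks: 7 frames, N1 = 2, N2 = 3.
schedule = encode_sequence(
    frames=list(range(7)),
    n1=2,
    n2=3,
    encode_full=lambda frame: None,
    build_table=lambda: {},
    encode_narrowed=lambda frame, table: None,
)
```

The returned log records which search was used for each frame, which makes the N1/N2 cadence easy to inspect.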

(i) Prediction mode correspondence rate table generation processing
Next, the prediction mode correspondence rate table generation processing executed in step S107 is described. The following description is an example for spatial scalable encoding at an arbitrary ratio in which the layer L-1 video is generated by cropping the layer L video.

In this cropping, the layer L-1 video is generated, for example, by removing the top, bottom, left, and right edge portions of the layer L video. This results in a layer structure in which upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one.

Through the processing of step S104, the table generation target frame B2a shown in FIG. 1 and the frame B'2a directly below it have already been encoded, and their optimal prediction modes have already been selected.

When frames B2a and B'2a are encoded, information on the selected optimal prediction modes is stored in a buffer. The correspondence is then examined between the optimal prediction mode of each macroblock (hereinafter abbreviated as MB) of frame B2a stored in the buffer and the optimal prediction modes of the MBs to which the blocks contained in the spatially corresponding image region of frame B'2a belong.

FIG. 3 shows these spatial relationships. When the layer L-1 video is generated by cropping the layer L video, up to four blocks of frame B'2a can spatially correspond to an MB of frame B2a. Here, as an example, four such blocks are assumed to exist, and they are denoted BLK1 to BLK4.

The prediction mode correspondence rate table is generated by the following two steps.

First, for an MB of frame B2a with a given prediction mode, the prediction modes selected in BLK1 to BLK4 of frame B'2a directly below it are analyzed, and the number of pixels for each corresponding combination is calculated. These pixel counts are entered into the cells of a table like the one shown in FIG. 4 and accumulated. For example, the table indicates that P8×16 was selected for 25 pixels of the image region directly below MBs of frame B2a for which P16×16 was selected.

Next, a prediction mode correspondence rate table like the one shown in FIG. 5 is generated. Each value in FIG. 5 is the pixel count of a cell divided by the cumulative pixel count of the corresponding prediction mode on the horizontal axis of FIG. 4. It indicates, for image regions of frame B'2a in which a given prediction mode was selected, the proportion of each prediction mode selected in the MBs of frame B2a directly above.
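The two steps above can be sketched as follows. The layout is an assumption for illustration: each overlap is given as a tuple (upper-layer MB mode, lower-layer block mode, overlapping pixel count), counts are keyed by the lower-layer mode first, and rates are expressed in percent and normalized over each lower-layer mode, which matches the reading that a rate is the share of each upper-layer mode given a lower-layer mode. The figures themselves are not reproduced here, and the sample numbers other than the 25-pixel case are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (upper-layer MB mode, lower-layer block mode, overlapping pixels)
Overlap = Tuple[str, str, int]


def accumulate_counts(overlaps: Iterable[Overlap]) -> Dict[str, Dict[str, int]]:
    """Step 1: accumulate pixel counts per (lower mode, upper mode) pair."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for upper, lower, pixels in overlaps:
        counts[lower][upper] += pixels
    return {lower: dict(row) for lower, row in counts.items()}


def to_rate_table(counts: Dict[str, Dict[str, int]]) -> Dict[str, Dict[str, float]]:
    """Step 2: divide each cell by the total pixels of its lower-layer mode."""
    table: Dict[str, Dict[str, float]] = {}
    for lower, row in counts.items():
        total = sum(row.values())
        if total == 0:
            continue  # no pixels observed for this lower-layer mode
        table[lower] = {upper: 100.0 * px / total for upper, px in row.items()}
    return table


# Hypothetical overlaps gathered from frames B2a and B'2a.
sample_overlaps = [
    ("P16x16", "P8x16", 25),
    ("Skip", "P8x16", 75),
    ("P16x16", "P16x16", 231),
]
pixel_counts = accumulate_counts(sample_overlaps)
rate_table = to_rate_table(pixel_counts)
```

Counting pixels rather than blocks is what lets the table cope with upper-layer MBs that straddle several lower-layer blocks.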

The optimal prediction modes in frames B2a and B'2a may be selected by the JSVM method described in Non-Patent Document 2 or by a method that narrows down prediction mode search candidates, such as the one described in Patent Document 1. In this example, the table is generated from a single encoded frame in the same temporal layer as the encoding target frame, but this is not a limitation. An encoded frame in a different temporal layer (for example, frame B1) may be used. The table may also be calculated by accumulating over multiple frames (for example, frames B1 and B2a). In short, any frame that has already been encoded in both the encoding target layer and the layer directly below it can serve as a generation target frame for the prediction mode correspondence rate table.

(ii) Narrowing down of prediction mode search candidates using the prediction mode correspondence rate table
Next, the processing executed in step S111, which narrows down the prediction mode search candidates using the prediction mode correspondence rate table, is described.

Prediction mode search candidates are narrowed down for each MB of the encoding target frame B2b according to the prediction mode correspondence rates in the table generated in step S107. The values in the table are regarded as the occurrence probabilities of the prediction modes in the encoding target frame B2b. That is, each cell is regarded as giving the probability that prediction mode i (the prediction mode on the horizontal axis of the table shown in FIG. 5) is optimal in the MB of frame B2b to which the spatially corresponding image region belongs, when optimal prediction mode j (the prediction mode on the vertical axis of the table shown in FIG. 5) has been selected in an image region of frame B'2b.

The encoding target macroblock of the encoding target frame B2b is denoted MB_L, and the blocks of frame B'2b of layer L-1 at the same spatial position as MB_L are denoted BLK1_L-1, BLK2_L-1, BLK3_L-1, and BLK4_L-1.

First, the reference prediction mode to be input on the vertical axis j of the prediction mode correspondence rate table is determined by one of the three methods (a) to (c) below.

Either a single prediction mode or multiple prediction modes can be determined as the reference prediction mode. The examples below assume that a single reference prediction mode is determined.

(a) Prediction mode with the largest cumulative area
Among BLK1_L-1 to BLK4_L-1, the prediction mode with the largest cumulative block area is identified and used as the reference prediction mode for the prediction mode correspondence rate table. If several prediction modes have the same cumulative area, one is chosen by a predetermined method, for example method (b) or (c) described below. The blocks counted toward the cumulative area need not be limited to BLK1_L-1 to BLK4_L-1; blocks surrounding that region within a predetermined range may be included. The areas may also be weighted, for example by distance, before being accumulated.

(b) Prediction mode with the highest priority
Priorities are assigned among the prediction modes in advance, and among the prediction modes that BLK1_L-1 to BLK4_L-1 can take, the one with the highest priority is identified and used as the reference prediction mode for the prediction mode correspondence rate table. Possible ways of assigning priorities include giving higher priority to modes that lead to stricter narrowing, which favors speed, or conversely giving higher priority to modes that lead to less narrowing, which favors coding performance.

(c) Prediction mode referenced in inter-layer prediction
The prediction mode referenced between layers is examined and used as the reference prediction mode for the prediction mode correspondence rate table.
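As one illustration of these options, the sketch below implements method (a) and falls back to a priority order in the spirit of method (b) when cumulative areas tie. The function name, the (mode, pixels) input format, and the example priority order are assumptions; area weighting and neighboring blocks are left out.

```python
from typing import Dict, Sequence, Tuple


def reference_mode_by_area(
    blocks: Sequence[Tuple[str, int]],
    priority: Sequence[str],
) -> str:
    """Pick the mode covering the most pixels; break ties by priority order.

    blocks lists (prediction mode, overlapping pixels) for BLK1..BLK4.
    priority lists modes from highest to lowest priority and must contain
    every mode that appears in blocks.
    """
    if not blocks:
        raise ValueError("at least one lower-layer block is required")
    area: Dict[str, int] = {}
    for mode, pixels in blocks:
        area[mode] = area.get(mode, 0) + pixels
    largest = max(area.values())
    tied = [mode for mode in area if area[mode] == largest]
    return min(tied, key=priority.index)


# Hypothetical priority order and overlaps for one macroblock MB_L.
PRIORITY = ["Skip", "P16x16", "P8x8", "I4x4"]
clear_winner = reference_mode_by_area(
    [("P8x8", 100), ("Skip", 60), ("P8x8", 40), ("I4x4", 56)], PRIORITY
)
tie_broken = reference_mode_by_area([("P16x16", 128), ("Skip", 128)], PRIORITY)
```

Summing areas per mode before comparing matters because two smaller blocks with the same mode can outweigh one larger block with a different mode.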

Once the reference prediction mode to be input on the vertical axis j of the prediction mode correspondence rate table has been determined in this way, the table is looked up with the reference prediction mode to find the probability that each prediction mode is optimal for the encoding target macroblock MB_L. Here, the reference prediction mode is the prediction mode input on the vertical axis of FIG. 5. The prediction mode search candidates are then narrowed down based on these probabilities. Two examples of narrowing are given below.

(A) Narrowing method 1
Narrowing method 1 narrows down the prediction mode search candidates using a prediction mode search candidate narrowing threshold.

In narrowing method 1, a prediction mode search candidate narrowing threshold t% is set, and prediction modes whose correspondence rate is below t% are excluded from the search candidates. The value of t is supplied externally. One possible way to determine it is to run the encoding several times and choose a value that keeps the degradation in coding performance within an acceptable range.

If, at the time the reference prediction mode information is acquired, the probability (correspondence rate) of each prediction mode being optimal in MB_L were read from the prediction mode correspondence rate table and compared with the narrowing threshold, the comparison processing would be cumbersome.

そこで、前もって、予測モード対応率表の対応率を予測モード探索候補絞り込み閾値で閾値処理しておいて、予測モード対応率表の対応率を２値化しておくようにする。 Therefore, in advance, the correspondence rate of the prediction mode correspondence rate table is binarized by thresholding the correspondence rate of the prediction mode correspondence rate table with the prediction mode search candidate narrowing threshold.

FIG. 6 shows the result of narrowing down the prediction mode search candidates when the narrowing threshold is set to 5% for the prediction mode correspondence rate table shown in FIG. 5. In the figure, ○ marks a search candidate and × marks a prediction mode excluded from the search candidates.
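The pre-binarization used in narrowing method 1 can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the mode labels and the rates in `RATE_TABLE` are made-up values (they are not the values of FIG. 5), and the function names are hypothetical.

```python
# Hypothetical correspondence rates in percent, for illustration only.
# Outer key: reference prediction mode of the lower layer (vertical axis j).
# Inner key: prediction mode of the upper layer.
RATE_TABLE = {
    "P16x16": {"P16x16": 62.0, "P16x8": 20.0, "P8x8": 14.0, "I16x16": 4.0},
    "I16x16": {"P16x16": 3.0, "P16x8": 2.0, "P8x8": 5.0, "I16x16": 90.0},
}


def binarize_by_threshold(table, t):
    """Threshold every rate once, ahead of time: True = keep, False = exclude.

    A mode is excluded when its rate is below t percent.
    """
    return {
        ref: {mode: rate >= t for mode, rate in row.items()}
        for ref, row in table.items()
    }


def search_candidates(binary_table, reference_mode):
    """Look up the precomputed row; no rate comparison is needed at this point."""
    return [mode for mode, keep in binary_table[reference_mode].items() if keep]


BINARY_TABLE = binarize_by_threshold(RATE_TABLE, 5.0)
print(search_candidates(BINARY_TABLE, "P16x16"))
print(search_candidates(BINARY_TABLE, "I16x16"))
```

With the table binarized up front, determining the candidates for a macroblock reduces to a single row lookup keyed by the reference prediction mode.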

(B) Narrowing method 2
Narrowing method 2 sets only the prediction mode with the highest prediction mode correspondence rate as a search candidate.

The prediction mode with the highest prediction mode correspondence rate is set as the search candidate. This normally leaves a single prediction mode, but when several prediction modes share the maximum value, all of them are set as search candidates.

If, at the moment the reference prediction mode information is obtained, the probability (correspondence rate) that each prediction mode is the optimal prediction mode for MB_L were read from the prediction mode correspondence rate table and the maximum rate were then identified among them, the identification processing would become cumbersome.

Therefore, the maximum correspondence rate in the prediction mode correspondence rate table is identified in advance, so that the table is already binarized when it is consulted.

FIG. 7 shows the result of narrowing when the prediction mode with the maximum value in the prediction mode correspondence rate table shown in FIG. 5 is set as the prediction mode search candidate. In the figure, ○ marks a search candidate and × marks a prediction mode excluded from the search candidates.
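Narrowing method 2 can be sketched in the same way. The rates and mode labels below are made-up illustration values, and the function name is hypothetical; the only behavior taken from the text is that the maximum is found per reference prediction mode and that ties all remain candidates.

```python
# Hypothetical correspondence rates in percent, for illustration only.
RATES = {
    "P16x16": {"P16x16": 55.0, "P16x8": 25.0, "P8x8": 20.0},
    "P8x8": {"P16x16": 40.0, "P16x8": 20.0, "P8x8": 40.0},  # tie at 40%
}


def binarize_by_max(table):
    """For each reference mode, keep only the mode(s) with the highest rate."""
    result = {}
    for ref, row in table.items():
        best = max(row.values())
        # Every mode that reaches the maximum is kept, so ties survive.
        result[ref] = {mode: rate == best for mode, rate in row.items()}
    return result


MAX_TABLE = binarize_by_max(RATES)
print([m for m, keep in MAX_TABLE["P16x16"].items() if keep])
print([m for m, keep in MAX_TABLE["P8x8"].items() if keep])
```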

[2] Configuration of the present invention
Next, the configuration of the present invention is described.

To speed up the prediction mode search in scalable video encoding that handles a layer structure in which upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one, the scalable video encoding device of the present invention comprises:
(1) generation means for obtaining, based on the optimal prediction modes selected in scalable encoding performed without restricting the use of prediction modes, the occurrence rate of each combination of the optimal prediction mode selected in an upper-layer block and the optimal prediction mode selected in the lower-layer image region directly beneath it, and for generating a correspondence table that describes the relationship between the combinations of optimal prediction modes and their occurrence rates;
(2) acquisition means for acquiring, when an upper-layer block is encoded, information on the optimal prediction modes selected in the lower-layer image region directly beneath that block;
(3) selection means for selecting the optimal prediction mode to be processed from among the optimal prediction modes acquired by the acquisition means;
(4) determination means for extracting effective combinations from the combinations of optimal prediction modes described in the correspondence table, based on the optimal prediction mode selected by the selection means and the occurrence rates described in the correspondence table generated by the generation means, and for determining the upper-layer optimal prediction modes of the extracted combinations as the prediction mode search candidates to be searched in encoding the upper-layer block; and
(5) control means for controlling the encoding so that scalable encoding that restricts the use of prediction modes, executed using the correspondence table, and scalable encoding that does not restrict the use of prediction modes, executed without using the correspondence table, are repeated alternately.

In this configuration, the selection means may select the optimal prediction mode to be processed based on the size of the overlap between the upper-layer block and the lower-layer image regions, based on a preset priority order of prediction modes, or, when one of the modes is a prediction mode referenced in inter-layer prediction, by giving priority to that prediction mode.

Alternatively, the selection means may select all of the optimal prediction modes acquired by the acquisition means as the optimal prediction modes to be processed.

Taking into account that upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one, the generation means may, for each optimal prediction mode selected in an upper-layer block, total the number of pixels of the lower-layer image region directly beneath that block separately for each optimal prediction mode selected in that image region, and generate the correspondence table from the totals.
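The pixel-count aggregation can be sketched as below. The input layout (each upper-layer block paired with the lower-layer modes beneath it and their overlapping pixel counts), the mode labels, and the numbers are all hypothetical. Normalizing each lower-layer mode's row to 100% is also an assumption made for this sketch; it follows the description of comparing rates among combinations that share the same lower-layer mode.

```python
from collections import defaultdict


def build_rate_table(blocks):
    """Build occurrence rates (%) keyed as table[lower_mode][upper_mode].

    blocks: iterable of (upper_mode, [(lower_mode, overlapping_pixels), ...]).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for upper_mode, lower_regions in blocks:
        for lower_mode, pixels in lower_regions:
            # Weight each mode pair by the number of lower-layer pixels involved.
            counts[lower_mode][upper_mode] += pixels

    table = {}
    for lower_mode, row in counts.items():
        total = sum(row.values())
        table[lower_mode] = {up: 100.0 * n / total for up, n in row.items()}
    return table


# Toy input: three upper-layer blocks from an already encoded frame.
blocks = [
    ("P16x16", [("P16x16", 192), ("P8x8", 64)]),
    ("P16x8", [("P16x16", 256)]),
    ("P16x16", [("P16x16", 64), ("I16x16", 192)]),
]
table = build_rate_table(blocks)
print(table["P16x16"])
```

Counting pixels rather than blocks lets a lower-layer region that only partly lies beneath an upper-layer block contribute in proportion to its overlap.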

In the scalable video encoding device of the present invention configured in this way, because upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one, the acquisition means may acquire more than one optimal prediction mode. The selection means takes this into account and selects the optimal prediction mode to be processed from among them.

For example, when one optimal prediction mode to be processed is selected from those acquired by the acquisition means, and only one optimal prediction mode was acquired, that mode is selected. If several were acquired, the selection means selects the optimal prediction mode chosen in the lower-layer image region that overlaps the upper-layer block the most, or selects the mode with the highest priority among them according to a preset priority order of prediction modes, or, if one of them is a prediction mode referenced in inter-layer prediction, selects that prediction mode.

The optimal prediction mode selected by the selection means corresponds to the reference prediction mode described above.
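The three selection rules can be sketched as follows. The function names, the `(mode, overlapping_pixels)` input layout, and the mode labels are hypothetical. The overlap rule here accumulates area per mode, which is one reading of "largest overlap" when several regions beneath the block share a mode.

```python
def select_by_overlap(regions):
    """Pick the mode whose lower-layer regions overlap the block the most."""
    area = {}
    for mode, pixels in regions:
        area[mode] = area.get(mode, 0) + pixels
    return max(area, key=area.get)


def select_by_priority(regions, priority):
    """Pick the first mode in the preset priority order that actually occurs."""
    present = {mode for mode, _ in regions}
    for mode in priority:
        if mode in present:
            return mode
    raise ValueError("no acquired mode appears in the priority list")


def select_inter_layer_first(regions, inter_layer_mode, fallback):
    """Prefer the mode referenced in inter-layer prediction, if it is present."""
    present = {mode for mode, _ in regions}
    if inter_layer_mode in present:
        return inter_layer_mode
    return fallback(regions)


# Toy input: lower-layer regions beneath one upper-layer block.
regions = [("P16x16", 96), ("P8x8", 64), ("P16x16", 48), ("I16x16", 48)]
print(select_by_overlap(regions))
print(select_by_priority(regions, ["I16x16", "P8x8", "P16x16"]))
print(select_inter_layer_first(regions, "P8x8", select_by_overlap))
```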

Following this selection, the determination means looks up the correspondence table using the optimal prediction mode selected by the selection means as a key, and thereby identifies the occurrence rates associated with that mode. From these, it extracts the combinations of optimal prediction modes whose occurrence rate exceeds a predetermined threshold, or the combination with the highest occurrence rate, or a predetermined number of combinations taken in descending order of occurrence rate. It then determines the upper-layer optimal prediction modes of the extracted combinations as the prediction mode search candidates to be searched in encoding the upper-layer block.
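The third extraction rule, taking a predetermined number of combinations in descending order of occurrence rate, can be sketched as below. The row of rates, the mode labels, and the function name are hypothetical illustration values.

```python
def top_n_candidates(row, n):
    """Return the n upper-layer modes with the highest occurrence rates.

    row: occurrence rates (%) for one reference prediction mode,
         keyed by upper-layer prediction mode.
    """
    ranked = sorted(row.items(), key=lambda item: item[1], reverse=True)
    return [mode for mode, _ in ranked[:n]]


# Hypothetical row looked up with the selected reference prediction mode.
row = {"P16x16": 48.0, "P16x8": 12.0, "P8x8": 30.0, "I16x16": 10.0}
print(top_n_candidates(row, 2))
```

Unlike a threshold, a fixed count bounds the number of modes searched per block, which makes the search time per block predictable.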

Furthermore, to make the determination processing efficient, the scalable video encoding device of the present invention may, in advance, narrow down the combinations of optimal prediction modes described in the correspondence table based on the occurrence rate values in the table, thereby extracting the effective combinations, and generate prediction mode correspondence information describing the extracted effective combinations.

In this case, the scalable video encoding device of the present invention comprises:
(1) correspondence table generation means for obtaining, based on the optimal prediction modes selected in scalable encoding performed without restricting the use of prediction modes, the occurrence rate of each combination of the optimal prediction mode selected in an upper-layer block and the optimal prediction mode selected in the lower-layer image region directly beneath it, and for generating a correspondence table that describes the relationship between the combinations of optimal prediction modes and their occurrence rates;
(2) prediction mode correspondence information generation means for narrowing down the combinations of optimal prediction modes described in the correspondence table, based on the occurrence rate values described in the correspondence table generated by the correspondence table generation means, thereby extracting the effective combinations, and for generating prediction mode correspondence information describing the extracted effective combinations;
(3) acquisition means for acquiring, when an upper-layer block is encoded, information on the optimal prediction modes selected in the lower-layer image region directly beneath that block;
(4) selection means for selecting the optimal prediction mode to be processed from among the optimal prediction modes acquired by the acquisition means;
(5) determination means for determining the prediction mode search candidates to be searched in encoding the upper-layer block by referring to the prediction mode correspondence information whose combinations include the optimal prediction mode selected by the selection means; and
(6) control means for controlling the encoding so that scalable encoding that restricts the use of prediction modes, executed using the correspondence table, and scalable encoding that does not restrict the use of prediction modes, executed without using the correspondence table, are repeated alternately.

In this configuration, the prediction mode correspondence information generation means may generate the prediction mode correspondence information in any of the following ways: by extracting as effective, as shown in FIG. 6, the combinations of optimal prediction modes whose occurrence rate exceeds a predetermined threshold; by extracting as effective, as shown in FIG. 7, the combination with the highest occurrence rate from among the combinations that share the same lower-layer optimal prediction mode; or by extracting as effective a predetermined number of combinations taken in descending order of occurrence rate.

The scalable video encoding method of the present invention, realized by the operation of the processing means described above, can also be realized as a computer program. The program may be provided on a suitable computer-readable recording medium or over a network. It is installed when the invention is implemented and realizes the invention by running on control means such as a CPU.

The present invention narrows down the upper-layer prediction mode search candidates by exploiting the correlation of optimal prediction modes between layers in scalable video encoding that handles a layer structure in which upper-layer blocks and lower-layer blocks do not correspond spatially one-to-one, and can therefore reduce the encoding time.

Moreover, when reducing the encoding time by narrowing down the prediction mode search candidates, the present invention performs the narrowing based on the correspondence between the optimal prediction modes of the layers in already encoded frames. This can avoid the risk of the optimal prediction mode being discarded by the narrowing, and so suppresses the loss of coding performance that narrowing the search candidates could otherwise cause.

FIG. 1 is an explanatory diagram showing an example of the frames used to generate the prediction mode correspondence rate table and the frames to be encoded.
FIG. 2 is a flowchart showing the overall flow of the processing of the present invention.
FIG. 3 is an explanatory diagram of the spatial positional relationship between an upper layer and a lower layer.
FIG. 4 is an explanatory diagram of the table created in order to generate the prediction mode correspondence rate table.
FIG. 5 is an explanatory diagram of the prediction mode correspondence rate table.
FIG. 6 is an explanatory diagram of a result of narrowing down the prediction mode search candidates.
FIG. 7 is an explanatory diagram of a result of narrowing down the prediction mode search candidates.
FIG. 8 is a flowchart of the scalable video encoding process executed according to the present invention.
FIG. 9 is a flowchart of the scalable video encoding process executed according to the present invention.
FIG. 10 is a flowchart of the scalable video encoding process executed according to the present invention.
FIG. 11 is a configuration diagram of a scalable video encoding device embodying the present invention.
FIG. 12 is a configuration diagram of a scalable video encoding device embodying the present invention.
FIG. 13 is a configuration diagram of a scalable video encoding device embodying the present invention.
FIG. 14 is a diagram for explaining the effectiveness of the present invention.

The present invention is described in detail below with reference to embodiments.

FIGS. 8 to 10 show flowcharts of the scalable video encoding process executed according to the present invention.

FIG. 8 is the overall flowchart of the scalable video encoding process executed according to the present invention. FIGS. 9 and 10 are each an example of a detailed flowchart of the processing executed in step S201 of the flowchart of FIG. 8.

The scalable video encoding process executed according to the present invention is described in detail below, following these flowcharts.

The encoding process of the present invention applies to the enhancement layers; non-scalable single-layer encoding is applied to the base layer. One example of single-layer encoding is the encoding of the base layer portion in JSVM, the SVC reference encoder cited in Non-Patent Document 2.

First, the processing of steps S201 to S206 executed in the flowchart of FIG. 8 is described.

Step S201: Read the initial values of the prediction mode search candidates for the encoding target macroblock (MB), determine the prediction mode search candidates that will finally be searched for the encoding target MB, and store them in a register. The details of this processing are described later with reference to FIGS. 9 and 10.

Step S202: Read the prediction mode search candidate information stored by step S201 from the register, search each prediction mode search candidate, determine the one optimal prediction mode to be used for encoding, and store that information in a register. One example of how to determine the optimal prediction mode is the method used in JSVM, which takes as optimal the prediction mode that minimizes the coding cost expressed as a linear sum of the code amount and the coding distortion.
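The cost-based choice in step S202 can be sketched as follows, assuming the linear sum takes the common form J = D + λR, where D is the coding distortion, R is the code amount, and λ is a weighting factor. The function name, the mode labels, and the distortion and bit figures are hypothetical; this is not JSVM code.

```python
def best_mode(measurements, lam):
    """Return the candidate mode with the smallest cost J = D + lam * R.

    measurements: {mode: (distortion, bits)} for the search candidates only.
    """
    def cost(mode):
        distortion, bits = measurements[mode]
        return distortion + lam * bits

    return min(measurements, key=cost)


# Made-up (distortion, bits) pairs for three search candidates.
measurements = {
    "P16x16": (1200.0, 40),
    "P8x8": (900.0, 95),
    "I16x16": (1500.0, 30),
}
print(best_mode(measurements, lam=5.0))
print(best_mode(measurements, lam=10.0))
```

Because the cost has to be evaluated once per candidate, every mode removed from the candidate list in step S201 removes one cost evaluation here, which is where the reduction in encoding time comes from.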

Step S203: Read the optimal prediction mode information for the encoding target MB from the register, perform motion compensation in that optimal prediction mode, generate the prediction residual signal, and store it in a buffer.

Step S204: Read the prediction residual signal from the buffer, encode it, and store the encoded data in a buffer. One example of this processing is the sequence of DCT, quantization, and variable-length coding in JSVM, the SVC reference encoder cited in Non-Patent Document 2.

Step S205: Determine whether all MBs have been encoded. If they have, end the encoding process, read the encoded data of each MB and the other required header information from the buffer, and output them as the final encoded data. If not all MBs have been encoded, proceed to step S206.

Step S206: Move to the next encoding target MB and perform the processing of step S201.
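The control flow of steps S201 to S206 can be sketched as a loop over macroblocks. The function and parameter names are hypothetical, and the stand-in callables at the bottom only make the sketch runnable; they do not perform real video coding.

```python
def encode_frame(macroblocks, decide_candidates, search_best_mode,
                 make_residual, encode_residual):
    """Per-macroblock loop corresponding to the flowchart of FIG. 8."""
    encoded = []
    for mb in macroblocks:                         # S205/S206: repeat until all MBs are done
        candidates = decide_candidates(mb)         # S201: narrow the search candidates
        mode = search_best_mode(mb, candidates)    # S202: pick one optimal mode
        residual = make_residual(mb, mode)         # S203: prediction residual signal
        encoded.append(encode_residual(residual))  # S204: encode the residual
    return encoded                                 # S205: output once every MB is encoded


# Toy stand-ins: macroblocks are plain integers and modes are letters.
result = encode_frame(
    macroblocks=[10, 20],
    decide_candidates=lambda mb: ["A", "B"],
    search_best_mode=lambda mb, candidates: candidates[0],
    make_residual=lambda mb, mode: (mode, mb - 1),
    encode_residual=lambda residual: f"{residual[0]}:{residual[1]}",
)
print(result)
```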

Next, the processing of steps S301 to S307 executed in the flowchart of FIG. 9 is described as one example of the specific content of the processing executed in step S201.

Step S301: Read the information that specifies whether the encoding target MB is an MB to which the prediction mode search candidate narrowing of the present invention is applied. If it is, proceed to step S302. If it is not, output the initial values of the prediction mode search candidates as the final prediction mode search candidates.

Step S302: Read, from outside, the information designating the encoded frames to be used for calculating the prediction mode correspondence rate table, and store the prediction mode information of the designated frames in a register.

Step S303: Read the prediction mode information (information on the optimal prediction modes used in encoding) of the frames used for calculating the prediction mode correspondence rate table, calculate the correspondence rates (occurrence rates) of the optimal prediction modes of the encoding target layer and the layer directly beneath it, and store them in a register as the prediction mode correspondence rate table. That is, following the procedure described above, a table for generating the prediction mode correspondence rate table, as shown in FIG. 4, is created, and from it a prediction mode correspondence rate table as shown in FIG. 5 is generated and stored in the register.

Step S304: Read the group of one or more optimal prediction modes in the image region directly beneath the encoding target MB, determine from among them, according to some criterion, the one reference prediction mode to be used on the vertical axis of the prediction mode correspondence rate table, and output it to a register. Examples of applicable criteria are those described above: (a) the prediction mode with the largest accumulated area, (b) the prediction mode with the highest priority, and (c) the prediction mode referenced in inter-layer prediction.

Step S305: Read the portion of the prediction mode correspondence rate table associated with the reference prediction mode and store it in a buffer.

Step S306: Read the prediction mode search candidate narrowing threshold and store it in a register.

Step S307: Read the portion of the prediction mode correspondence rate table associated with the reference prediction mode from the buffer and the prediction mode search candidate narrowing threshold from the register, set only the prediction modes whose correspondence rate (occurrence rate) is greater than or equal to the narrowing threshold as prediction mode search candidates, and store that information in a register. In this setting and storing, only the prediction mode search candidates associated with the optimal prediction mode obtained in encoding the base layer are selected, set, and stored.

In this way, when step S201 of the flowchart of FIG. 8 is executed according to the flowchart of FIG. 9, the prediction mode search candidates are narrowed down in the form shown in FIG. 6, based on a prediction mode correspondence rate table having the data structure shown in FIG. 5.

Next, the processing of steps S401 to S406 executed in the flowchart of FIG. 10 is described as another example of the specific content of the processing executed in step S201.

Step S401: Read the information that specifies whether the encoding target MB is an MB to which the prediction mode search candidate narrowing of the present invention is applied. If it is, proceed to step S402. If it is not, output the initial values of the prediction mode search candidates as the final prediction mode search candidates.

Step S402: Read, from outside, the information designating the encoded frames to be used for calculating the prediction mode correspondence rate table, and store the prediction mode information of the designated frames in a register.

Step S403: Read the prediction mode information (information on the optimal prediction modes used in encoding) of the frames used for calculating the prediction mode correspondence rate table, calculate the correspondence rates (occurrence rates) of the optimal prediction modes of the encoding target layer and the layer directly beneath it, and store them in a register as the prediction mode correspondence rate table. That is, following the procedure described above, a table for generating the prediction mode correspondence rate table, as shown in FIG. 4, is created, and from it a prediction mode correspondence rate table as shown in FIG. 5 is generated and stored in the register.

Step S404: Read the group of one or more optimal prediction modes in the image region directly beneath the encoding target MB, determine from among them, according to some criterion, the one reference prediction mode to be used on the vertical axis of the prediction mode correspondence rate table, and output it to a register. Examples of applicable criteria are those described above: (a) the prediction mode with the largest accumulated area, (b) the prediction mode with the highest priority, and (c) the prediction mode referenced in inter-layer prediction.

Step S405: Read the portion of the prediction mode correspondence rate table associated with the reference prediction mode and store it in a buffer.

Step S406: Read the portion of the prediction mode correspondence rate table associated with the reference prediction mode from the buffer, set only the prediction mode with the highest correspondence rate (occurrence rate) as the prediction mode search candidate, and store that information in a register. In this setting and storing, only the prediction mode search candidates associated with the optimal prediction mode obtained in encoding the base layer are selected, set, and stored.

In this way, when step S201 of the flowchart of FIG. 8 is executed according to the flowchart of FIG. 10, the prediction mode search candidates are narrowed down in the form shown in FIG. 7, based on a prediction mode correspondence rate table having the data structure shown in FIG. 5.

FIGS. 11 to 13 show the configuration of a scalable video encoding apparatus embodying the present invention.

FIG. 11 shows the overall configuration of the scalable video encoding apparatus embodying the present invention, and FIGS. 12 and 13 each show an example of the detailed configuration of the prediction mode search candidate determination unit 102 shown in FIG. 11.

Next, the scalable video encoding apparatus embodying the present invention will be described in detail with reference to these configuration diagrams.

Here, the scalable video encoding apparatus embodying the present invention is a processing apparatus for the enhancement layer; non-scalable single-layer encoding is applied to the base layer. One example of single-layer encoding is the encoding of the base layer portion in JSVM, the SVC reference encoder cited in Non-Patent Document 2.

First, the overall configuration of the scalable video encoding apparatus embodying the present invention will be described with reference to FIG. 11.

Prediction mode search candidate initial value storage unit 101: Reads the initial value of the prediction mode search candidates and outputs it to a register.

Prediction mode search candidate determination unit 102: Reads the initial value of the prediction mode search candidates, determines the prediction mode search candidates that will finally be searched, and outputs information on those candidates to a register. Processing then moves to the optimal prediction mode determination unit 103. The detailed configuration of this unit is described later with reference to FIGS. 12 and 13.

Optimal prediction mode determination unit 103: Reads the prediction mode search candidates from the register, performs a search for each candidate, determines the single optimal prediction mode to be used for encoding, and outputs that information to the optimal prediction mode storage unit 104. One example of a method for determining the optimal prediction mode is the one used in JSVM, in which the prediction mode that minimizes an encoding cost expressed as a linear sum of the code amount and the coding distortion is taken as optimal.
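The cost-minimizing decision described for unit 103 can be sketched as follows. This is a minimal illustration only: the cost form D + λR is one common reading of "a linear sum of the code amount and the coding distortion", and the function name, the `distortion` and `rate` callables, the weight `lam`, and the mode labels are all hypothetical and are not taken from JSVM.

```python
from typing import Callable, Iterable, Optional, Tuple


def select_optimal_mode(
    candidates: Iterable[str],
    distortion: Callable[[str], float],
    rate: Callable[[str], float],
    lam: float,
) -> Tuple[Optional[str], float]:
    """Return the candidate mode with the smallest cost D + lam * R.

    Assumption: the encoding cost is distortion plus lam times code amount.
    """
    best_mode, best_cost = None, float("inf")
    for mode in candidates:
        cost = distortion(mode) + lam * rate(mode)
        if cost < best_cost:  # on ties, the earlier candidate is kept
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Illustrative numbers only (not measured data).
D = {"SKIP": 100.0, "P16x16": 60.0, "P8x8": 40.0}
R = {"SKIP": 1.0, "P16x16": 20.0, "P8x8": 50.0}
result = select_optimal_mode(["SKIP", "P16x16", "P8x8"], D.get, R.get, lam=1.0)
```

Because the loop runs once per candidate, shortening the candidate list in unit 102 directly reduces the number of cost evaluations performed here.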

Prediction residual signal generation unit 105: Reads the optimal prediction mode for the encoding target MB from the optimal prediction mode storage unit 104, performs motion compensation in that mode, generates a prediction residual signal, and outputs it to a buffer.

Prediction residual signal encoding unit 106: Reads the prediction residual signal for the encoding target MB from the buffer, encodes it, and outputs the encoded data to a buffer. One example of this processing is to apply the series of DCT, quantization, and variable-length coding steps of JM, the H.264/AVC reference encoder, or of JSVM, the SVC reference encoder cited in Non-Patent Document 2.

All-MB completion determination unit 107: Determines whether encoding of all MBs has been completed. If it has, the encoding process ends and the final encoded data is output; if it has not, processing moves to the encoding target MB update unit 108.

Encoding target MB update unit 108: Moves to the next encoding target MB and returns processing to the prediction mode search candidate determination unit 102.

Next, an example of the detailed configuration of the prediction mode search candidate determination unit 102 will be described with reference to FIG. 12.

Prediction mode search candidate narrowing target MB designation information storage unit 201: Reads information designating whether or not an MB is one for which prediction mode search candidates are to be narrowed down, and outputs it to a register.

Prediction mode search candidate narrowing target MB determination unit 202: Reads, from the prediction mode search candidate narrowing target MB designation information storage unit 201, the information designating the MBs for which prediction mode search candidates are to be narrowed down, and determines whether the encoding target MB is one of them. If it is, processing moves to the prediction mode correspondence rate table generation unit 206; if it is not, the initial value of the prediction mode search candidates is determined as the final prediction mode search candidates and output.

Prediction mode correspondence rate calculation target frame designation information storage unit 203: Reads information designating the already-encoded frames for which the prediction mode correspondence rate is to be calculated, and outputs it to a register.

Target frame enhancement layer optimal prediction mode storage unit 204: For the frames designated by the information read by the prediction mode correspondence rate calculation target frame designation information storage unit 203, reads the optimal prediction mode information in the encoding target layer and outputs it to a register.

Target frame immediately-lower layer optimal prediction mode storage unit 205: For the frames designated by the information read by the prediction mode correspondence rate calculation target frame designation information storage unit 203, reads the optimal prediction mode information in the layer immediately below the encoding target layer and outputs it to a register.

Prediction mode correspondence rate table generation unit 206: Reads, from the target frame enhancement layer optimal prediction mode storage unit 204, the optimal prediction mode information in the encoding target layer of the frames subject to correspondence rate calculation, and reads, from the target frame immediately-lower layer optimal prediction mode storage unit 205, the optimal prediction mode information in the layer immediately below the encoding target layer of those frames. It then calculates the correspondence rate (occurrence rate) of optimal prediction modes between each macroblock of the encoding target layer and the image region immediately below it, and outputs the result as a prediction mode correspondence rate table to the prediction mode correspondence rate table storage unit 207. That is, following the procedure described above, it creates a table for generating the prediction mode correspondence rate table as shown in FIG. 4 and, based on that table, generates the prediction mode correspondence rate table as shown in FIG. 5 and outputs it to the prediction mode correspondence rate table storage unit 207.
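The table generation performed by unit 206 can be sketched roughly as below. This is only an illustration under stated assumptions: FIGS. 4 and 5 are not reproduced here, so the layout (one row per lower-layer mode, each row normalized to sum to 1) is assumed, pixel counting follows the wording of claim 8, and the function name, the input format, and the mode labels are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Sample = Tuple[str, Dict[str, int]]  # (upper-layer mode, {lower-layer mode: pixels})


def build_rate_table(samples: Iterable[Sample]) -> Dict[str, Dict[str, float]]:
    """Build {lower_mode: {upper_mode: occurrence rate}} from encoded frames.

    Assumption: rates are normalized per lower-layer mode.
    """
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for upper_mode, lower_pixels in samples:
        # Accumulate the lower-layer pixels covered by each lower-layer mode.
        for lower_mode, pixels in lower_pixels.items():
            counts[lower_mode][upper_mode] += pixels

    table: Dict[str, Dict[str, float]] = {}
    for lower_mode, row in counts.items():
        total = sum(row.values())
        table[lower_mode] = {upper: n / total for upper, n in row.items()}
    return table


# Three illustrative upper-layer macroblocks (numbers are made up).
samples = [
    ("P16x16", {"P16x16": 192, "SKIP": 64}),
    ("SKIP", {"SKIP": 256}),
    ("P16x16", {"P16x16": 256}),
]
rate_table = build_rate_table(samples)
```

In this sketch an upper-layer macroblock whose lower-layer region spans two modes contributes to two rows, weighted by the number of pixels each mode covers.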

Reference prediction mode determination unit 208: Reads, from the target frame immediately-lower layer optimal prediction mode storage unit 205, the group of one or more optimal prediction modes in the immediately lower layer, determines from among them, according to a given criterion, the one reference prediction mode to be used on the vertical axis of the prediction mode correspondence rate table, and outputs it to the reference prediction mode storage unit 209. Examples of applicable criteria, as described above, are (a) the prediction mode with the largest cumulative area, (b) the prediction mode with the highest priority, and (c) the prediction mode referred to in inter-layer prediction.
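Criteria (a) and (b) can be illustrated with the short sketch below. The function names, the input shapes, and the mode labels are hypothetical, and the tie-breaking behavior is an assumption of this sketch rather than something the description specifies.

```python
from typing import Dict, List, Sequence


def choose_by_area(lower_pixels: Dict[str, int]) -> str:
    """Criterion (a): the lower-layer mode covering the largest cumulative area.

    On ties, the mode that appears first in the dict is returned (assumption).
    """
    return max(lower_pixels, key=lower_pixels.get)


def choose_by_priority(lower_modes: Sequence[str], priority: List[str]) -> str:
    """Criterion (b): the mode that ranks highest in a preset priority list."""
    return min(lower_modes, key=priority.index)


# A lower-layer region covered by two modes (illustrative numbers).
ref_a = choose_by_area({"SKIP": 64, "P16x16": 192})
ref_b = choose_by_priority(["P8x8", "SKIP"], ["SKIP", "P16x16", "P8x8"])
```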

Prediction mode correspondence rate narrowing threshold storage unit 210: Reads the prediction mode search candidate narrowing threshold and outputs it to a register.

Prediction mode correspondence rate threshold comparison unit 211: Reads the prediction mode correspondence rate table from the prediction mode correspondence rate table storage unit 207, the reference prediction mode from the reference prediction mode storage unit 209, and the prediction mode correspondence rate narrowing threshold from the prediction mode correspondence rate narrowing threshold storage unit 210. It then examines the occurrence probabilities of the optimal prediction modes of the encoding target MB that are associated with the reference prediction mode, sets only the prediction modes whose occurrence probability is equal to or greater than the narrowing threshold as the final prediction mode search candidates, and outputs them.

In this way, the apparatus configuration shown in FIG. 12 narrows down the prediction mode search candidates in the manner shown in FIG. 6, based on the prediction mode correspondence rate table having the data structure shown in FIG. 5.
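The threshold comparison of unit 211 can be sketched as follows, assuming the table is held as a nested mapping from lower-layer reference mode to upper-layer mode to occurrence rate. The function name, the table layout, the mode labels, and the fallback to the initial candidates when the reference mode has no row are assumptions of this sketch.

```python
from typing import Dict, List, Sequence

RateTable = Dict[str, Dict[str, float]]


def narrow_by_threshold(
    table: RateTable,
    reference_mode: str,
    threshold: float,
    initial_candidates: Sequence[str],
) -> List[str]:
    """Keep only candidates whose occurrence rate is >= threshold."""
    row = table.get(reference_mode)
    if not row:
        # Assumption: with no statistics for this mode, search everything.
        return list(initial_candidates)
    return [m for m in initial_candidates if row.get(m, 0.0) >= threshold]


# Illustrative table row for the reference mode "P16x16" (made-up rates).
table = {"P16x16": {"SKIP": 0.30, "P16x16": 0.55, "P8x8": 0.10, "I4x4": 0.05}}
kept = narrow_by_threshold(table, "P16x16", 0.10, ["SKIP", "P16x16", "P8x8", "I4x4"])
```

A higher threshold removes more candidates and so saves more search time, at the risk of discarding the mode that an exhaustive search would have chosen.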

Next, another example of the detailed configuration of the prediction mode search candidate determination unit 102 will be described with reference to FIG. 13.

Prediction mode search candidate narrowing target MB designation information storage unit 301: Reads information designating whether or not an MB is one for which prediction mode search candidates are to be narrowed down, and outputs it to a register.

Prediction mode search candidate narrowing target MB determination unit 302: Reads, from the prediction mode search candidate narrowing target MB designation information storage unit 301, the information designating the MBs for which prediction mode search candidates are to be narrowed down, and determines whether the encoding target MB is one of them. If it is, processing moves to the prediction mode correspondence rate table generation unit 306; if it is not, the initial value of the prediction mode search candidates is determined as the final prediction mode search candidates and output.

Prediction mode correspondence rate calculation target frame designation information storage unit 303: Reads information designating the already-encoded frames for which the prediction mode correspondence rate is to be calculated, and outputs it to a register.

Target frame enhancement layer optimal prediction mode storage unit 304: For the frames designated by the information read by the prediction mode correspondence rate calculation target frame designation information storage unit 303, reads the optimal prediction mode information in the encoding target layer and outputs it to a register.

Target frame immediately-lower layer optimal prediction mode storage unit 305: For the frames designated by the information read by the prediction mode correspondence rate calculation target frame designation information storage unit 303, reads the optimal prediction mode information in the layer immediately below the encoding target layer and outputs it to a register.

Prediction mode correspondence rate table generation unit 306: Reads, from the target frame enhancement layer optimal prediction mode storage unit 304, the optimal prediction mode information in the encoding target layer of the frames subject to correspondence rate calculation, and reads, from the target frame immediately-lower layer optimal prediction mode storage unit 305, the optimal prediction mode information in the layer immediately below the encoding target layer of those frames. It then calculates the correspondence rate (occurrence rate) of optimal prediction modes between each macroblock of the encoding target layer and the image region immediately below it, and outputs the result as a prediction mode correspondence rate table to the prediction mode correspondence rate table storage unit 307. That is, following the procedure described above, it creates a table for generating the prediction mode correspondence rate table as shown in FIG. 4 and, based on that table, generates the prediction mode correspondence rate table as shown in FIG. 5 and outputs it to the prediction mode correspondence rate table storage unit 307.

Reference prediction mode determination unit 308: Reads, from the target frame immediately-lower layer optimal prediction mode storage unit 305, the group of one or more optimal prediction modes in the immediately lower layer, determines from among them, according to a given criterion, the one reference prediction mode to be used on the vertical axis of the prediction mode correspondence rate table, and outputs it to the reference prediction mode storage unit 309. Examples of applicable criteria, as described above, are (a) the prediction mode with the largest cumulative area, (b) the prediction mode with the highest priority, and (c) the prediction mode referred to in inter-layer prediction.

Maximum occurrence rate prediction mode identification unit 310: Reads the prediction mode correspondence rate table from the prediction mode correspondence rate table storage unit 307 and the reference prediction mode from the reference prediction mode storage unit 309, examines the occurrence probabilities of the optimal prediction modes of the encoding target MB for that reference prediction mode, sets the prediction mode with the highest occurrence probability as the final prediction mode search candidate, and outputs it.

In this way, the apparatus configuration shown in FIG. 13 narrows down the prediction mode search candidates in the manner shown in FIG. 7, based on the prediction mode correspondence rate table having the data structure shown in FIG. 5.
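The selection performed by unit 310 reduces to taking the most frequent entry in the table row of the reference mode, as in the sketch below. The function name, the nested-mapping table layout, the mode labels, and the fallback to the initial candidates when the row is missing are assumptions of this sketch.

```python
from typing import Dict, List, Sequence

RateTable = Dict[str, Dict[str, float]]


def narrow_to_most_frequent(
    table: RateTable,
    reference_mode: str,
    initial_candidates: Sequence[str],
) -> List[str]:
    """Return the single upper-layer mode with the highest occurrence rate."""
    row = table.get(reference_mode)
    if not row:
        # Assumption: with no statistics for this mode, search everything.
        return list(initial_candidates)
    return [max(row, key=row.get)]


# Illustrative table row for the reference mode "P16x16" (made-up rates).
table = {"P16x16": {"SKIP": 0.30, "P16x16": 0.55, "P8x8": 0.15}}
only = narrow_to_most_frequent(table, "P16x16", ["SKIP", "P16x16", "P8x8"])
```

Compared with the threshold-based configuration of FIG. 12, this variant always leaves exactly one candidate whenever statistics exist, so the search in unit 103 becomes a single cost evaluation.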

Next, the effectiveness of the present invention will be described.

As described above, in Non-Patent Document 3 the present inventors disclosed an invention that uses the correlation of prediction modes between layers in scalable video coding to narrow down the prediction mode search candidates for macroblocks in the enhancement layer.

Because that invention narrows down the prediction mode search candidates for macroblocks in the enhancement layer, it speeds up scalable video encoding.

However, that invention is applicable only when the image region immediately below and spatially corresponding to the encoding target block has a single optimal prediction mode; it cannot be applied when that region has a plurality of optimal prediction modes.

When resolution scalability by a power of two does not hold between the upper layer and the lower layer, or when the lower-layer video is generated by cropping part of the upper-layer video, the image region immediately below and spatially corresponding to the encoding target block has a plurality of optimal prediction modes, so the invention disclosed in Non-Patent Document 3 cannot be applied.

In contrast, the present invention is applicable even when the image region immediately below and spatially corresponding to the encoding target block has a plurality of optimal prediction modes, and can therefore select the prediction mode of encoding target blocks faster than the invention disclosed in Non-Patent Document 3.

For example, as shown in FIG. 14, suppose that scalable video coding with spatial scalability is performed in which a 1920×1080-pixel video is the upper layer and a 1440×1080-pixel video, generated by scaling the upper layer by 3/4 in the horizontal direction, is the lower layer.

In this case, the invention disclosed in Non-Patent Document 3 is constrained to cases in which the image region immediately below and spatially corresponding to the encoding target block has only one optimal prediction mode, so when encoding the upper layer it can perform the fast prediction mode search for only 1/2 of the total number of macroblocks. In contrast, the present invention can perform it for all macroblocks, so the number of macroblocks for which the prediction mode search is narrowed down is doubled compared with the invention disclosed in Non-Patent Document 3.

That is, as shown in FIG. 14, of the four lower-layer blocks a, b, c, and d, blocks a and d each have only one optimal prediction mode. Block b has two, namely the optimal prediction mode derived for macroblock A on its left and the one derived for macroblock B on its right. Block c likewise has two, namely the optimal prediction mode derived for macroblock B on its left and the one derived for macroblock C on its right.

Consequently, when encoding the upper layer, the invention disclosed in Non-Patent Document 3 can perform the fast prediction mode search only for the two macroblocks located above blocks a and d, whereas the present invention can perform it for all four macroblocks located above blocks a, b, c, and d. The number of macroblocks for which the prediction mode search is narrowed down is therefore doubled compared with the invention disclosed in Non-Patent Document 3.

Similarly, compared with the invention disclosed in Non-Patent Document 3, the present invention increases the number of macroblocks for which the prediction mode search is narrowed down as follows:
(1) by a factor of 9/4 in scalable video coding with spatial scalability between 1920×1080 and 1280×720 pixel sizes;
(2) by a factor of 2 in scalable video coding with spatial scalability between 1920×1080 and 720×480 pixel sizes;
(3) by a factor of 3/2 in scalable video coding with spatial scalability between 1920×1080 and 640×480 pixel sizes.
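These factors can be checked with a small idealized calculation, sketched below. The assumptions are that macroblocks are 16×16 pixels in both layers, that the lower layer is a plain rescaling of the upper layer with no cropping, that a lower-layer region lying entirely inside one lower-layer macroblock carries exactly one optimal prediction mode, and that frame-edge effects are ignored (the alignment pattern is counted over one repeat period). The function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, floor
from typing import Tuple

MB_SIZE = 16  # macroblock width and height in pixels (assumed for both layers)


def single_mode_fraction_1d(upper_len: int, lower_len: int) -> Fraction:
    """Fraction of upper-layer MB positions along one axis whose co-located
    lower-layer span falls inside a single lower-layer macroblock."""
    scale = Fraction(lower_len, upper_len)
    span = scale * MB_SIZE        # lower-layer pixels covered by one upper MB
    period = scale.denominator    # alignment repeats every `period` upper MBs
    single = 0
    for k in range(period):
        start = k * span
        end = start + span
        # The span [start, end) stays inside one lower-layer MB when the MB
        # index of its first pixel equals that of its last covered position.
        if floor(start / MB_SIZE) == ceil(end / MB_SIZE) - 1:
            single += 1
    return Fraction(single, period)


def increase_factor(upper: Tuple[int, int], lower: Tuple[int, int]) -> Fraction:
    """Ratio of all upper-layer MBs to those with a single lower-layer mode."""
    fx = single_mode_fraction_1d(upper[0], lower[0])
    fy = single_mode_fraction_1d(upper[1], lower[1])
    return 1 / (fx * fy)


factors = [
    increase_factor((1920, 1080), lower)
    for lower in [(1440, 1080), (1280, 720), (720, 480), (640, 480)]
]
```

Under these assumptions the four entries of `factors` come out as 2, 9/4, 2, and 3/2, matching the FIG. 14 example and items (1) to (3). Exact counts for a real frame would differ slightly because 1080 is not a multiple of 16.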

Therefore, compared with the invention disclosed in Non-Patent Document 3, the present invention can substantially reduce the computation time required for encoding.

Although the present invention has been described with reference to the illustrated embodiments, it is not limited to them. For example, the embodiments describe the invention as applied to a hierarchical layer structure consisting of a base layer and an enhancement layer, but its application is not limited to such a structure.

In addition, although the embodiments select one prediction mode as the reference prediction mode, a configuration in which a plurality of prediction modes are selected as reference prediction modes is also possible.

The present invention is applicable to scalable video coding that achieves scalability through a layer structure, and applying it reduces encoding time.

101 Prediction mode search candidate initial value storage unit
102 Prediction mode search candidate determination unit
103 Optimal prediction mode determination unit
104 Optimal prediction mode storage unit
105 Prediction residual signal generation unit
106 Prediction residual signal encoding unit
107 All-MB completion determination unit
108 Encoding target MB update unit
201 Prediction mode search candidate narrowing target MB designation information storage unit
202 Prediction mode search candidate narrowing target MB determination unit
203 Prediction mode correspondence rate calculation target frame designation information storage unit
204 Target frame enhancement layer optimal prediction mode storage unit
205 Target frame immediately-lower layer optimal prediction mode storage unit
206 Prediction mode correspondence rate table generation unit
207 Prediction mode correspondence rate table storage unit
208 Reference prediction mode determination unit
209 Reference prediction mode storage unit
210 Prediction mode correspondence rate narrowing threshold storage unit
211 Prediction mode correspondence rate threshold comparison unit
301 Prediction mode search candidate narrowing target MB designation information storage unit
302 Prediction mode search candidate narrowing target MB determination unit
303 Prediction mode correspondence rate calculation target frame designation information storage unit
304 Target frame enhancement layer optimal prediction mode storage unit
305 Target frame immediately-lower layer optimal prediction mode storage unit
306 Prediction mode correspondence rate table generation unit
307 Prediction mode correspondence rate table storage unit
308 Reference prediction mode determination unit
309 Reference prediction mode storage unit
310 Maximum occurrence rate prediction mode identification unit

Claims

A scalable video encoding method for encoding a video in a scalable manner,
A step of determining, based on information on the optimal prediction modes selected in scalable encoding performed without restricting the use of prediction modes, the occurrence rate of each combination of optimal prediction modes selected in an upper-layer block and in the lower-layer image region immediately below it, and generating a correspondence table describing the correspondence between the combinations of optimal prediction modes and their occurrence rates,
When encoding a block of the upper layer, a process of obtaining information on the optimal prediction mode selected in the image region of the lower layer immediately below the block,
Selecting an optimum prediction mode to be processed from the obtained optimum prediction modes;
A step of extracting, based on the selected optimal prediction mode and the occurrence rates described in the correspondence table, an effective combination from among the combinations of optimal prediction modes described in the correspondence table, and determining the upper-layer optimal prediction mode of the extracted combination as a prediction mode search candidate to be searched in encoding the upper-layer block,
A scalable video encoding method characterized by the above.

A scalable video encoding method for encoding a video in a scalable manner,
A step of determining, based on information on the optimal prediction modes selected in scalable encoding performed without restricting the use of prediction modes, the occurrence rate of each combination of optimal prediction modes selected in an upper-layer block and in the lower-layer image region immediately below it, and generating a correspondence table describing the correspondence between the combinations of optimal prediction modes and their occurrence rates,
A step of extracting effective combinations of optimal prediction modes by narrowing down, based on the occurrence rate values, the combinations of optimal prediction modes described in the correspondence table, and generating prediction mode correspondence information describing the extracted effective combinations,
When encoding a block of the upper layer, a process of obtaining information on the optimal prediction mode selected in the image region of the lower layer immediately below the block,
Selecting an optimum prediction mode to be processed from the obtained optimum prediction modes;
A step of determining a prediction mode search candidate to be searched by encoding a block of a higher layer by referring to the prediction mode correspondence information having the selected optimum prediction mode in combination,
A scalable video encoding method characterized by the above.

The scalable video encoding method according to claim 2,
In the process of generating the prediction mode correspondence information, extracting a combination of optimum prediction modes having an occurrence rate indicating a value larger than a predetermined threshold as effective,
A scalable video encoding method characterized by the above.

The scalable video encoding method according to claim 2,
In the step of generating the prediction mode correspondence information, among the combinations of optimal prediction modes that share the same lower-layer optimal prediction mode, the combination with the highest occurrence rate is extracted as effective, or a predetermined number of combinations selected in descending order of occurrence rate are extracted as effective,
A scalable video encoding method characterized by the above.

The scalable video encoding method according to any one of claims 1 to 4,
In the process of selecting the optimum prediction mode, selecting the optimum prediction mode to be processed based on the size of the overlap between the upper layer block and the lower layer image region,
A scalable video encoding method characterized by the above.

The scalable video encoding method according to any one of claims 1 to 4,
In the process of selecting the optimum prediction mode, selecting the optimum prediction mode to be processed based on the preset priority order of the prediction modes,
A scalable video encoding method characterized by the above.

The scalable video coding method according to any one of claims 1 to 6,
In the process of selecting the optimum prediction mode, if there is a prediction mode to be referred to in inter-layer prediction in the obtained optimum prediction mode, selecting the prediction mode as the optimum prediction mode to be processed;
A scalable video encoding method characterized by the above.

The scalable video coding method according to any one of claims 1 to 7,
In the process of generating the correspondence table, for each of the optimal prediction modes selected in the block of the upper layer, the number of pixels in the image region of the lower layer immediately below the block is determined for each optimal prediction mode selected in the image region. Counting and generating the correspondence table based on the counting result,
A scalable video encoding method characterized by the above.

The scalable video coding method according to any one of claims 1 to 8,
Having a step of performing control so that scalable encoding that restricts the use of prediction modes, executed using the correspondence table, and scalable encoding that does not restrict the use of prediction modes, executed without using the correspondence table, are repeated alternately,
A scalable video encoding method characterized by the above.
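The alternating control above can be pictured as a periodic schedule over frames. The sketch below assumes a fixed number of unrestricted frames (during which a correspondence table would be gathered) followed by a fixed number of restricted frames. The period lengths, the per-frame granularity, and the name `encoding_phase` are hypothetical; the claim only requires that the two kinds of encoding alternate.

```python
def encoding_phase(frame_index, learn_frames=1, restricted_frames=4):
    """Return which kind of encoding applies to a given frame index.

    "unrestricted": all prediction modes are searched (no correspondence table).
    "restricted":   search candidates are limited using the correspondence table.
    """
    if learn_frames < 1 or restricted_frames < 1:
        raise ValueError("both phase lengths must be at least 1")
    period = learn_frames + restricted_frames
    if frame_index % period < learn_frames:
        return "unrestricted"
    return "restricted"


# Schedule for the first six frames with one unrestricted and two restricted frames.
schedule = [encoding_phase(i, learn_frames=1, restricted_frames=2) for i in range(6)]
```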

A scalable video encoding device for encoding a video in a scalable manner,
Means for obtaining, based on the optimum prediction modes selected in scalable encoding performed without restricting the use of prediction modes, the occurrence rate of each combination of the optimum prediction mode selected in an upper-layer block and the optimum prediction mode selected in the image region of the lower layer immediately below it, and for generating a correspondence table describing the correspondence between each combination of optimum prediction modes and its occurrence rate,
Means for acquiring information of the optimal prediction mode selected in the image region of the lower layer immediately below when encoding a block of the upper layer;
Means for selecting an optimum prediction mode to be processed from the obtained optimum prediction modes;
Means for extracting, based on the selected optimum prediction mode and the occurrence rates described in the correspondence table, effective combinations from the combinations of optimum prediction modes described in the correspondence table, and for determining the upper-layer optimum prediction mode of each extracted combination as a prediction mode search candidate to be searched when encoding the upper-layer block,
A scalable video encoding device characterized by comprising the above.

A scalable video encoding device for encoding a video in a scalable manner,
Means for obtaining, based on the optimum prediction modes selected in scalable encoding performed without restricting the use of prediction modes, the occurrence rate of each combination of the optimum prediction mode selected in an upper-layer block and the optimum prediction mode selected in the image region of the lower layer immediately below it, and for generating a correspondence table describing the correspondence between each combination of optimum prediction modes and its occurrence rate,
Means for extracting effective combinations of optimum prediction modes by narrowing down, based on the occurrence rate values, the combinations of optimum prediction modes described in the correspondence table, and for generating prediction mode correspondence information describing the extracted effective combinations;
Means for acquiring information of the optimal prediction mode selected in the image region of the lower layer immediately below when encoding a block of the upper layer;
Means for selecting an optimum prediction mode to be processed from the obtained optimum prediction modes;
Means for determining the prediction mode search candidates to be searched when encoding the upper-layer block, by referring to the entries of the prediction mode correspondence information whose combination includes the selected optimum prediction mode,
A scalable video encoding device characterized by comprising the above.
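The last means of the device claim above, looking up search candidates from the prediction mode correspondence information, might look like the following sketch. It assumes the information is a mapping from a lower-layer mode to a list of upper-layer modes. The fallback to a full search when no entry exists is an assumption added for robustness and is not stated in the claim; the name `search_candidates` and the mode names are likewise hypothetical.

```python
def search_candidates(lower_mode, correspondence_info, all_modes):
    """Return the upper-layer prediction modes to search for one block.

    correspondence_info: dict mapping lower_mode -> list of effective upper modes
    (assumed layout). all_modes: the full set of upper-layer prediction modes.
    """
    candidates = correspondence_info.get(lower_mode)
    if candidates:
        return list(candidates)
    # Assumption: with no usable entry, fall back to searching every mode.
    return list(all_modes)


# Illustrative lookup; the mode names are placeholders.
info = {"P16x16": ["BLSkip", "P16x16"]}
every_mode = ["BLSkip", "P16x16", "P8x8", "IntraBL"]
restricted = search_candidates("P16x16", info, every_mode)
unrestricted = search_candidates("INTRA4x4", info, every_mode)
```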

A scalable video encoding program for causing a computer to execute the scalable video encoding method according to any one of claims 1 to 9.