JPWO2012141221A1

JPWO2012141221A1 - Moving picture coding apparatus, moving picture coding method, moving picture coding program, and computer-readable recording medium

Info

Publication number: JPWO2012141221A1
Application number: JP2013509949A
Authority: JP
Inventors: 天宋; 孝文板東; 隆島本
Original assignee: University of Tokushima
Current assignee: University of Tokushima
Priority date: 2011-04-12
Filing date: 2012-04-11
Publication date: 2014-07-28
Anticipated expiration: 2032-04-11
Also published as: JP5950260B2; WO2012141221A1

Abstract

[Problem] To increase the efficiency of inter prediction coding and reduce the data size. [Solution] The method includes: a step of extracting one or more highly related candidate reference blocks from among a plurality of blocks located at and around the position corresponding to the sub-block in a previously encoded preceding image frame; a step of selecting, from the extracted candidate reference blocks, the block that most closely approximates the sub-block to be encoded as a reference block; a step of generating second reference pixels, according to a plurality of predefined additional prediction modes, from the pixel values constituting the selected reference block; a step of generating second prediction pixels, according to the predetermined intra prediction modes, from the second reference pixels; and a step of performing rate-distortion optimization based on the second prediction pixels. [Selected drawing] FIG. 19

Description

The present invention relates to a moving picture coding apparatus, a moving picture coding method, a moving picture coding program, and a computer-readable recording medium used for encoding and decoding moving pictures.

In recent years, high-quality moving pictures such as DVD and terrestrial digital high-definition broadcasting have spread rapidly. The high bit rates and high resolutions of these moving pictures have dramatically increased the amount of information, making efficient compression of moving picture data essential. H.264/AVC is currently the prevalent coding scheme for such data compression. H.264/AVC is a video coding standard recommended by ITU-T and others in 2003, and it uses a conventional hybrid coding structure based on motion compensation (MC) and the DCT. To reduce temporal and spatial redundancy, H.264/AVC introduces two kinds of prediction modes: intra-picture predictive coding (INTRA predictive coding; hereinafter also "intra prediction") and inter-picture predictive coding (INTER predictive coding, inter-frame prediction; hereinafter also "inter prediction").

Inter prediction is a prediction method that takes the difference between corresponding images in two frames at different times, and it aims to reduce temporal redundancy. To perform motion compensation effectively, a 16×16-pixel block (hereinafter "macroblock") can be encoded with seven block sizes ranging from 4×4 to 16×16 pixels. In addition, multiple frames and 1/2- and 1/4-accuracy prediction are available to further improve coding efficiency.

However, these motion compensation tools can reduce temporal redundancy only for moving pictures with continuous motion. In other words, inter prediction cannot achieve sufficient coding efficiency for moving pictures whose motion changes rapidly. In such cases, the intra prediction modes are used to reduce spatial redundancy. H.264/AVC uses a 4×4 or 16×16 pixel mode for prediction from adjacent blocks. These two modes are pre-encoded by rate-distortion optimization (RDO) processing, and the most efficient mode is selected as the final coding mode. Because the intra modes work well as a complementary tool when the inter modes cannot encode well, this approach succeeds in most cases.
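As an illustration of the rate-distortion optimization mentioned above, the following minimal sketch picks the mode with the lowest cost after trial encoding. The Lagrangian form J = D + λR, the function names, and the numbers are assumptions for illustration only; the text does not give the cost formula.

```python
def rd_cost(distortion, rate_bits, lam):
    """Rate-distortion cost J = D + lambda * R (assumed Lagrangian form)."""
    return distortion + lam * rate_bits


def select_mode(candidates, lam):
    """Return the name of the candidate mode with the lowest cost.

    candidates maps a mode name to a (distortion, rate_bits) pair obtained
    by trial-encoding the block in that mode.
    """
    return min(candidates, key=lambda m: rd_cost(*candidates[m], lam))


# Hypothetical trial-encoding results for one macroblock.
trial = {
    "intra4x4": (1200.0, 96),
    "intra16x16": (1500.0, 40),
    "inter": (900.0, 150),
}
best = select_mode(trial, lam=5.0)
```

With these made-up numbers, a small λ favors the low-distortion inter mode, while a larger λ penalizes its bit count and shifts the choice to an intra mode.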

However, for some moving pictures, redundancy cannot be reduced from either the temporal or the spatial viewpoint. Even in such cases, H.264/AVC must select, from inter prediction and intra prediction, whichever mode offers the relatively best bit rate and image quality, that is, the lowest RDC (Rate-Distortion Cost; hereinafter "rate-distortion cost" or simply "cost"), and the bit rate increases as a result. "Football" is an example of such a moving picture. In footage of a football game, the quick movements of the players make temporal redundancy hard to reduce, and the extremely complex content makes spatial redundancy hard to estimate as well. As a result, the bit rate becomes unstable, and the implementation requirements placed on hardware and software rise accordingly.

Meanwhile, multi-core CPUs and high-performance GPU architectures are being newly developed to meet this growing demand for hardware computing power. Multi-core computing, for example, changes basic concepts in many fields, including multimedia processing. However, using such technology merely concentrates all the conventional video coding tools on the coding process. It therefore does not solve the fundamental problem that complex correlations between adjacent blocks make parallel processing difficult.

Furthermore, work is under way to formulate H.265/AVC (also called HVC: High-performance Video Coding, HEVC: High Efficiency Video Coding, and so on) as a next-generation image compression method, with the goal of achieving about twice the coding efficiency of H.264/AVC. However, such a high compression ratio is not easy to achieve, and many of the currently proposed techniques take the approach of raising the total compression ratio by combining several existing or new compression methods. In other words, it is not easy to achieve a high compression ratio with a single coding method (see, for example, Patent Document 1 and Non-Patent Documents 1 to 9).

Patent Document 1: Japanese Unexamined Patent Application Publication No. 2005-130099

Non-Patent Document 1: ISO/IEC 14496-10:2003, "Information technology, Coding of audio-visual objects, Part 10: Advanced video coding", Dec. 2003.
Non-Patent Document 2: NVIDIA, "NVIDIA CUDA Compute Unified Device Architecture Programming Guide Version 2.3", 2009.
Non-Patent Document 3: T. D. Chuang, Y. H. Chen, C. H. Tsai, Y. J. Chen, and L. G. Chen, "Algorithm and architecture design for intra prediction in H.264/AVC high profile", in Proc. of Picture Coding Symposium, 2007.
Non-Patent Document 4: L. Zhang, S. W. Ma, and W. Gao, "Position dependent linear intra prediction for image coding", in Proc. of IEEE 17th International Conference on Image Processing (ICIP), pp. 2877-2880, 2010.
Non-Patent Document 5: JM14.2 <http://iphome.hhi.de/suehring/tml/>
Non-Patent Document 6: Sakae Okubo (大久保榮) et al., "H.264/AVC Textbook, Revised Third Edition", Impress.
Non-Patent Document 7: Japan Patent Office, "Patent Map by Technical Field, Electricity 14: Digital Moving Picture Compression Technology".
Non-Patent Document 8: JVT of ISO/IEC MPEG & ITU-T VCEG, "H.264/MPEG-4 AVC Reference Software Manual", July 2007.
Non-Patent Document 9: T. Wiegand et al., "Overview of the H.264/AVC Video Coding Standard", IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, July 2003.

The present invention has been made in view of this background. Its main object is to provide a more efficient moving picture coding apparatus, moving picture coding method, moving picture coding program, and computer-readable recording medium.

Means for Solving the Problems and Effects of the Invention

To achieve the above object, a first moving picture coding method of the present invention is a method for encoding moving picture data, and can include: a step of acquiring moving picture data; a step of, when an arbitrary macroblock of an arbitrary current image frame of the acquired moving picture data is encoded sub-block by sub-block, extracting one or more highly related candidate reference blocks from among a plurality of blocks located at and around the position corresponding to the sub-block in a previously encoded preceding image frame; a step of selecting, from the extracted candidate reference blocks, the block that most closely approximates the sub-block to be encoded as a reference block; a step of generating second reference pixels, according to a plurality of predefined additional prediction modes, from the pixel values constituting the selected reference block; a step of generating second prediction pixels, according to the predetermined intra prediction modes, from the second reference pixels; a step of performing rate-distortion optimization based on the second prediction pixels and selecting the intra prediction mode with the lowest rate-distortion cost; and a step of encoding according to the selected intra prediction mode. This achieves more efficient coding with almost no increase in the amount of computation, that is, without changing conventional hardware.
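The reference block selection step above can be sketched as follows. The text only says that the block closest to the sub-block being encoded is chosen; the sum of absolute differences (SAD) used here as the closeness measure, and all function names, are assumptions for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized pixel blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def select_reference_block(target, candidates):
    """Return (index, block) of the candidate closest to the target sub-block."""
    index = min(range(len(candidates)), key=lambda i: sad(target, candidates[i]))
    return index, candidates[index]


# Hypothetical 2x2 blocks keep the example short; real sub-blocks are larger.
target = [[10, 12], [11, 13]]
candidates = [
    [[0, 0], [0, 0]],
    [[10, 11], [12, 13]],
    [[20, 20], [20, 20]],
]
index, reference = select_reference_block(target, candidates)
```

Here the second candidate differs from the target by only two gray levels in total, so it is the one selected.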

In a second moving picture coding method, the step of generating the second prediction pixels also generates prediction pixels obtained according to the predetermined intra prediction modes, and the rate-distortion optimization can select the intra prediction mode with the lowest rate-distortion cost from among both the second prediction pixels and these prediction pixels. Because coding in the intra prediction modes can then exploit the temporal correlation with the preceding image frame, coding with a smaller amount of data than the conventional intra prediction modes can be achieved. In addition, because the method adopts the same algorithm as the existing intra prediction modes, existing hardware can be used, introduction incurs almost no cost disadvantage, and the method can be implemented inexpensively on existing equipment.

In a third moving picture coding method, the step of extracting the one or more candidate reference blocks can select, as the majority mode, the intra prediction mode that was adopted most often among the intra prediction modes used to encode the plurality of blocks, and extract every block that adopted the majority mode as a candidate reference block.
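A minimal sketch of this majority-mode extraction is shown below. The data layout (a list of position and mode pairs) and the function name are hypothetical, and the text does not say how ties between equally frequent modes are resolved; this sketch simply keeps the mode encountered first.

```python
from collections import Counter


def extract_candidates(blocks):
    """Keep every block that was encoded with the most frequently used mode.

    blocks is a list of (position, intra_mode) pairs for the co-located
    block and its neighbours in the previously encoded frame.
    """
    counts = Counter(mode for _, mode in blocks)
    # most_common(1) returns the mode with the highest count; among ties it
    # returns the one seen first (an assumed tie-break, not from the text).
    majority_mode, _ = counts.most_common(1)[0]
    candidates = [pos for pos, mode in blocks if mode == majority_mode]
    return majority_mode, candidates


# Hypothetical neighbourhood: mode 2 was used by three of the five blocks.
neighbourhood = [((0, 0), 2), ((0, 1), 0), ((1, 0), 2), ((1, 1), 1), ((2, 0), 2)]
majority, positions = extract_candidates(neighbourhood)
```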

In a fourth moving picture coding method, the step of generating the second reference pixels can add together the pixels lined up in the vertical direction among the pixel values constituting the selected reference block, and use the average obtained by dividing the sum by the number of pixels as the pixel value of a second reference pixel.

In a fifth moving picture coding method, the step of generating the second reference pixels can add together the pixels lined up in the horizontal direction among the pixel values constituting the selected reference block, and use the average obtained by dividing the sum by the number of pixels as the pixel value of a second reference pixel.

In a sixth moving picture coding method, the step of generating the second reference pixels can add together all of the pixel values constituting the selected reference block, and use the average obtained by dividing the sum by the total number of pixels as the pixel value of a second reference pixel.
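The three averaging rules of the fourth to sixth methods can be sketched as below. The function names and the sample block are hypothetical, and the sketch keeps floating-point averages because this passage does not state how the result is rounded to an integer pixel value.

```python
def column_means(block):
    """Average the pixels lined up vertically: one value per column."""
    rows = len(block)
    return [sum(col) / rows for col in zip(*block)]


def row_means(block):
    """Average the pixels lined up horizontally: one value per row."""
    return [sum(row) / len(row) for row in block]


def block_mean(block):
    """Average all pixels of the block: a single value."""
    pixels = [p for row in block for p in row]
    return sum(pixels) / len(pixels)


# Hypothetical 4x4 reference block.
reference_block = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [14, 24, 34, 44],
    [14, 24, 34, 44],
]
vertical = column_means(reference_block)
horizontal = row_means(reference_block)
overall = block_mean(reference_block)
```

For a 4×4 reference block, the vertical and horizontal rules each yield four second reference pixel values, and the overall rule yields one.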

In a seventh moving picture coding method, the plurality of additional prediction modes can be the inverse operations of the four intra prediction modes for 4×4 blocks of the luminance component defined in the H.264/AVC standard.

In an eighth moving picture coding method, the plurality of additional prediction modes can be the inverse operations of the four intra prediction modes for 16×16 blocks of the luminance component defined in the H.264/AVC standard.

In a ninth moving picture coding method, the inter prediction modes are adopted in preference to the intra prediction modes during encoding, and an intra prediction mode can be adopted when the inter prediction mode has a higher rate-distortion cost than the intra prediction mode.

A tenth moving picture coding method can use original pixel data as the image data to be encoded. This enables macroblock-based parallel encoding while keeping the increase in the amount of computation small.

In an eleventh moving picture coding method, the step of generating the second reference pixels can be a step of generating third reference pixels by weighting the pixel values constituting the selected reference block according to their distance from the pixel of interest. Because the prediction then gives more weight to nearer pixels, more accurate reference pixels can be obtained.
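One possible form of such distance weighting is sketched below. The text requires only that the weights depend on the distance from the pixel of interest and that nearer pixels count more; the specific weight 1 / (1 + d), the function name, and the sample values are assumptions.

```python
def distance_weighted_mean(pixels, distances):
    """Weighted mean in which nearer pixels count more.

    The weight 1 / (1 + d) is an assumed choice; the text only requires
    that weights depend on the distance d from the pixel of interest.
    """
    weights = [1.0 / (1.0 + d) for d in distances]
    total = sum(w * p for w, p in zip(weights, pixels))
    return total / sum(weights)


# Hypothetical case: a bright pixel at distance 0 and a darker one at distance 1.
near_far = distance_weighted_mean([100, 50], [0, 1])
# When all distances are equal, the result reduces to the plain average.
uniform = distance_weighted_mean([10, 20, 30], [1, 1, 1])
```

In the first case the result lies closer to the nearer pixel's value than the unweighted average of 75 would.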

In a twelfth moving picture coding method, the step of selecting the reference block can set, for each sub-block, the leading block as a key block and set position information for that key block, while blocks other than the key block substitute the position information of the key block. Position information is thus assigned to the key block among the reference blocks, while nearby blocks reuse the key block's position information, which suppresses the increase in the number of bits.
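The key block idea can be sketched as follows. The grouping of four blocks per sub-block, the tuple representation of position information, and the function name are hypothetical choices made only to show how reusing the key block's position reduces the number of positions that must be signalled.

```python
def assign_positions(sub_blocks):
    """Signal one position per sub-block and reuse it for the other blocks.

    sub_blocks is a list of sub-blocks, each a list of the reference
    positions found for its blocks. Only the position of the first (key)
    block of each sub-block is signalled; the remaining blocks reuse it.
    """
    signalled = [blocks[0] for blocks in sub_blocks]
    used = [[blocks[0]] * len(blocks) for blocks in sub_blocks]
    return signalled, used


# Hypothetical macroblock with two sub-blocks of four blocks each.
found = [
    [(1, 0), (1, 0), (1, 1), (1, 0)],
    [(0, 2), (0, 2), (0, 2), (0, 2)],
]
signalled, used = assign_positions(found)
```

In this example two positions are signalled instead of eight, at the price of one block in the first sub-block using a position that differs from the one its own search found.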

A thirteenth aspect is a moving picture coding apparatus comprising: moving picture input means for acquiring moving picture data; compression means for compressing the moving picture data input by the moving picture input means; and quantization means for quantizing the compressed data produced by the compression means. The compression means includes: extraction means for, when an arbitrary macroblock of an arbitrary current image frame of the acquired moving picture data is encoded sub-block by sub-block, extracting one or more highly related candidate reference blocks from among a plurality of blocks located at and around the position corresponding to the sub-block in a previously encoded preceding image frame; reference block selection means for selecting, from the candidate reference blocks extracted by the extraction means, the block that most closely approximates the sub-block to be encoded as a reference block; second reference pixel generation means for generating second reference pixels, according to a plurality of predefined additional prediction modes, from the pixel values constituting the reference block selected by the reference block selection means; and second reference prediction pixel generation means for generating second prediction pixels, according to the predetermined intra prediction modes, from the second reference pixels. The apparatus can perform rate-distortion optimization based on the second prediction pixels generated by the second reference prediction pixel generation means and select the intra prediction mode with the lowest rate-distortion cost. This achieves more efficient coding with almost no increase in the amount of computation, that is, without changing conventional hardware.

A fourteenth aspect is a moving picture coding program for encoding moving picture data, which can cause a computer to implement: a function of acquiring moving picture data; a function of, when an arbitrary macroblock of an arbitrary current image frame of the acquired moving picture data is encoded sub-block by sub-block, extracting one or more highly related candidate reference blocks from among a plurality of blocks located at and around the position corresponding to the sub-block in a previously encoded preceding image frame; a function of selecting, from the extracted candidate reference blocks, the block that most closely approximates the sub-block to be encoded as a reference block; a function of generating second reference pixels, according to a plurality of predefined additional prediction modes, from the pixel values constituting the selected reference block; a function of generating second prediction pixels, according to the predetermined intra prediction modes, from the second reference pixels; a function of performing rate-distortion optimization based on the second prediction pixels and selecting the intra prediction mode with the lowest rate-distortion cost; and a function of encoding according to the selected intra prediction mode. This achieves more efficient coding with almost no increase in the amount of computation, that is, without changing conventional hardware.

A fifteenth aspect is a computer-readable recording medium, or a device having the program recorded on it, that stores the above program. Recording media include optical disks, magnetic disks, and magneto-optical disks such as CD-ROM, DVD, Blu-ray (registered trademark), flexible disk, and MO; magnetic media such as magnetic tape; semiconductor memory; and other media capable of storing a program. The program includes not only programs stored on and distributed with the above recording media but also programs distributed by download over a network line such as the Internet. The devices having the program recorded on them include general-purpose or dedicated devices in which the program is implemented in an executable state in the form of software, firmware, or the like. Furthermore, each process or function included in the program may be executed by program software executable on a computer, or the processing of each part may be implemented by hardware such as a predetermined gate array (FPGA, ASIC), or by a mixture of program software and partial hardware modules that implement some elements in hardware.

FIG. 1 is a block diagram showing a moving picture coding apparatus according to an embodiment of the present invention.
FIG. 2 is a schematic diagram showing an example of a macroblock in inter prediction.
FIG. 3 is a schematic diagram showing a macroblock in the 16×16 intra prediction mode.
FIG. 4 is a schematic diagram showing prediction modes 0 to 3 of FIG. 3: FIG. 4(a) shows mode 0, FIG. 4(b) mode 1, FIG. 4(c) mode 2, and FIG. 4(d) mode 3.
FIG. 5 is a schematic diagram showing a macroblock in the 4×4 intra prediction mode.
FIG. 6 is a schematic diagram showing prediction modes 0 to 8 of FIG. 5: FIG. 6(a) shows mode 0, FIG. 6(b) mode 1, FIG. 6(c) mode 2, FIG. 6(d) mode 3, FIG. 6(e) mode 4, FIG. 6(f) mode 5, FIG. 6(g) mode 6, FIG. 6(h) mode 7, FIG. 6(i) mode 8, and FIG. 6(j) the pixel value of each pixel.
FIG. 7 is a set of image diagrams: FIGS. 7(a) and 7(c) show consecutive frame images in a moving picture, and FIGS. 7(b) and 7(d) show the coding schemes of the frame images of FIGS. 7(a) and 7(c), respectively.
FIG. 8 is a set of image diagrams: FIG. 8(a) shows the coding scheme of the frame image in frame n-1, and FIG. 8(b) shows the coding scheme of the frame image in frame n.
FIG. 9 is a schematic diagram showing intra prediction being selected in successive frame images.
FIG. 10 is a schematic diagram showing the pixels constituting a macroblock to be encoded.
FIG. 11 is a graph showing simulation results comparing the efficiency of original pixels and coded pixels for the Bus sequence.
FIG. 12 is a graph showing simulation results comparing the efficiency of original pixels and coded pixels for the Coastguard sequence.
FIG. 13 is a graph showing simulation results comparing the efficiency of original pixels and coded pixels for the Football sequence.
FIG. 14 is a graph showing simulation results comparing the efficiency of original pixels and coded pixels for the Foreman sequence.
FIG. 15 is a graph showing simulation results comparing the efficiency of original pixels and coded pixels for the Mobile sequence.
FIG. 16 is a graph showing simulation results comparing the efficiency of original pixels and coded pixels for the Tempete sequence.
FIG. 17 is a flowchart showing the procedure of the temporal prediction intra mode.
FIG. 18 is a schematic diagram showing how candidate reference pixels are selected.
FIG. 19 is a schematic diagram showing the prediction modes added in the temporal prediction intra mode: FIG. 19(a) shows additional mode 0, FIG. 19(b) additional mode 1, FIG. 19(c) additional mode 2, FIG. 19(d) additional mode 3, FIG. 19(e) additional mode 4, FIG. 19(f) additional mode 5, FIG. 19(g) additional mode 6, FIG. 19(h) additional mode 7, FIG. 19(i) additional mode 8, and FIG. 19(j) the pixel value of each pixel.
FIG. 20 is a graph showing simulation results comparing the efficiency of conventional coding and coding according to the example for the Bus sequence.
FIG. 21 is a graph showing simulation results comparing the efficiency of conventional coding and coding according to the example for the Coastguard sequence.
FIG. 22 is a graph showing simulation results comparing the efficiency of conventional coding and coding according to the example for the Football sequence.
FIG. 23 is a graph showing simulation results comparing the efficiency of conventional coding and coding according to the example for the Foreman sequence.
FIG. 24 is a graph showing simulation results comparing the efficiency of conventional coding and coding according to the example for the Mobile sequence.
FIG. 25 is a graph showing simulation results comparing the efficiency of conventional coding and coding according to the example for the Tempete sequence.
FIG. 26 is an enlarged view showing the candidate reference blocks of additional mode x in the reference macroblock of the preceding frame n-1.
FIG. 27 is a schematic diagram showing the search for the block most similar to the current macroblock among the candidate reference blocks of FIG. 26.
FIG. 28 is a schematic diagram showing the positions that a reference block can take within a macroblock.
FIG. 29 is a schematic diagram showing how the number of generated bits increases when position information is added.
FIG. 30 is a schematic diagram showing an example in which the position information in the macroblocks of frame n-1 and frame n is put in 1:1 correspondence.
FIG. 31 is a schematic diagram showing how key blocks are set for sub-blocks.
FIG. 32 is a graph showing, for each sequence, the calculated proportion of cases in which the reference block and the other blocks have the same position within a sub-block.
FIG. 33 is a graph showing simulation results comparing the efficiency of coding according to Example 2 and coding of Comparative Example 1 for the Bus sequence.
FIG. 34 is a graph showing simulation results comparing the efficiency of coding according to Example 2 and coding of Comparative Example 1 for the Coastguard sequence.
FIG. 35 is a graph showing simulation results comparing the efficiency of coding according to Example 2 and coding of Comparative Example 1 for the Football sequence.
FIG. 36 is a graph showing simulation results comparing the efficiency of coding according to Example 2 and coding of Comparative Example 1 for the Foreman sequence.
FIG. 37 is a graph showing simulation results comparing the efficiency of coding according to Example 2 and coding of Comparative Example 1 for the Mobile sequence.
FIG. 38 is a graph showing simulation results comparing the efficiency of coding according to Example 2 and coding of Comparative Example 1 for the Tempete sequence.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. The embodiments described below merely exemplify a moving picture encoding apparatus, a moving picture encoding method, a moving picture encoding program, and a computer-readable recording medium that embody the technical idea of the present invention; the present invention does not limit these to what is described below. Nor does this specification in any way limit the members recited in the claims to the members of the embodiments. In particular, unless specifically stated otherwise, the dimensions, materials, shapes, relative arrangements, and the like of the components described in the embodiments are not intended to limit the scope of the present invention to them and are merely illustrative examples. The sizes and positional relationships of the members shown in the drawings may be exaggerated for clarity of explanation. In the following description, the same names and reference numerals denote identical or equivalent members, and detailed description of them is omitted as appropriate. Furthermore, regarding the elements constituting the present invention, a plurality of elements may be formed by a single member so that one member serves as several elements, or conversely the function of one member may be shared among a plurality of members.

In this specification, the moving picture encoding apparatus is connected to computers, printers, external storage devices, and other peripheral devices used for operation, control, input/output, display, and other processing. Communication takes place over an electrical connection such as a serial connection (for example IEEE 1394, RS-232x, RS-422, RS-423, RS-485, or USB), a parallel connection, or a network such as 10BASE-T, 100BASE-TX, or 1000BASE-T. The connection is not limited to a wired physical connection; it may be a wireless connection using a wireless LAN such as IEEE802.1x, OFDM, or LTE, radio waves such as Bluetooth (registered trademark), infrared light, optical communication, or the like. Furthermore, a memory card, magnetic disk, optical disk, magneto-optical disk, semiconductor memory, or the like can be used as a recording medium for storing the text and image data to be searched, for building databases, and for storing search-related settings.
(Moving picture encoding device)

FIG. 1 is a block diagram showing an example of a moving picture encoding apparatus 100 that obtains a bitstream from image data by H.264/AVC encoding. The moving picture encoding apparatus 100 includes a motion estimation unit 102, an inter prediction unit 104, an intra prediction unit 106, a transform unit 108, a quantization unit 110, an entropy encoding unit 114, an inverse quantization unit 116, an inverse transform unit 118, an RDO processing unit 119, a filter 120, and a frame memory 122. The apparatus compresses the input moving picture data to reduce its size and encodes it. Because the compression that reduces the data size amounts to predicting the data of a block from surrounding data or from previous frames, it is also called prediction. The input moving picture data consists of a plurality of pictures arranged along the time axis, and each picture consists of a plurality of blocks. The blocks include macroblocks and sub-blocks obtained by dividing a macroblock into two or four in the vertical or horizontal direction. The inter prediction unit 104 also performs motion compensation. The RDO processing unit 119 selects either the inter prediction unit 104 or the intra prediction unit 106.

The moving picture encoding apparatus 100 selects one prediction mode from among a plurality of coding modes (prediction modes), including inter prediction and intra prediction, and encodes the macroblock of the current picture in the selected mode. To do so, the apparatus encodes in every prediction mode belonging to inter prediction and intra prediction, calculates the cost of each, determines the mode with the minimum cost to be the optimum mode, and encodes in that mode. The prediction mode is generally selected on the basis of coding cost, and the function used to evaluate the coding cost can be chosen as appropriate. For example, to reduce the complexity of evaluating the coding cost, a method that predicts the number of generated bits without performing a complete provisional encoding may be adopted.
(Inter prediction)

Inter prediction is performed by the motion estimation unit 102 and the inter prediction unit 104. The motion estimation unit 102 searches the reference picture for a prediction value of the macroblock of the current picture. When the reference block is searched for in units of 1/2 pixel or 1/4 pixel, the inter prediction unit 104 calculates these intermediate pixel values to determine the data values of the reference block.
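The text does not say how the intermediate pixel values are calculated. As a minimal sketch, the following assumes the six-tap luma filter (1, −5, 20, 20, −5, 1) and the rounded two-sample average that H.264/AVC uses for half- and quarter-sample positions; the function names are hypothetical and 8-bit samples are assumed.

```python
def half_sample(e, f, g, h, i, j):
    """Half-sample value between g and h from six consecutive integer samples.

    Assumes the H.264/AVC luma six-tap filter (1, -5, 20, 20, -5, 1); this
    filter is taken from the standard, not from the text above.
    """
    value = e - 5 * f + 20 * g + 20 * h - 5 * i + j
    # Add the rounding offset, divide by 32, and clip to the 8-bit range.
    return max(0, min(255, (value + 16) >> 5))


def quarter_sample(a, b):
    """Quarter-sample value as the rounded average of two neighbouring samples."""
    return (a + b + 1) >> 1
```

A flat region stays flat (`half_sample(100, 100, 100, 100, 100, 100)` gives 100), while a sharp edge can overshoot before clipping, which is why the clip step is needed.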
(Intra prediction)

The intra prediction unit 106, on the other hand, performs intra prediction, which searches for a prediction value of the macroblock of the current picture within the current picture itself. Whether inter prediction or intra prediction is applied to the current macroblock is decided after the costs of all prediction modes have been calculated. That is, the costs are compared, the mode with the smallest cost is determined to be the optimum prediction mode for the block, and the macroblock is encoded in that mode.

Once inter prediction or intra prediction has been performed and the prediction data referred to by the macroblock of the current frame has been found, the prediction data is subtracted from the macroblock of the current picture, and the result is transformed by the transform unit 108 and quantized by the quantization unit 110. To reduce the amount of data to be encoded, what is encoded is the difference obtained by subtracting the motion-estimated reference block from the macroblock of the current frame. The quantized difference values are encoded by the entropy encoding unit 114.
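The subtraction described above can be sketched as follows. This is an illustration only: the transform, quantization, and entropy coding stages are omitted, blocks are plain lists of rows, and the function names are hypothetical.

```python
def residual(current, prediction):
    """Difference block: current block minus its prediction, sample by sample."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


def reconstruct(prediction, residual_block):
    """Inverse of residual(): add the difference back and clip to 8 bits."""
    return [[max(0, min(255, p + r)) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(prediction, residual_block)]
```

When the prediction is good, most residual values are close to zero, which is what makes the later transform and entropy coding stages effective.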

Meanwhile, to obtain the reference picture used for inter prediction, the quantized picture is passed through the inverse quantization unit 116 and the inverse transform unit 118 to restore the current picture. The restored current picture is stored in the frame memory 122 and used for inter prediction of subsequent pictures. Passing the restored picture through the filter 120 yields a picture that corresponds to the original picture with some coding errors.

FIG. 2 shows the variable block sizes that a macroblock can adopt during inter prediction. In H.264/AVC inter prediction, one 16×16 macroblock is divided into 16×16, 16×8, 8×16, or 8×8 blocks. Each 8×8 block can be further divided into smaller 8×4, 4×8, or 4×4 blocks. Motion estimation and compensation are performed on each of the resulting sub-blocks, and a motion vector is determined for each. Performing inter prediction with these various variable block sizes allows effective encoding adapted to the characteristics and motion of the moving picture.
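The partitioning described above can be illustrated with a short sketch that lists the rectangles produced by one partition choice. The function name and the (x, y, width, height) tuple layout are assumptions made for this example.

```python
# Partition shapes named in the text, as (width, height).
MACROBLOCK_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
SUB_PARTITIONS = [(8, 4), (4, 8), (4, 4)]


def split(x, y, width, height, part_w, part_h):
    """Split the rectangle at (x, y) into part_w x part_h blocks in raster order."""
    return [(x + dx, y + dy, part_w, part_h)
            for dy in range(0, height, part_h)
            for dx in range(0, width, part_w)]
```

For example, `split(0, 0, 16, 16, 16, 8)` gives the two 16×8 halves of a macroblock, and `split(8, 8, 8, 8, 4, 4)` gives the four 4×4 blocks of its lower-right 8×8 block. Each returned rectangle would receive its own motion vector.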
(Inter prediction unit 104)

The inter prediction of FIG. 1 is executed by the inter prediction unit 104. Inter prediction, also called inter-frame prediction, is a prediction method that takes the difference between corresponding images in two frames at different times, and its purpose is to reduce temporal redundancy. Inter prediction coding conventionally used prediction with block sizes of 8×8 pixels or larger, but H.264/AVC allows prediction with a 4×4 pixel block size, and motion compensation from a plurality of already encoded reference images gives more accurate prediction than conventional methods. Coding efficiency is thus improved by increasing the number of prediction modes selectable for each block and selecting the mode with the higher prediction efficiency.
(Intra prediction mode)

Next, the conventional intra prediction modes are described in detail with reference to FIGS. 3 to 6. In the H.264/AVC standard, four different intra prediction modes are available for 16×16 blocks, as shown in FIGS. 3 and 4, and nine different intra prediction modes are available for 4×4 blocks, as shown in FIGS. 5 and 6. FIGS. 4(a) to 4(d) show the 16×16 intra prediction modes for the luminance component according to the H.264/AVC standard, and FIGS. 6(a) to 6(i) show the 4×4 intra prediction modes. As shown in FIGS. 4(a) to 4(d), the 16×16 intra prediction modes comprise four modes: the vertical mode (mode 0), the horizontal mode (mode 1), the DC (direct current) mode (mode 2), and the plane mode (mode 3). As shown in FIGS. 6(a) to 6(i), the 4×4 intra prediction modes comprise nine modes: vertical (mode 0), horizontal (mode 1), DC (mode 2), diagonal down-left (mode 3), diagonal down-right (mode 4), vertical-right (mode 5), vertical-left (mode 6), horizontal-up (mode 7), and horizontal-down (mode 8). Here, pixels A to M have already been encoded, and their values are used to predict the values of the adjacent pixels a to p shown in FIG. 6(j).

The prediction value of the target block is obtained by extrapolating adjacent encoded pixels along a plurality of directions, each with its own coefficients. As an example, consider predictive encoding of a target 4×4 block using the vertical prediction mode shown as mode 0 in FIG. 6. First, the pixel values of the current 4×4 block are predicted from the values of pixels A to D adjacent to its upper side. That is, the value of pixel A is used as the predicted value of the four pixels a, e, i, and m in the first column of the current 4×4 block; the value of pixel B for the four pixels b, f, j, and n in the second column; the value of pixel C for the four pixels c, g, k, and o in the third column; and the value of pixel D for the four pixels d, h, l, and p in the fourth column. In this way the values of all pixels a to p are predicted from pixels A to D. Next, the difference between the predicted current 4×4 block and the actual values of the pixels in the original current 4×4 block is obtained, and this difference is encoded.
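The vertical mode walk-through above can be written as a short sketch. Blocks are lists of four rows (a to d, e to h, i to l, m to p); the function names are hypothetical.

```python
def predict_vertical(above):
    """Mode 0 (vertical): copy the upper neighbours [A, B, C, D] down each column."""
    return [list(above) for _ in range(4)]


def vertical_residual(block, above):
    """Difference between the actual 4x4 block and its vertical prediction."""
    prediction = predict_vertical(above)
    return [[actual - predicted for actual, predicted in zip(block_row, pred_row)]
            for block_row, pred_row in zip(block, prediction)]
```

With `above = [10, 20, 30, 40]`, every row of the predicted block is `[10, 20, 30, 40]`, so a block whose columns are nearly constant leaves a residual close to zero.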

Such fixed extrapolation methods give good results for image sequences with simple textures. However, they cannot adapt to image sequences or contexts that have complex edges within a single macroblock.

Table 1 shows the results of a simulation of the tendency for the intra mode to be selected in consecutive frames. For the sequences bus, coastguard, football, foreman, mobile, and tempete, which are used as standard moving picture sources (sequences) of different types, the table shows the correlation with which the intra mode is selected between consecutive frames. In the table, "Intra MB" indicates the number of macroblocks encoded in an intra prediction mode (modes 0 to 8), and "Intra-Intra MB" indicates the number of macroblocks, among the correlated macroblock or the surrounding macroblocks, that were encoded in the same intra prediction mode.

As Table 1 shows, the intra mode has a very strong tendency to persist between consecutive frames. Moreover, an intra-encoded correlated macroblock can also be used in the current frame to predict the reference pixels for encoding in the intra mode.

Moving picture data is compressed and encoded using these prediction modes. In encoding, it is desirable to reduce the data size as far as possible, but it is not easy to compress moving picture data without impairing its quality. In addition, because of the high correlation between adjacent macroblocks, macroblocks must be encoded one at a time in macroblock-number order, that is, sequentially. Macroblock-level parallel encoding therefore cannot be used on a multi-core GPU, and high-performance computing cannot be achieved.

Inter prediction is generally considered superior to intra prediction, so intra prediction is adopted when inter prediction does not give sufficient results. For images with little correlation, however, neither inter coding nor intra coding may reduce the number of bits sufficiently. In tests conducted by the present inventors, it was found that for a moving picture such as footage of a football game, intra coding rather than inter coding is adopted even though consecutive frames contain similar parts. For example, FIGS. 7(a) and 7(c) show consecutive frame images in a moving picture, and FIGS. 7(b) and 7(d) show the coding method used for each frame image. As these figures show, intra coding is adopted for the lawn area even though it hardly changes between the preceding and following frames.

Furthermore, as shown in FIGS. 8(a) and 8(b), comparing the coding method with that of the immediately preceding frame shows that the tendency of the prediction mode rarely changes between consecutive images, that is, the same prediction mode continues to be used. From this, as shown in FIG. 9, when intra prediction is selected it is highly likely that intra prediction was also adopted at the same position or in its vicinity in the previous frame image. When the proportion of consecutive intra prediction was actually examined for various moving picture sources, it was found that, as shown in Table 1, intra prediction continues with a very high probability for some types of moving picture. This led to the idea that, for moving pictures in which intra prediction tends to be selected, more efficient encoding can be achieved by using the result of intra coding of the previous frame. In other words, efficiency can be improved further by taking temporal correlation into account in addition to the correlation within the same image that conventional intra coding exploits. The present embodiment is an independent prediction method that serves as an intra mode enabling parallel processing of block encoding. Specifically, coding efficiency is improved by introducing a new temporal prediction intra mode in addition to the existing intra modes. The algorithm that achieves this fast parallel encoding and high coding efficiency is described in detail below.

In the present embodiment, it is preferable to use original pixels instead of encoded reference pixels. FIG. 10 shows the configuration of the reference pixels used here. As shown in the figure, pixels C0 to C15 constituting the macroblock are the encoding target, and the adjacent pixels A to M are assumed to have already been encoded. In general, intra-mode reference data uses the encoded adjacent pixels A to M to generate the reference block. In the present embodiment, by contrast, original pixels are used to generate the reference block so that RDO processing of a macroblock can start without waiting for the adjacent pixels to be encoded. However, this inevitably introduces estimation error and degrades encoding performance.

FIGS. 11 to 16 show simulation results comparing performance when original pixels are used with performance when encoded pixels are used. FIG. 11 compares the efficiency of original pixels and encoded pixels for the bus moving picture, FIG. 12 for the coastguard moving picture, FIG. 13 for the football moving picture, FIG. 14 for the foreman moving picture, FIG. 15 for the mobile moving picture, and FIG. 16 for the tempete moving picture. In each graph, the horizontal axis is the bit rate [kbit/s] and the vertical axis is the peak signal-to-noise ratio (PSNR) [dB]. As these figures show, using original pixels causes a slight drop in encoding performance. However, the drop is negligible relative to the overall amount of computation and can be handled adequately even with a conventional hardware configuration. By contrast, the benefit gained from using original image data, namely that macroblock-based parallel encoding becomes possible, is far greater.
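The text does not spell out how the PSNR on the vertical axis is computed. The sketch below assumes the usual definition for 8-bit samples, 10·log10(255² / MSE); the function name and the list-of-rows frame layout are assumptions for illustration.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0
    count = 0
    for orig_row, dec_row in zip(original, decoded):
        for o, d in zip(orig_row, dec_row):
            squared_error += (o - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return float("inf")  # identical frames
    return 10 * math.log10(peak * peak / mse)
```

A rate-distortion curve like those in FIGS. 11 to 16 is then obtained by plotting this value against the bit rate for each QP setting.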

Furthermore, the encoding of Example 1 was applied, using original image data, to the moving picture sequences bus, coastguard, football, foreman, mobile, and tempete, and Table 2 shows the comparison with Comparative Example 1 in terms of bit reduction rate as well as PSNR. As the table shows, the bit rate was reduced for every sequence, with an average reduction of 16.34%, an extremely good bit-rate reduction for a single technique. Further improvement can of course be obtained by combining it with other known encoding techniques.

(Procedure for temporal prediction intra mode)

Next, the temporal prediction intra mode is described with reference to FIGS. 17 and 18. Tests conducted by the present inventors showed that when a macroblock in the previous frame was encoded by intra prediction, the macroblock of the current frame is strongly correlated with it. Therefore, when the correlated macroblock is encoded in the intra mode, efficient encoding is obtained by selecting, in the RDO processing of the current macroblock, the one 4×4 block most similar to the current macroblock. This technique is applied during intra prediction. The procedure for using an already encoded block at the corresponding position in the previous frame n−1 when encoding a macroblock in the current frame n, as shown in the schematic diagram of FIG. 9, is described below based on the flowchart of FIG. 17.
(Step S1: Determination of candidate reference block)

First, in step S1, candidate reference blocks are determined. Specifically, as shown in FIG. 18, attention is paid to the sixteen 4×4 blocks located in the part of the previous frame n−1 that corresponds to, and correlates with, the current block to be encoded in the current frame n (shown in black in FIG. 18). Among these blocks, the intra prediction mode adopted most often is selected as the majority mode (mode i). Next, all blocks that adopted this majority mode are extracted as candidate reference blocks. In the example of FIG. 18, the majority mode is mode 0, and the blocks X1, X2, and X3 encoded in mode 0 are extracted as candidate reference blocks.
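Step S1 can be sketched as a majority vote over the modes of the sixteen co-located blocks. This is an illustration under assumptions: the modes are given as a list in raster order, `None` marks a block that was not intra-encoded, and ties are broken in favour of the mode seen first, which the text does not specify. The function name is hypothetical.

```python
from collections import Counter


def candidate_reference_blocks(modes):
    """Return (majority_mode, indices of the blocks that used it).

    modes: intra prediction mode (0-8) of each co-located 4x4 block in
    frame n-1, or None for a block that was not intra-encoded.
    """
    counts = Counter(m for m in modes if m is not None)
    if not counts:
        return None, []
    majority_mode, _ = counts.most_common(1)[0]
    indices = [i for i, m in enumerate(modes) if m == majority_mode]
    return majority_mode, indices
```

In a case like FIG. 18, where mode 0 is used by three blocks and no other mode by more than two, the function returns mode 0 together with the three block indices, corresponding to X1, X2, and X3.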
(Step S2: Selection of reference block)

Next, in step S2, a reference block is selected from among the candidate reference blocks. Every candidate reference block X1, X2, and X3 is compared with the current block to be encoded, and the block with the smallest difference is taken as the reference block. The difference can be calculated, for example, by the following equation.

In this way, the one reference block most similar to the block to be encoded is selected.
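The equation used for the difference in step S2 is not reproduced in this text. The sketch below therefore assumes the sum of absolute differences (SAD), a common block-matching measure, purely for illustration; the function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks (assumed measure)."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def select_reference_block(current, candidates):
    """Index of the candidate reference block closest to the current block."""
    return min(range(len(candidates)), key=lambda i: sad(current, candidates[i]))
```

Any other distance, such as the sum of squared differences, could be substituted for `sad` without changing the selection logic.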
(Step S3: Generation of second reference pixel)

Next, in step S3, second reference pixels A' to M' are generated from this reference block. The second reference pixels are generated by the reverse of the pixel generation method used in each of prediction modes 0 to 8 of conventional 4×4 intra prediction. That is, as shown in FIG. 19(j), the adjacent second reference pixels A' to M' are generated in reverse from the 4×4 pixel values a' to p' that make up the reference block already obtained. The pixel values of the second reference pixels A' to M' are determined according to the nine additional modes (Ex modes) shown in FIGS. 19(a) to 19(i). Nine sets of second reference pixels A' to M' are therefore generated from the pixel values of FIG. 19(j).

For example, in additional mode 0 shown in FIG. 19(a), the average of the pixels a', e', i', and m' in the first column of the reference block is calculated as the second reference pixel A'. Similarly, the average of the pixels b', f', j', and n' in the second column of the reference block is calculated as the second reference pixel B'. In the same way, the averages of the pixel values in the remaining columns are generated as the second reference pixels C' and D', respectively.

In additional mode 2 shown in FIG. 19(c), the average of the pixel values of all pixels a' to p' of the reference block is assigned to each of the second reference pixels A' to M'. Table 3 shows, as formulas using the pixel values of FIG. 10, how each second reference pixel is calculated in each additional mode. In this way, the second reference pixels A' to M' are generated from the pixel values of the reference block selected in step S2 by the reverse of the existing intra reference pixel generation.
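The two additional modes described in the text can be sketched as follows. Only modes 0 and 2 are shown because the formulas of Table 3 for the other modes are not reproduced here. Rounding to the nearest integer is an assumption, and the function names are hypothetical.

```python
def ex_mode_0(ref):
    """Additional mode 0: A'..D' are the column averages of the 4x4 reference block."""
    # (sum + 2) // 4 rounds the average of four samples to the nearest integer.
    return [(sum(row[col] for row in ref) + 2) // 4 for col in range(4)]


def ex_mode_2(ref):
    """Additional mode 2: all thirteen pixels A'..M' take the mean of the block."""
    mean = (sum(sum(row) for row in ref) + 8) // 16
    return [mean] * 13
```

Mode 0 is the reverse of vertical prediction: instead of copying A to D down the columns, each column is collapsed back into one upper neighbour. Mode 2 is likewise the reverse of DC prediction.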

(Step S4: Generation of prediction pixels)

Then the same processing as in the conventional intra prediction modes is performed on the second reference pixels A' to M' thus obtained. That is, the nine prediction modes 0 to 8 shown in FIGS. 6(a) to 6(j) are executed, and prediction pixels a to p are obtained from the reference pixels A to M. In addition, second prediction pixels a to p are obtained from the second reference pixels A' to M' generated in step S3. Thus, in addition to the nine existing intra prediction modes, second prediction pixels a to p are generated from the second reference pixels A' to M' calculated in each of the nine additional modes shown in FIG. 19.
(Step S5: Rate distortion optimization)

Rate distortion optimization is then performed on the obtained prediction pixels. Specifically, for a total of 18 sets of prediction pixels, from the conventional intra prediction modes 0 to 8 plus the additional modes 0 to 8, a cost is calculated from the difference from the original pixels, and RDO (rate distortion optimization) is carried out. More accurate encoding than before can therefore be expected. In theory the amount of computation grows by the amount added for the additional modes 0 to 8, but because the calculation of the second reference pixels and the calculation of the prediction pixels based on them share a substantial part of their processing, the actual processing load hardly changes. On the contrary, the method is efficient because earlier calculation results can be reused as appropriate. Moreover, since it uses the same algorithm as the existing intra prediction modes, existing hardware can be used, so adoption incurs almost no cost penalty and the method can be implemented inexpensively on existing equipment.
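The selection among the candidate predictions can be sketched as below. The text only says that a cost is computed from the difference from the original pixels; the sum of absolute differences (SAD), the Lagrangian form cost = distortion + lambda * rate, the rate estimates, and all names are assumptions of this sketch rather than the cost function actually used.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def select_mode(original, candidates, lam=1.0):
    """Return (label, cost) of the candidate with the lowest cost.

    candidates maps a label such as ("intra", 2) or ("additional", 0)
    to a pair (predicted_block, estimated_rate_bits).
    """
    best_label, best_cost = None, float("inf")
    for label, (pred, rate_bits) in candidates.items():
        cost = sad(original, pred) + lam * rate_bits
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label, best_cost


def uniform(value):
    """Hypothetical helper: a flat 4x4 block."""
    return [[value] * 4 for _ in range(4)]


# Two of the 18 candidates, with made-up rate estimates.
candidates = {
    ("intra", 2): (uniform(18), 3),
    ("additional", 0): (uniform(21), 5),
}
best_label, best_cost = select_mode(uniform(20), candidates, lam=1.0)
```

In a full encoder the dictionary would hold all 18 entries, nine conventional and nine additional.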

(Simulation results)
The temporal prediction intra mode according to this embodiment was implemented in JM (Non-Patent Document 5), which is used as the reference software for H.264/AVC, and simulations were performed. The test moving images were "Bus", "Coastguard", "Football", "Foreman", "Mobile", and "Tempete". The simulation conditions were as follows.
Codec: JM 14.2
Resolution: CIF
Frame rate: 30 Hz
Number of frames: 65
Profile: High Profile
CABAC: ON
QP setting: 18 to 30
GOP size: 16
In these simulations, only the first frame is encoded as an I frame, and all other frames are encoded as P frames.

Here, encoding performance was evaluated using R-D curves (rate-distortion curves). FIGS. 20 to 25 show the R-D curves obtained in the simulations. FIG. 20 is for the Bus moving image, FIG. 21 for Coastguard, FIG. 22 for Football, FIG. 23 for Foreman, FIG. 24 for Mobile, and FIG. 25 for Tempete; each compares the efficiency of the conventional encoding with that of the encoding according to Example 1. As these figures show, at the same bit rate the temporal prediction intra mode algorithm improves image quality by 1 dB or more on average compared with the conventional JM. At the same image quality, a bit rate reduction of about 15% was achieved. This suggests that, in combination with other compression methods, the reduction of about 50% targeted by H.265/HEVC may be achievable.

As described above, the temporal prediction intra mode was confirmed to be effective as a new parallel processing algorithm for H.264/AVC aimed primarily at GPU implementation. Specifically, to achieve fully macroblock-level parallel processing of intra prediction, original pixel data was used in the RDO processing in place of encoded pixel data. In addition, to compensate for the drawbacks of using spatial prediction or temporal prediction alone, the concept of temporal prediction was introduced into the conventional intra mode. As a result, the simulations showed an improvement in encoding performance of about 1 dB.

In Example 1 above, the additional modes 0 to 8 shown in FIGS. 19A to 19I are used as the method of generating reference pixels. In each of the additional modes 0 to 8, the second reference pixels A' to M' are calculated as reference pixels from the pixel values C0 to C15 shown in FIG. 10, according to the formulas shown in Table 3.

(Weighting)

Weighting the pixel values according to the distance between the target pixel and the reference pixel is expected to give more accurate reference pixels (third reference pixels). Accordingly, a moving image encoding method that adopts, as the additional modes, weighting modes (Wt-MODE) incorporating such weights was carried out as Example 2. Table 4 shows how each of the third reference pixels A'' to M'' is calculated in the weighting modes 0 to 8 used in Example 2.
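The idea of distance-dependent weighting can be illustrated with a weighted variant of the column average. The weights (4, 3, 2, 1), the assumption that the top row is nearest to the reference pixel, the rounding, and the function name are all hypothetical; they are not the formulas of Table 4.

```python
def weighted_column_reference(block, weights=(4, 3, 2, 1)):
    """Weighted column mean giving rows nearer the reference pixel more weight.

    weights[0] applies to the top row, assumed here to be the row closest
    to the reference pixels located above the block.
    """
    total = sum(weights)
    return [
        (sum(w * row[c] for w, row in zip(weights, block)) + total // 2) // total
        for c in range(len(block[0]))
    ]


# Hypothetical reference block values, for illustration only.
weighted_demo_block = [
    [10, 20, 30, 40],
    [12, 22, 32, 42],
    [14, 24, 34, 44],
    [16, 26, 36, 46],
]
third_ref = weighted_column_reference(weighted_demo_block)
```

With equal weights the function reduces to the plain column average, so the result here is pulled toward the top row compared with the unweighted mean.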

(Position information)

In Example 1 described above, however, when the reference block is selected from the candidate reference blocks as shown in FIG. 18, no position information is taken into account to indicate which macroblock position in the previous frame n-1 was selected as the reference block. For example, as shown in FIG. 26, let mode x be the additional mode selected most often among the macroblocks of the previous frame n-1. In the enlarged view of the reference macroblock, the hatched blocks that are encoded in mode x are taken as candidate reference blocks. Then, as shown in FIG. 27, the blocks Ref0 to Ref12 constituting the candidate reference blocks are searched for the block most similar to the current macroblock of the frame n to be encoded. If Ref0 is the most similar block in this figure, for example, Ref0 is selected as the reference block and the position of this reference block is stored. Adding position information in this way enables accurate decoding of the image.
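The search for the most similar candidate, together with recording its position, might look like the sketch below. Using SAD as the similarity measure and an index into the candidate list as the stored position are assumptions; the text does not specify either.

```python
def block_sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def select_reference_block(current, candidate_blocks):
    """Return (index, block) of the candidate most similar to `current`.

    The index stands in for the position information (Ref0, Ref1, ...)
    that is stored so the decoder can locate the same block.
    """
    costs = [block_sad(current, cand) for cand in candidate_blocks]
    best_index = min(range(len(costs)), key=costs.__getitem__)
    return best_index, candidate_blocks[best_index]


def flat_block(value):
    """Hypothetical helper: a flat 4x4 block."""
    return [[value] * 4 for _ in range(4)]


# Three made-up candidates Ref0..Ref2 compared against a current block.
ref_candidates = [flat_block(50), flat_block(23), flat_block(19)]
ref_index, ref_selected = select_reference_block(flat_block(20), ref_candidates)
```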

However, adding position information increases the number of bits accordingly, and the data size grows. In the example of FIG. 28, the reference block can take any of 16 positions, Ref0 to Ref15, so 4 bits (16 = 2^4) are added as position information to each block of the frame n to be encoded. FIG. 29 shows how the data size changes; the hatched portions indicate the bits added by the position information. In particular, as shown in FIG. 30, if the position information of the reference block in the previous frame n-1 and the position information in the frame n to be encoded are stored in one-to-one correspondence, the positional accuracy of the reference block improves but the number of generated bits becomes large. As a result, the final encoding efficiency is reduced.

In Example 2, therefore, position information is added efficiently by consolidating the position information of several macroblocks. Here, the macroblock is divided into sub-blocks, and for each sub-block the leading block is set as a key block. Position information is set for this key block, and the blocks other than the key block use the position information of the key block in place of their own. In the example shown in FIG. 31, the macroblock is divided into 2×2 sub-blocks, and the leading 4×4 block of each sub-block is set as a key block. Position information is thus set with the blocks key0, key1, key2, and key3 as key blocks, and the other blocks in each sub-block use the position information of their key block. With this method the position information is reduced to 1/4, which suppresses the increase in the number of bits caused by adding position information.
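The saving can be checked with a short calculation. The sketch assumes fixed-length coding of the position (4 bits for 16 positions, as in the FIG. 28 example) and sixteen 4×4 blocks per 16×16 macroblock; the function and parameter names are hypothetical.

```python
import math


def position_bits_per_macroblock(n_positions=16, blocks_per_mb=16, blocks_per_key=1):
    """Bits of position information signalled for one macroblock.

    blocks_per_key is the number of 4x4 blocks sharing one key block's
    position (1 means every block carries its own position).
    """
    bits_per_position = math.ceil(math.log2(n_positions))
    signalled_positions = blocks_per_mb // blocks_per_key
    return bits_per_position * signalled_positions


per_block_bits = position_bits_per_macroblock(blocks_per_key=1)
key_block_bits = position_bits_per_macroblock(blocks_per_key=4)
saving_ratio = key_block_bits / per_block_bits
```

Under these assumptions a macroblock carries 64 bits of position information when every block is signalled and 16 bits with four key blocks, matching the reduction to 1/4 described above.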

Tests by the inventors confirmed that reference blocks tend to cluster together in every sequence. FIG. 32 shows, for several standard moving image sources, the proportion of blocks whose reference block position is the same as that of the leading 4×4 block of their sub-block. As the figure shows, the reference block position of the leading block could be reused in 80% or more of cases in every sequence. It was therefore found that encoding can be performed with practically no problem even when position information is attached primarily to the leading block.

FIGS. 33 to 38 show simulation results comparing, for several moving image sequences, the efficiency of the encoding according to Example 2 with that of the encoding according to Comparative Example 1 described above. FIG. 33 is for the Bus moving image, FIG. 34 for Coastguard, FIG. 35 for Football, FIG. 36 for Foreman, FIG. 37 for Mobile, and FIG. 38 for Tempete. In each graph, the horizontal axis is the bit rate [kbit/s] and the vertical axis is the PSNR [dB]. Table 5 lists the PSNR difference (ΔPSNR [dB]) and the bit reduction rate (Δbitrate [%]) of the R-D curves obtained by applying the encoding of Example 2 to the Bus, Football, Foreman, Tempete, Mobile, and Coastguard sequences. These results show that the bit rate was reduced for every moving image, with an average bit rate reduction of 8.14%.
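For reference, the PSNR plotted on the vertical axis is conventionally computed from the mean squared error between the original and decoded pixels. The sketch below uses the standard definition for 8-bit samples (peak value 255); the function name and sample blocks are hypothetical, and how JM averages PSNR over frames and colour components is not covered.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_errors = [(o - d) ** 2
                      for row_o, row_d in zip(original, decoded)
                      for o, d in zip(row_o, row_d)]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return float("inf")  # identical images
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 4x4 luma blocks differing by one level at every pixel.
psnr_original = [[100] * 4 for _ in range(4)]
psnr_decoded = [[101] * 4 for _ in range(4)]
psnr_value = psnr(psnr_original, psnr_decoded)
```

ΔPSNR in Table 5 is then a difference between two such values taken at comparable bit rates.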

The moving image encoding apparatus, moving image encoding method, moving image encoding program, and computer-readable recording medium of the present invention are useful for applications in which moving image signals are compressed for recording or transmission. They can be suitably used in devices that process high-quality moving images, such as mobile terminal devices, moving image capturing devices, and moving image recording and reproducing devices, and in their fields of application.

DESCRIPTION OF SYMBOLS
100 ... Moving image encoding apparatus
102 ... Motion estimation unit
104 ... Inter prediction unit
106 ... Intra prediction unit
108 ... Transform unit
110 ... Quantization unit
114 ... Entropy encoding unit
116 ... Inverse quantization unit
118 ... Inverse transform unit
119 ... RDO processing unit
120 ... Filter
122 ... Frame memory

Claims

A moving image encoding method for encoding moving image data, the method comprising:
obtaining moving image data;
when encoding an arbitrary macroblock constituting an arbitrary current image frame of the obtained moving image data for each sub-block constituting the macroblock, extracting, from among a plurality of blocks located at and around the position corresponding to the sub-block in a previous image frame that has already been encoded, one or more candidate reference blocks that are highly relevant among the plurality of blocks;
selecting, from among the one or more extracted candidate reference blocks, the block that most closely approximates the sub-block to be encoded as a reference block;
generating second reference pixels according to a plurality of predefined additional prediction modes, based on a plurality of pixel values constituting the selected reference block;
generating second prediction pixels according to a predetermined intra prediction mode, based on the second reference pixels;
performing rate distortion optimization based on the second prediction pixels and selecting the intra prediction mode with the lowest rate distortion cost; and
encoding according to the selected intra prediction mode.

The moving image encoding method according to claim 1, wherein
in the step of generating the second prediction pixels, prediction pixels obtained according to the predetermined intra prediction mode are also generated, and
in the rate distortion optimization, the intra prediction mode with the lowest rate distortion cost is selected from candidates that include these prediction pixels in addition to the second prediction pixels.

The moving image encoding method according to claim 1 or 2, wherein
the step of extracting the one or more candidate reference blocks selects, as the most frequent mode, the intra prediction mode used most often among the plurality of intra prediction modes employed in encoding the plurality of blocks, and extracts all blocks that adopt the most frequent mode as candidate reference blocks.

The moving image encoding method according to any one of claims 1 to 3, wherein
the step of generating the second reference pixels uses, as the pixel value of a second reference pixel, the average value obtained by adding the pixel values of the pixels arranged in the vertical direction, among the plurality of pixel values constituting the selected reference block, and dividing the sum by the number of those pixels.

The moving image encoding method according to any one of claims 1 to 3, wherein
the step of generating the second reference pixels uses, as the pixel value of a second reference pixel, the average value obtained by adding the pixel values of the pixels arranged in the horizontal direction, among the plurality of pixel values constituting the selected reference block, and dividing the sum by the number of those pixels.

The moving image encoding method according to any one of claims 1 to 3, wherein
the step of generating the second reference pixels uses, as the pixel value of the second reference pixels, the average value obtained by adding all of the plurality of pixel values constituting the selected reference block and dividing the sum by the total number of pixels.

The moving image encoding method according to any one of claims 1 to 6, wherein
the plurality of additional prediction modes are inverse operations of four intra prediction modes for 4×4 blocks of luminance components defined by the H.264/AVC standard.

The moving image encoding method according to any one of claims 1 to 6, wherein
the plurality of additional prediction modes are inverse operations of four intra prediction modes for 16×16 blocks of luminance components defined by the H.264/AVC standard.

The moving image encoding method according to any one of claims 1 to 8, wherein
in encoding, the inter prediction mode is adopted in preference to the intra prediction mode, and the intra prediction mode is adopted when the inter prediction mode has a higher rate distortion cost than the intra prediction mode.

The moving image encoding method according to any one of claims 1 to 9, wherein
original pixel data is used as the image data to be encoded.

The moving image encoding method according to any one of claims 1 to 10, wherein
the step of generating the second reference pixels is a step of generating third reference pixels in which the plurality of pixel values constituting the selected reference block are weighted according to their distance from the target pixel.

The moving image encoding method according to any one of claims 1 to 11, wherein
in the step of selecting the reference block, for each of the sub-blocks, the leading block is set as a key block, position information is set for the key block, and the blocks other than the key block use the position information of the key block in place of their own.

A moving image encoding device comprising:
moving image input means for acquiring moving image data;
compression means for compressing the moving image data input by the moving image input means; and
quantization means for quantizing the compressed data compressed by the compression means,
wherein the compression means includes:
extracting means for, when encoding an arbitrary macroblock constituting an arbitrary current image frame of the obtained moving image data for each sub-block constituting the macroblock, extracting, from among a plurality of blocks located at and around the position corresponding to the sub-block in a previous image frame that has already been encoded, one or more candidate reference blocks that are highly relevant among the plurality of blocks;
reference block selecting means for selecting, as a reference block, the block that most closely approximates the sub-block to be encoded from among the one or more candidate reference blocks extracted by the extracting means;
second reference pixel generation means for generating second reference pixels according to a plurality of predefined additional prediction modes, based on a plurality of pixel values constituting the reference block selected by the reference block selecting means; and
second reference prediction pixel generation means for generating second prediction pixels according to a predetermined intra prediction mode, based on the second reference pixels,
and wherein rate distortion optimization is performed based on the second prediction pixels generated by the second reference prediction pixel generation means, and the intra prediction mode with the lowest rate distortion cost is selected.

A moving image encoding program for encoding moving image data, the program causing a computer to realize:
a function of acquiring moving image data;
a function of, when encoding an arbitrary macroblock constituting an arbitrary current image frame of the obtained moving image data for each sub-block constituting the macroblock, extracting, from among a plurality of blocks located at and around the position corresponding to the sub-block in a previous image frame that has already been encoded, one or more candidate reference blocks that are highly relevant among the plurality of blocks;
a function of selecting, as a reference block, the block that most closely approximates the sub-block to be encoded from among the one or more extracted candidate reference blocks;
a function of generating second reference pixels according to a plurality of predefined additional prediction modes, based on a plurality of pixel values constituting the selected reference block;
a function of generating second prediction pixels according to a predetermined intra prediction mode, based on the second reference pixels;
a function of performing rate distortion optimization based on the second prediction pixels and selecting the intra prediction mode with the lowest rate distortion cost; and
a function of performing encoding according to the selected intra prediction mode.

A computer-readable recording medium or recording device storing the program according to claim 14.