JP2018078545A

JP2018078545A - Method for selecting prediction mode of intraprediction, video coding device and image processing apparatus

Info

Publication number: JP2018078545A
Application number: JP2017193720A
Authority: JP
Inventors: Chun-Lung Lin; Ching-Chieh Lin; Po-Han Lin
Original assignee: Industrial Technology Research Institute ITRI
Current assignee: Industrial Technology Research Institute ITRI
Priority date: 2016-10-07
Filing date: 2017-10-03
Publication date: 2018-05-17

Abstract

PROBLEM TO BE SOLVED: To provide a method for selecting a prediction mode of intra prediction that improves video encoding efficiency and processing speed while reducing the hardware implementation cost of video encoding, and to provide a video encoding device and an image processing apparatus.

SOLUTION: A method for selecting a prediction mode of intra prediction includes: calculating, while a transform unit operates based on a preset transform index, a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction based on a block; selecting a plurality of candidate prediction modes from the prediction modes based on the prediction costs; calculating, under a plurality of transform indexes, a plurality of distortion costs corresponding to the candidate prediction modes based on the block and the prediction costs corresponding to the candidate prediction modes; and selecting, based on the distortion costs, one of the candidate prediction modes as the prediction mode used for intra prediction of the block.

SELECTED DRAWING: Figure 4

Description

The present invention relates to a method for selecting a prediction mode of intra prediction, a video encoding device, and an image processing apparatus.

With the recent development of new technologies in applications such as networks, communication systems, displays, and computers, many applications, for example those involving high video compression ratios, virtual reality (VR), and 360-degree video content, require efficient video coding solutions. To provide an immersive visual experience, it is common practice to raise the video resolution so that more detail is visible in the video. VR is usually delivered through a head-mounted device (HMD); because the distance between the HMD and the eyes is very short, the required video resolution is preferably raised to 4K to 8K, or even 32K or more. The screen refresh rate also affects the VR experience, so the refresh rate is preferably raised to 30, 90, or even 120 frames per second. Given these requirements, conventional High Efficiency Video Coding (HEVC, also called H.265) cannot provide users with a sufficiently good visual effect and experience.

To further improve coding efficiency and image quality for digital video, the Joint Video Exploration Team (JVET) has applied several enhanced video coding techniques that address potential requirements to the Joint Exploration Test Model (JEM), as an experimental effort to advance video coding technology. The intra prediction technique adopted by JEM extends the 35 prediction modes of conventional HEVC to 67 prediction modes for more accurate angular prediction.

JEM also introduces a mode-dependent non-separable secondary transform (NSST) into the transform unit (TU). NSST can be applied between the primary transform (also called the core transform or first transform) and quantization in the video encoder, and it can also be applied between de-quantization and the inverse primary transform of the video encoder. NSST achieves a better compression ratio for directional texture patterns, but it requires relatively complex computation.

An object of the present invention is to provide a method for selecting a prediction mode of intra prediction, a video encoding device, and an image processing apparatus that improve the efficiency and processing speed of video encoding while reducing the cost of implementing video encoding in hardware.

The method for selecting a prediction mode of intra prediction according to the present invention includes the following steps: calculating, while a transform unit operates based on a preset transform index, a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction based on a block of an input image; selecting a plurality of candidate prediction modes from the prediction modes based on the prediction costs; calculating, under a plurality of transform indexes, a plurality of distortion costs corresponding to the candidate prediction modes based on the block and the prediction costs corresponding to the candidate prediction modes; and selecting, based on the distortion costs, one of the candidate prediction modes as the prediction mode used for intra prediction of the block.

The video encoding device of the present invention includes at least a transform unit and an intra prediction unit. The transform unit transforms the residual corresponding to a block of an input image based on a plurality of transform indexes. The intra prediction unit is coupled to the transform unit. While the transform unit operates based on a preset transform index, the intra prediction unit obtains the block of the input image and calculates, based on the block, a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction. The preset transform index is one of the plurality of transform indexes. The intra prediction unit selects a plurality of candidate prediction modes from the prediction modes based on the prediction costs; calculates, under the plurality of transform indexes of the transform unit, a plurality of distortion costs corresponding to the candidate prediction modes based on the block and the prediction costs corresponding to the candidate prediction modes; and selects, based on the distortion costs, one of the candidate prediction modes as the prediction mode used for intra prediction of the block.

The image processing apparatus of the present invention includes a processor and a memory. While the processor transforms a residual based on a preset transform index, it calculates, based on a block of an input image, a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction. The residual corresponds to the block. The processor selects a plurality of candidate prediction modes from the prediction modes based on the prediction costs, and calculates, under a plurality of transform indexes, a plurality of distortion costs corresponding to the candidate prediction modes based on the block and the prediction costs corresponding to the candidate prediction modes. The preset transform index is one of the plurality of transform indexes. The processor selects, based on the distortion costs, one of the candidate prediction modes as the prediction mode used for intra prediction of the block.

Based on the above, when the intra prediction mode selection method, video encoding device, and image processing apparatus described in the embodiments select a prediction mode of intra prediction, the transform unit is first set to a preset transform index (for example, the transform unit is set to an operation mode in which the second transform unit is disabled and only the first transform unit transforms the residual). The prediction cost corresponding to each prediction mode of intra prediction is then calculated based on the block of the input image, and a plurality of candidate prediction modes are selected from these prediction modes. Next, using the prediction costs corresponding to the candidate prediction modes and the block, the candidate prediction mode with the optimal (for example, the lowest) distortion cost is chosen as the prediction mode to be used.

In other words, the embodiments do not calculate the prediction cost of each prediction mode separately for every operation mode of the transform unit (that is, for every transform index used to transform the residual). Instead, the prediction cost of each prediction mode of intra prediction is calculated once, for the preset operation mode of the transform unit (that is, with the residual transformed based on the preset transform index). These prediction costs are then combined with the cases in which the transform unit transforms the residual under the different transform indexes to calculate the distortion costs, and the subsequent selection among candidate prediction modes is made from those. In this way, the embodiments greatly reduce the amount of prediction cost computation, improve the efficiency and processing speed of video encoding, and reduce the cost of implementing video encoding in hardware.

The intra prediction mode selection method, video encoding device, and image processing apparatus of the present invention greatly reduce the amount of prediction cost computation, improve the efficiency and processing speed of video encoding, and reduce the cost of implementing video encoding in hardware.

FIG. 1 is a structural block diagram of a video encoding device according to an embodiment of the present invention.
FIG. 2 is a block diagram of an image processing apparatus consistent with an embodiment of the present invention.
FIG. 3 is a diagram showing the two stages of intra prediction in the Joint Exploration Test Model (JEM).
FIG. 4 is a flowchart of a method for selecting a prediction mode of intra prediction consistent with an embodiment of the present invention.

To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the drawings.

FIG. 1 is a structural block diagram of a video encoding device 100 according to an embodiment of the present invention. The video encoding device 100 encodes video based on a plurality of input images IM in the obtained input video, which reduces the data volume of the input video and makes it easier to transmit and store. The video coding used by the video encoding device 100 may be the Joint Exploration Test Model (JEM), or any video coding consistent with embodiments of the present invention whose transform stage includes a first transform and a second transform (for example, NSST).

The video encoding device 100 of this embodiment mainly includes a transform and quantization unit 110, an inverse quantization and inverse transform unit 120, a prediction unit 130, an adder 140 located at the input terminal N1 of the video encoding device 100, an adder 150 located at the output terminal N2 of the inverse quantization and inverse transform unit 120, a picture buffer 160, and an entropy coding unit 170. The transform and quantization unit 110 includes a transform unit 112 and a quantization unit 115. The prediction unit 130 includes an intra prediction unit 132 and an inter prediction unit 134. The adder 140 subtracts the information provided by the prediction unit 130 from the input image IM to obtain the residual MR of the input image IM.

In JEM, the transform unit 112 includes a first transform unit 113 and a second transform unit 114. The first transform unit 113 performs a first transform (also called the core transform or primary transform) on the residual MR of the input image IM. The second transform unit 114 performs a second transform on the residual that has undergone the first transform. The second transform here is the mode-dependent non-separable secondary transform (NSST). The residual processing of NSST may depend on the intra prediction mode selected and used by the prediction unit 130 (for example, the intra prediction unit 132). NSST in JEM can have three transform cores, and the intra prediction unit can selectively use these transform cores to improve the efficiency of residual coding. In other words, JEM can either perform residual coding using the first transform together with one of the three NSST transform cores, or disable NSST and perform residual coding using the first transform only.

This embodiment represents the NSST operation modes with a plurality of "transform indexes". One transform index indicates that the transform unit 112 transforms the residual of the current block without using the second transform unit 114; this operation mode is represented by the "preset transform index". Each transform index other than the preset transform index indicates an operation mode in which the transform unit 112 transforms the residual of the current block using one of the transform cores of the second transform unit 114 (the NSST used in the present invention has three transform cores). In other words, the present invention has four transform indexes, which respectively indicate disabling NSST (transform index "0"), performing NSST with the first transform core (transform index "1"), performing NSST with the second transform core (transform index "2"), and performing NSST with the third transform core (transform index "3").
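The four transform indexes described above can be sketched as a small enumeration. This is only an illustration: the names NsstIndex, PRESET_INDEX, and uses_secondary_transform are hypothetical and do not come from the patent or from the JEM software; only the values 0 to 3 and their meanings are taken from the text.

```python
from enum import IntEnum


class NsstIndex(IntEnum):
    """Transform index values as described in the text (names are hypothetical)."""
    DISABLED = 0  # NSST disabled: only the first (primary) transform is applied
    CORE_1 = 1    # NSST performed with the first transform core
    CORE_2 = 2    # NSST performed with the second transform core
    CORE_3 = 3    # NSST performed with the third transform core


# The "preset transform index" of the embodiment is the NSST-disabled mode.
PRESET_INDEX = NsstIndex.DISABLED


def uses_secondary_transform(index: NsstIndex) -> bool:
    """Return True when the second transform unit would be active for this index."""
    return index != PRESET_INDEX
```

For example, uses_secondary_transform(NsstIndex.CORE_2) is True, while the preset index leaves the second transform unit unused.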

The data TD produced by the residual transform in the transform unit 112 is processed by the quantization unit 115 to become data DA, which the entropy coding unit 170 then compresses into video data VD. In addition to the data DA, the video data VD may include the various intra prediction modes and inter prediction modes generated by the prediction unit 130.

To simulate the data as it appears after video encoding, the video encoding device 100 uses the inverse quantization unit 122 and the inverse transform unit 124 of the inverse quantization and inverse transform unit 120 to restore the data DA to encoded image data. After processing by the adder 150 together with the input image IM, this image data is temporarily stored in the picture buffer 160. The encoded image data can be used by the intra prediction unit 132 and the inter prediction unit 134 for mode prediction of the current block.

The intra prediction unit 132 uses blocks that have already been analyzed in the same picture to predict pixel values and transform the residual for the block being processed. The inter prediction unit 134 performs pixel prediction and residual transform on blocks across a plurality of consecutive input images.

Each functional block in FIG. 1 may be implemented in hardware, or as a software program or firmware module. FIG. 2 is a block diagram of an image processing apparatus 200 consistent with an embodiment of the present invention. When the video encoding device 100 in FIG. 1 is implemented as software programs or firmware modules, these may be executed by the processor 210 and the memory 220 of the image processing apparatus 200 to realize the embodiments. The memory 220 can store each software program or firmware module of the video encoding device 100 in the form of instructions. The processor 210 can access the memory 220 to execute these software programs or firmware modules. The processor 210 may be a central processing unit, a graphics processing unit, a microprocessor, a field-programmable logic gate array, or the like.

In the JEM intra prediction technique, two stages determine which intra prediction mode is used for the current block being encoded. FIG. 3 is a diagram showing the two stages of intra prediction in JEM. The first stage ST1 is the rough mode detection (RMD) stage, which consists of two sub-stages, ST11 and ST12. Both sub-stages can be implemented by the intra prediction unit 132 in FIG. 1. Sub-stage ST11 uses the sum of absolute transformed differences (SATD) method to calculate the prediction costs (also called SATD costs) of the intra prediction modes available for the current block (JEM has 35 to 67 intra prediction modes); this sub-stage is referred to here as "calculating the SATD costs of intra prediction". Sub-stage ST12 selects a plurality of candidate prediction modes from these intra prediction modes based on the prediction costs; this sub-stage is referred to here as "selecting candidate prediction modes". Those applying this embodiment may adjust the number of candidate prediction modes as needed, for example by selecting the three to five intra prediction modes with the lowest SATD costs as candidates. This embodiment selects three prediction modes as the candidate prediction modes.
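The RMD stage described above can be sketched as follows. This is a minimal illustration, not the JEM implementation: it assumes a 4x4 block and an unnormalized 4x4 Hadamard transform, and it omits the scaling and mode-bit terms that real encoders typically add to the SATD cost. The function names satd4x4 and select_candidates are hypothetical.

```python
# 4x4 Hadamard matrix (symmetric, so its transpose equals itself).
HADAMARD_4 = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def _matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def satd4x4(block, prediction):
    """Sum of absolute Hadamard-transformed differences for a 4x4 block (sub-stage ST11)."""
    diff = [[block[i][j] - prediction[i][j] for j in range(4)] for i in range(4)]
    transformed = _matmul(_matmul(HADAMARD_4, diff), HADAMARD_4)
    return sum(abs(v) for row in transformed for v in row)


def select_candidates(satd_costs, count=3):
    """Return the `count` modes with the lowest SATD cost (sub-stage ST12).

    satd_costs maps an intra prediction mode number to its SATD cost.
    """
    return sorted(satd_costs, key=satd_costs.get)[:count]
```

With count=3 this matches the embodiment's choice of three candidate prediction modes; a caller could pass any value from 3 to 5 as the text suggests.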

The second stage ST2 is the rate-distortion optimization (RDO) stage, which consists of four sub-stages, ST21 to ST24. Sub-stage ST21 can be implemented by the first transform unit 113 of FIG. 1. Sub-stage ST22 can be implemented by the second transform unit 114 of FIG. 1. Sub-stage ST23 can be implemented by the quantization unit 115 of FIG. 1. Sub-stage ST24 can be implemented by either the intra prediction unit 132 or the quantization unit 115 of FIG. 1. Those applying the embodiments may adjust which functional block implements each sub-stage as needed; the present invention is not limited in this respect.

Sub-stage ST21 performs the first transform (core transform, primary transform) on the current block for each of the candidate prediction modes. To improve coding efficiency, this embodiment then performs the second transform (for example, NSST) in sub-stage ST22 on the residual data of the current block that has undergone the first transform. Sub-stage ST23 quantizes and encodes the residual data of the current block from sub-stage ST22 to calculate the rate-distortion cost (RD cost) of each candidate prediction mode; the present invention uses this rate-distortion cost as the distortion cost. Sub-stage ST24 selects the candidate prediction mode with the optimal rate-distortion cost, balancing the actual number of coded bits against the quantization distortion, as the prediction mode used for intra prediction of the current block; this sub-stage is referred to here as "selecting the prediction mode used by the current block".
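The selection in sub-stage ST24 can be sketched with the commonly used Lagrangian form of the rate-distortion cost, J = D + lambda * R. The text does not state the exact cost formula, so this form, the lambda parameter, and the names rd_cost and select_best_mode are assumptions made for illustration only.

```python
def rd_cost(distortion, bits, lam):
    """Assumed Lagrangian rate-distortion cost: J = D + lambda * R."""
    return distortion + lam * bits


def select_best_mode(candidates, lam):
    """Pick the candidate prediction mode with the lowest RD cost.

    candidates is an iterable of (mode, distortion, bits) tuples, where
    distortion and bits would come from sub-stages ST21 to ST23.
    """
    best = min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))
    return best[0]
```

A larger lambda weights the bit count more heavily, so the selected mode can change with lambda even when the distortion values stay the same.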

In the JEM design, NSST has three transform cores and therefore four operation modes, each represented by a different transform index. Each candidate prediction mode must therefore be evaluated separately under each NSST operation mode. Note in particular that JEM has 67 intra prediction modes and 4 NSST operation modes (represented by NSST transform indexes "0" to "3"). To compute the optimal intra prediction mode accurately, and because different NSST operation modes can change the result of the RDO stage (the choice among candidate prediction modes), JEM runs both the RMD stage ST1 and the RDO stage ST2 for the intra prediction modes under each NSST operation mode; only then is the selected intra prediction mode considered reasonably accurate.

From another perspective, NSST is applied as a preset second transform in intra coding in order to further reduce the number of residual bits. The intra prediction mode selection process over the four NSST transform indexes can be summarized as operations 1 to 8 below.

Operation 1: RMD stage with NSST transform index "0" (select three candidate prediction modes from the 67 intra prediction modes based on SATD cost).

Operation 2: RDO stage with NSST transform index "0" (select the optimal intra prediction mode from the three candidate prediction modes).

Operation 3: RMD stage with NSST transform index "1" (select three candidate prediction modes from the 67 intra prediction modes based on SATD cost).

Operation 4: RDO stage with NSST transform index "1" (select the optimal intra prediction mode from the three candidate prediction modes).

Operation 5: RMD stage with NSST transform index "2" (select three candidate prediction modes from the 67 intra prediction modes based on SATD cost).

Operation 6: RDO stage with NSST transform index "2" (select the optimal intra prediction mode from the three candidate prediction modes).

Operation 7: RMD stage with NSST transform index "3" (select three candidate prediction modes from the 67 intra prediction modes based on SATD cost).

Operation 8: RDO stage with NSST transform index "3" (select the optimal intra prediction mode from the three candidate prediction modes).

As operations 1 to 8 show, even though the SATD method can quickly estimate which intra coding mode gives the lowest cost for a block, the RMD stage must still be run multiple times (operations 1, 3, 5, and 7) to find the three candidate prediction modes with the lowest SATD costs.
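The cost of repeating the RMD stage can be made concrete with a short count. The sketch below follows the simplified description in operations 1 to 8 (every RMD pass evaluates all 67 modes, every RDO pass evaluates 3 candidates); the function names are hypothetical, and an actual encoder may evaluate fewer modes per pass.

```python
NUM_INTRA_MODES = 67       # intra prediction modes in JEM, per the text
NUM_TRANSFORM_INDEXES = 4  # NSST transform indexes "0" to "3"
NUM_CANDIDATES = 3         # candidate prediction modes kept after RMD


def baseline_satd_evaluations(modes=NUM_INTRA_MODES, indexes=NUM_TRANSFORM_INDEXES):
    """SATD evaluations per block when RMD runs once per transform index
    (operations 1, 3, 5, and 7)."""
    return modes * indexes


def baseline_rdo_evaluations(candidates=NUM_CANDIDATES, indexes=NUM_TRANSFORM_INDEXES):
    """RD cost evaluations per block across operations 2, 4, 6, and 8."""
    return candidates * indexes


print(baseline_satd_evaluations())  # 67 * 4 = 268
print(baseline_rdo_evaluations())   # 3 * 4 = 12
```

Under these assumptions the SATD pass accounts for 268 evaluations per block against 12 RD cost evaluations.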

According to the embodiments of the present invention, however, the SATD cost calculation in sub-stage ST11 of FIG. 3 is not directly related to the NSST operation mode. In other words, the NSST operation mode in effect during the SATD cost calculation has little influence on the final video encoding result. The SATD cost of each intra prediction mode can therefore be shared across the NSST transform indexes: the same SATD costs can be used in the subsequent RDO stage for every NSST operation mode. Accordingly, the embodiments calculate the SATD costs of the intra prediction modes only once, with NSST set to the preset transform index (for example, NSST transform index "0"), store these SATD costs temporarily, and eliminate the step of calculating SATD costs with NSST set to the other transform indexes (for example, NSST transform indexes "1" to "3"), which greatly reduces the amount of computation. In other words, the embodiments temporarily store the SATD cost results of operation 1, omit operations 3, 5, and 7, and perform operations 4, 6, and 8 using the SATD costs obtained in operation 1.
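The reuse of the SATD costs from operation 1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented implementation: satd_cost and rd_cost are hypothetical caller-supplied callbacks standing in for sub-stage ST11 and sub-stages ST21 to ST23, and the final choice is assumed to be the overall lowest RD cost across all transform indexes.

```python
def select_intra_mode(modes, transform_indexes, satd_cost, rd_cost, num_candidates=3):
    """Select an intra prediction mode while computing SATD costs only once.

    satd_cost(mode) -> cost, evaluated with NSST at the preset index (operation 1).
    rd_cost(mode, index, satd) -> cost for one mode under one transform index.
    Returns (best_mode, best_transform_index).
    """
    # Operation 1 only: one SATD pass, results kept for later reuse.
    satd = {mode: satd_cost(mode) for mode in modes}
    candidates = sorted(satd, key=satd.get)[:num_candidates]

    # Operations 2, 4, 6, 8: RDO per transform index, reusing the stored
    # candidates and SATD costs instead of rerunning operations 3, 5, 7.
    best = None
    for index in transform_indexes:
        for mode in candidates:
            cost = rd_cost(mode, index, satd[mode])
            if best is None or cost < best[0]:
                best = (cost, mode, index)
    return best[1], best[2]
```

With 67 modes and 4 transform indexes, this sketch calls satd_cost 67 times instead of 268, while the number of rd_cost calls stays at 12.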

FIG. 4 is a flowchart of a method for selecting a prediction mode of intra prediction according to an embodiment of the present invention. The method of FIG. 4 is applicable to the video encoding device 100 of FIG. 1 and the image processing apparatus 200 of FIG. 2. Referring to FIG. 1 and FIG. 4, in step S410 the second transform unit 114 in the transform unit 112 is disabled, that is, the transform index of the second transform unit 114 is set to "0". In step S420, while the second transform unit 114 operates based on this preset transform index, the intra prediction unit 132 uses the sum of absolute transformed differences (SATD) method to calculate, based on the current block of the input image IM, a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction. The prediction costs are SATD costs.
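For readers unfamiliar with SATD, the following is a minimal sketch of one common formulation: the residual between a block and its prediction is passed through a Hadamard transform, and the absolute values of the resulting coefficients are summed. The 4x4 block size, the absence of any normalization factor, and the helper names are assumptions made for illustration; the embodiment does not specify these details.

```python
from typing import List

Matrix = List[List[int]]

# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4: Matrix = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def _matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def satd_4x4(block: Matrix, prediction: Matrix) -> int:
    """Un-normalized SATD of a 4x4 block against its prediction."""
    residual = [[block[i][j] - prediction[i][j] for j in range(4)] for i in range(4)]
    # Two-dimensional Hadamard transform of the residual: H4 * R * H4.
    coeffs = _matmul(_matmul(H4, residual), H4)
    return sum(abs(value) for row in coeffs for value in row)
```

A prediction that matches the block exactly yields a cost of zero, and the cost grows as the transformed residual carries more energy.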

In step S430, the intra prediction unit 132 selects a plurality of candidate prediction modes from the plurality of intra prediction modes (for example, 67 intra prediction modes) based on the prediction costs from step S420. In this embodiment, the best prediction costs are found among the prediction costs corresponding to the 67 intra prediction modes. The number of intra prediction modes is greater than the number of selected candidate prediction modes. For example, the intra prediction modes corresponding to the three lowest prediction costs are found and used as the preselected (candidate) prediction modes.
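The candidate selection in step S430 amounts to picking the modes with the smallest costs. A minimal sketch, assuming the prediction costs are held in a dictionary keyed by mode number (the function name and data layout are illustrative, not taken from the embodiment):

```python
import heapq
from typing import Dict, List


def select_candidates(pred_costs: Dict[int, float], num_candidates: int = 3) -> List[int]:
    """Return the modes with the lowest prediction costs, cheapest first."""
    # Iterating a dict yields its keys (mode numbers); they are ranked by cost.
    return heapq.nsmallest(num_candidates, pred_costs, key=pred_costs.get)


example_costs = {0: 50.0, 1: 20.0, 2: 35.0, 3: 10.0, 4: 90.0}
candidates = select_candidates(example_costs)
```

With the example costs shown, the three selected modes are 3, 1, and 2, in order of increasing cost.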

In step S440, after the candidate prediction modes are selected, the intra prediction unit 132 temporarily stores the prediction costs corresponding to these candidate prediction modes for use in subsequent steps. In some embodiments, the intra prediction unit 132 may temporarily store the prediction cost corresponding to every intra prediction mode.

In step S450, to calculate the plurality of distortion costs corresponding to the candidate prediction modes under a plurality of transform indexes (this embodiment has four transform indexes, "0" to "3"), the first transform unit 113, the second transform unit 114, and the quantization unit 115 in the transform and quantization unit 110 may perform rate-distortion optimization (RDO) detection based on the current block and the prediction costs corresponding to the candidate prediction modes selected in step S430. The distortion cost in this embodiment is implemented as the rate-distortion cost described for sub-stage ST23 of the RDO stage ST2 in FIG. 3. In other words, for the method of calculating the distortion cost in step S450, refer to the RDO stage ST2 in FIG. 3.

In step S460, it is determined whether the transform index set in the second transform unit 114 is the last transform index (that is, transform index "3"). If it is not transform index "3", the flow proceeds from step S460 to step S470, where the transform index set in the second transform unit 114 is incremented by 1. The flow then returns to step S450 to calculate the distortion cost corresponding to each candidate prediction mode under this NSST transform index. Through steps S450 to S470, the distortion costs corresponding to the candidate prediction modes are calculated for each of the different transform indexes.

In step S480, the intra prediction unit 132 (or another component that executes step S480) selects, based on the distortion costs calculated in step S450, one of the candidate prediction modes as the prediction mode used for intra prediction of the current block.
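Steps S410 to S480 can be summarized in a short sketch. The two cost functions are passed in as hypothetical callables: `satd_cost(mode)` stands for the SATD calculation with the second transform disabled, and `rd_cost(mode, index, stored_cost)` stands for the RDO evaluation of a candidate under one transform index. Their signatures, and the way the stored cost is handed to the RDO stage, are assumptions for illustration, since the description does not define them at this level.

```python
import heapq
from typing import Callable, Dict, Iterable, Sequence, Tuple


def select_intra_mode(
    modes: Iterable[int],
    satd_cost: Callable[[int], float],
    rd_cost: Callable[[int, int, float], float],
    num_candidates: int = 3,
    transform_indexes: Sequence[int] = (0, 1, 2, 3),
) -> Tuple[int, int, float]:
    """Return (prediction mode, transform index, distortion cost) for one block."""
    # S410/S420: SATD costs are computed once, for the preset index only.
    pred_costs: Dict[int, float] = {mode: satd_cost(mode) for mode in modes}
    # S430: keep the modes with the lowest prediction costs.
    candidates = heapq.nsmallest(num_candidates, pred_costs, key=pred_costs.get)
    # S440: temporarily store the candidates' prediction costs.
    stored = {mode: pred_costs[mode] for mode in candidates}
    best = None
    # S450 to S470: evaluate every candidate under every transform index.
    for index in transform_indexes:
        for mode in candidates:
            cost = rd_cost(mode, index, stored[mode])
            if best is None or cost < best[0]:
                best = (cost, mode, index)
    if best is None:
        raise ValueError("no prediction modes or transform indexes given")
    # S480: the candidate with the lowest distortion cost is selected.
    cost, mode, index = best
    return mode, index, cost
```

In this sketch `satd_cost` is called once per prediction mode regardless of how many transform indexes are evaluated, which is the point of the flow in FIG. 4.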

Table 1 compares the video compression rate and the image quality obtained with the embodiment of the present invention. "Y", "U", and "V" in Table 1 refer to the color encoding method: "Y" represents luminance, and "U" and "V" represent chrominance and chroma, respectively.

Table 1 shows the result of comparing images that were encoded using the embodiment of the present invention and then decoded with the original pattern. The Y, U, and V values of the encoded images differ only very slightly from the original pattern, while the encoding time is shortened by 9%, which shows that the processing speed of video encoding is greatly improved.

Based on the above, when the method for selecting a prediction mode of intra prediction, the video encoding device, and the image processing apparatus described in the embodiments of the present invention select the prediction mode of intra prediction, the transform unit is first set to a preset transform index (for example, the transform unit is set to an operation mode in which the second transform unit is disabled and the residual is transformed by the first transform unit alone). The prediction cost corresponding to each prediction mode of intra prediction is then calculated based on a block of the input image, and a plurality of candidate prediction modes are selected from these prediction modes. Next, based on the block and the prediction costs corresponding to the candidate prediction modes, the candidate prediction mode with the best (for example, the lowest) distortion cost is selected as the prediction mode to be used. In other words, the embodiment does not calculate the prediction cost of each prediction mode separately for every operation mode of the transform unit (that is, for every transform index used to transform the residual); instead, it calculates the prediction cost of each prediction mode of intra prediction once, for the preset operation mode of the transform unit (that is, when the residual is transformed based on the preset transform index). These prediction costs are then combined with the transform unit transforming the residual under the different transform indexes to calculate the distortion costs, from which the prediction mode is selected among the candidates. As a result, the embodiment greatly reduces the amount of prediction cost calculation, improves the efficiency and processing speed of video encoding, and reduces the hardware implementation cost of video encoding.
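As a rough worked count using the example figures from the description (67 intra prediction modes, 4 transform indexes, 3 candidate prediction modes), the sketch below tallies cost evaluations per block. It assumes that the conventional flow repeats the full SATD pass for every transform index and that the number of RDO evaluations is the same in both flows. It counts evaluations only and is not a measurement of encoding time; the function name is hypothetical.

```python
from typing import Dict


def cost_evaluations(
    num_modes: int = 67,
    num_indexes: int = 4,
    num_candidates: int = 3,
    reuse_prediction_costs: bool = True,
) -> Dict[str, int]:
    """Count SATD and RDO cost evaluations for one block."""
    # With reuse, SATD runs once per mode; otherwise once per mode and index.
    satd = num_modes if reuse_prediction_costs else num_modes * num_indexes
    # Every candidate is still evaluated under every transform index.
    rdo = num_candidates * num_indexes
    return {"satd": satd, "rdo": rdo}


conventional = cost_evaluations(reuse_prediction_costs=False)
with_reuse = cost_evaluations(reuse_prediction_costs=True)
```

Under these assumptions the SATD evaluations drop from 268 to 67 per block, while the 12 RDO evaluations are unchanged.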

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the invention. A person skilled in the art may make minor changes and modifications without departing from the spirit and scope of the invention; the scope of protection of the invention is therefore defined by the appended claims.

100 Video encoding device
110 Transform and quantization unit
112 Transform unit
113 First transform unit
114 Second transform unit
115 Quantization unit
120 Inverse quantization and inverse transform unit
122 Inverse quantization unit
124 Inverse transform unit
130 Prediction unit
132 Intra prediction unit
134 Inter prediction unit
140, 150 Adder
160 Picture buffer
170 Entropy encoding unit
200 Image processing apparatus
210 Processor
220 Memory
S410 to S480 Steps of the method for selecting a prediction mode of intra prediction
ST1 Rough mode detection (RMD) stage
ST11 Calculate SATD costs of intra prediction
ST12 Select candidate prediction modes
ST2 Rate-distortion optimization (RDO) stage
ST21 Perform first transform
ST22 Perform second transform
ST23 Perform quantization and encoding
ST24 Select prediction mode used for current block
IM Input image
MR Residual of input image
TD, DA Data
VD Video data
N1 Input terminal of video encoding device
N2 Output terminal of inverse quantization and inverse transform unit

Claims

1. A method for selecting a prediction mode of intra prediction, comprising:
calculating, when a transform unit operates based on a preset transform index, a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction based on a block of an input image;
selecting a plurality of candidate prediction modes from the plurality of prediction modes based on the plurality of prediction costs;
calculating, based on the block and the prediction costs corresponding to the plurality of candidate prediction modes, a plurality of distortion costs corresponding to the plurality of candidate prediction modes under a plurality of transform indexes, wherein the preset transform index is one of the plurality of transform indexes; and
selecting one of the plurality of candidate prediction modes, based on the distortion costs, as the prediction mode used for the intra prediction corresponding to the block.

2. The method for selecting a prediction mode of intra prediction according to claim 1, wherein the transform unit includes a first transform unit and a second transform unit, and the second transform unit uses a non-separable secondary transform (NSST).

3. The method for selecting a prediction mode of intra prediction according to claim 2, wherein the second transform unit includes at least one transform core, the preset transform index represents an operation mode in which the transform unit transforms the residual of the block without using the second transform unit, and each transform index other than the preset transform index represents an operation mode in which the transform unit transforms the residual of the block using one of the at least one transform core of the second transform unit.

4. The method for selecting a prediction mode of intra prediction according to claim 1, wherein the plurality of prediction costs corresponding to the plurality of prediction modes of intra prediction are calculated based on the block of the input image by the sum of absolute transformed differences (SATD) method.

5. The method for selecting a prediction mode of intra prediction according to claim 1, wherein the plurality of distortion costs corresponding to the plurality of candidate prediction modes under the plurality of transform indexes are calculated by rate-distortion optimization (RDO) detection based on the block and the prediction costs corresponding to the plurality of candidate prediction modes.

6. The method for selecting a prediction mode of intra prediction according to claim 1, further comprising temporarily storing the prediction costs corresponding to the candidate prediction modes after the plurality of candidate prediction modes are selected.

7. The method for selecting a prediction mode of intra prediction according to claim 1, wherein the video encoding used for the intra prediction is the Joint Exploration Test Model (JEM), and the number of the plurality of prediction modes is greater than the number of the plurality of candidate prediction modes.

8. A video encoding device, comprising:
a transform unit configured to transform a residual corresponding to a block of an input image based on a plurality of transform indexes; and
an intra prediction unit coupled to the transform unit and configured to acquire the block of the input image and, when the transform unit operates based on a preset transform index, calculate a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction based on the block, wherein the preset transform index is one of the plurality of transform indexes,
wherein the intra prediction unit selects a plurality of candidate prediction modes from the plurality of prediction modes based on the plurality of prediction costs, calculates, based on the block and the prediction costs corresponding to the plurality of candidate prediction modes, a plurality of distortion costs corresponding to the plurality of candidate prediction modes under the plurality of transform indexes of the transform unit, and selects one of the plurality of candidate prediction modes, based on the distortion costs, as the prediction mode used for the intra prediction corresponding to the block.

9. The video encoding device according to claim 8, wherein the transform unit comprises:
a first transform unit configured to perform a first transform on the residual; and
a second transform unit configured to selectively apply a non-separable secondary transform, as a second transform, to the residual on which the first transform has been performed, to generate a transformed residual.

10. The video encoding device according to claim 9, wherein the transform unit includes at least one transform core, the preset transform index represents an operation mode in which the transform unit transforms the residual using the first transform unit without using the second transform unit, and each transform index other than the preset transform index represents an operation mode in which the transform unit transforms the residual using the first transform unit and one of the at least one transform core of the second transform unit.

11. The video encoding device according to claim 8, wherein the intra prediction unit calculates the plurality of prediction costs corresponding to the plurality of prediction modes of intra prediction based on the block by the sum of absolute transformed differences (SATD) method.

12. The video encoding device according to claim 8, wherein the intra prediction unit calculates the plurality of distortion costs corresponding to the plurality of candidate prediction modes under the plurality of transform indexes by rate-distortion optimization (RDO) detection based on the block and the prediction costs corresponding to the plurality of candidate prediction modes.

13. The video encoding device according to claim 8, wherein the intra prediction unit temporarily stores the prediction costs corresponding to the candidate prediction modes after selecting the plurality of candidate prediction modes.

14. The video encoding device according to claim 8, wherein the video encoding used by the video encoding device is the Joint Exploration Test Model (JEM), and the number of the plurality of prediction modes is greater than the number of the plurality of candidate prediction modes.

15. An image processing apparatus, comprising:
a processor; and
a memory coupled to the processor,
wherein, when transforming a residual based on a preset transform index, the processor calculates a plurality of prediction costs corresponding to a plurality of prediction modes of intra prediction based on a block of an input image, the residual corresponding to the block; the processor selects a plurality of candidate prediction modes from the plurality of prediction modes based on the plurality of prediction costs; and the processor calculates, based on the block and the prediction costs corresponding to the plurality of candidate prediction modes, a plurality of distortion costs corresponding to the plurality of candidate prediction modes under a plurality of transform indexes, wherein the preset transform index is one of the plurality of transform indexes, and
wherein the processor selects one of the plurality of candidate prediction modes, based on the distortion costs, as the prediction mode used for the intra prediction corresponding to the block.

16. The image processing apparatus according to claim 15, wherein the processor performs a first transform on the residual and performs a second transform, using a non-separable secondary transform, on the residual on which the first transform has been performed, to generate a transformed residual.

17. The image processing apparatus according to claim 16, wherein the second transform includes at least one transform core, the preset transform index represents an operation mode in which the processor transforms the residual using the first transform without using the second transform, and each transform index other than the preset transform index represents an operation mode in which the processor transforms the residual using the first transform and one of the at least one transform core of the second transform.

18. The image processing apparatus according to claim 15, wherein the processor calculates the plurality of prediction costs corresponding to the plurality of prediction modes of intra prediction based on the block by the sum of absolute transformed differences (SATD) method.

19. The image processing apparatus according to claim 15, wherein the processor calculates the plurality of distortion costs corresponding to the plurality of candidate prediction modes under the plurality of transform indexes by rate-distortion optimization (RDO) detection based on the block and the prediction costs corresponding to the plurality of candidate prediction modes.

20. The image processing apparatus according to claim 15, wherein the processor temporarily stores the prediction costs corresponding to the candidate prediction modes after selecting the plurality of candidate prediction modes, the video encoding used is the Joint Exploration Test Model (JEM), and the number of the plurality of prediction modes is greater than the number of the plurality of candidate prediction modes.