JP2018524870A

JP2018524870A - Content-adaptive application of fixed transfer functions to high dynamic range (HDR) and/or wide color gamut (WCG) video data

Info

Publication number: JP2018524870A
Application number: JP2017563203A
Authority: JP
Inventors: Dmytro Rusanovskyy; Sungwon Lee; Done Bugdayci Sansli; Joel Sole Rojals; Adarsh Krishnan Ramasubramonian; Marta Karczewicz
Original assignee: Qualcomm Incorporated
Priority date: 2015-06-08
Filing date: 2016-06-08
Publication date: 2018-08-30
Anticipated expiration: 2036-06-08
Also published as: AU2016274617A1; HUE058692T2; WO2016200969A1; TW201711471A; EP3304900A1; ES2916575T3; CN107736024A; TWI670969B; PT3304900T; US10244245B2; JP6882198B2; SI3304900T1; AU2016274617B2; CN107736024B; PL3304900T3; EP3304900B1; KR20180016383A; EP4090017A1; DK3304900T3; KR102031477B1

Abstract

The present disclosure relates to processing video data, including processing video data to conform to a high dynamic range (HDR)/wide color gamut (WCG) color container. On the encoding side, the techniques apply pre-processing to color values before application of a static transfer function and/or apply post-processing to the output from application of the static transfer function. By applying pre-processing, the examples may generate color values that linearize the output codewords when compacted into a different dynamic range by application of the static transfer function. By applying post-processing, the examples may increase the signal-to-quantization-noise ratio. The examples may apply the inverse of the encoding-side operations on the decoding side to reconstruct the color values.

Description

This application claims the benefit of U.S. Provisional Application No. 62/172,713, filed June 8, 2015, and U.S. Provisional Application No. 62/184,216, filed June 24, 2015, the entire contents of each of which are incorporated herein by reference.

The present disclosure relates to video coding.

Digital video capabilities can be incorporated into a wide range of devices, including digital televisions, digital direct broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), laptop or desktop computers, tablet computers, e-book readers, digital cameras, digital recording devices, digital media players, video gaming devices, video game consoles, cellular or satellite radio telephones, so-called "smartphones", video conferencing devices, video streaming devices, and the like. Digital video devices implement video coding techniques such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263, ITU-T H.264/MPEG-4, Part 10, Advanced Video Coding (AVC), ITU-T H.265, High Efficiency Video Coding (HEVC), and extensions of such standards. By implementing such video coding techniques, video devices may transmit, receive, encode, decode, and/or store digital video information more efficiently.

Video coding techniques include spatial (intra-picture) prediction and/or temporal (inter-picture) prediction to reduce or remove redundancy inherent in video sequences. For block-based video coding, a video slice (e.g., a video frame or a portion of a video frame) may be partitioned into video blocks, which may also be referred to as treeblocks, coding units (CUs), and/or coding nodes. Video blocks in an intra-coded (I) slice of a picture are encoded using spatial prediction with respect to reference samples in neighboring blocks in the same picture. Video blocks in an inter-coded (P or B) slice of a picture may use spatial prediction with respect to reference samples in neighboring blocks in the same picture, or temporal prediction with respect to reference samples in other reference pictures. Pictures may be referred to as frames, and reference pictures may be referred to as reference frames.

Spatial or temporal prediction results in a predictive block for a block to be coded. Residual data represents pixel differences between the original block to be coded and the predictive block. An inter-coded block is encoded according to a motion vector that points to a block of reference samples forming the predictive block, and residual data indicating the difference between the coded block and the predictive block. An intra-coded block is encoded according to an intra-coding mode and the residual data. For further compression, the residual data may be transformed from the pixel domain to a transform domain, resulting in residual transform coefficients, which may then be quantized. The quantized transform coefficients, initially arranged in a two-dimensional array, may be scanned to produce a one-dimensional vector of transform coefficients, and entropy coding may be applied to achieve even more compression.
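The residual, transform, quantization, and scan steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the HEVC algorithm: the function names (`dct_matrix`, `zigzag_indices`, `encode_block`, `decode_block`) are hypothetical, the transform is a floating-point orthonormal DCT-II rather than the integer transforms of a real codec, the scan is a simple zigzag rather than any scan order defined by a standard, and the quantizer is a uniform step `qstep`. Entropy coding is omitted.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis (illustrative stand-in for a codec's integer transform)."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] = np.sqrt(1.0 / n)
    return m

def zigzag_indices(n):
    """(row, col) pairs of an n x n block, walked along anti-diagonals in zigzag order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )

def encode_block(original, prediction, qstep):
    """Residual -> 2-D transform -> uniform quantization -> 1-D scan."""
    n = original.shape[0]
    residual = original - prediction          # pixel differences
    d = dct_matrix(n)
    coeffs = d @ residual @ d.T               # pixel domain -> transform domain
    levels = np.round(coeffs / qstep)         # quantized transform coefficients
    return [int(levels[r, c]) for r, c in zigzag_indices(n)]

def decode_block(scan, prediction, qstep):
    """Inverse of encode_block: un-scan, dequantize, inverse transform, add prediction."""
    n = prediction.shape[0]
    levels = np.zeros((n, n))
    for value, (r, c) in zip(scan, zigzag_indices(n)):
        levels[r, c] = value
    d = dct_matrix(n)
    residual = d.T @ (levels * qstep) @ d     # transform domain -> pixel domain
    return prediction + residual
```

A good prediction leaves a small residual, so most quantized levels in the scanned vector are zero, which is what makes the subsequent entropy coding effective.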

Bross et al., "High efficiency video coding (HEVC) text specification draft 10 (for FDIS & Last Call)", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 12th Meeting, Geneva, Switzerland, January 14-23, 2013, JCTVC-L1003v34, http://phenix.int-evry.fr/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1003-v34.zip

Wang et al., "High efficiency video coding (HEVC) Defect Report", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 14th Meeting, Vienna, Austria, July 25-August 2, 2013, JCTVC-N1003v1, http://phenix.int-evry.fr/jct/doc_end_user/documents/14_Vienna/wg11/JCTVC-N1003-v1.zip

ITU-T H.265, Series H: Audiovisual and Multimedia Systems, Infrastructure of audiovisual services - Coding of moving video, High efficiency video coding, Telecommunication Standardization Sector of the International Telecommunication Union (ITU), April 2013

The present disclosure relates to processing video data, including processing video data to conform to a high dynamic range (HDR)/wide color gamut (WCG) color container. As described in more detail below, on the encoding side the techniques of this disclosure apply pre-processing to color values before application of a static transfer function and/or apply post-processing to the output from application of the static transfer function. By applying pre-processing, the examples may generate color values that linearize the output codewords when compacted into a different dynamic range by application of the static transfer function. By applying post-processing, the examples may increase the signal-to-quantization-noise ratio. The examples may apply the inverse of the encoding-side operations on the decoding side to reconstruct the color values.
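The encoder-side and decoder-side chain summarized above can be sketched as follows. This is a rough illustration under stated assumptions, not the method of the disclosure: the static transfer function is taken to be the PQ curve with the constants published in SMPTE ST 2084, and both the pre-processing and the post-processing are assumed to be a simple linear scale and offset. The names `scale1`, `offset1`, `scale2`, `offset2` (the latter two echoing the Scale2 and Offset2 labels in the figure descriptions), `fit_post_params`, `encode`, and `decode` are hypothetical, and color conversion and chroma handling are omitted.

```python
import numpy as np

# PQ constants as published in SMPTE ST 2084.
M1 = 2610.0 / 16384.0
M2 = 2523.0 / 4096.0 * 128.0
C1 = 3424.0 / 4096.0
C2 = 2413.0 / 4096.0 * 32.0
C3 = 2392.0 / 4096.0 * 32.0

def pq_inverse_eotf(linear):
    """Static TF: normalized linear light in [0, 1] -> nonlinear value in [0, 1]."""
    p = np.power(np.clip(linear, 0.0, 1.0), M1)
    return np.power((C1 + C2 * p) / (1.0 + C3 * p), M2)

def pq_eotf(nonlinear):
    """Inverse static TF: nonlinear value in [0, 1] -> normalized linear light."""
    p = np.power(np.clip(nonlinear, 0.0, 1.0), 1.0 / M2)
    return np.power(np.maximum(p - C1, 0.0) / (C2 - C3 * p), 1.0 / M1)

def fit_post_params(linear, scale1=1.0, offset1=0.0):
    """Hypothetical helper: pick scale2/offset2 so the content fills the codeword range."""
    s = pq_inverse_eotf(np.clip(scale1 * linear + offset1, 0.0, 1.0))
    lo, hi = float(s.min()), float(s.max())
    if hi == lo:                      # flat content: nothing to stretch
        return 1.0, 0.0
    scale2 = 1.0 / (hi - lo)
    return scale2, -lo * scale2

def encode(linear, scale1, offset1, scale2, offset2, bit_depth=10):
    """Pre-process -> fixed TF -> post-process -> quantize to integer codewords."""
    pre = np.clip(scale1 * linear + offset1, 0.0, 1.0)   # content-adaptive pre-processing
    s = pq_inverse_eotf(pre)                             # TF itself never adapts
    s2 = np.clip(scale2 * s + offset2, 0.0, 1.0)         # content-adaptive post-processing
    return np.round(s2 * (2 ** bit_depth - 1)).astype(np.int64)

def decode(codewords, scale1, offset1, scale2, offset2, bit_depth=10):
    """Inverse post-process -> inverse fixed TF -> inverse pre-process."""
    s2 = codewords / (2 ** bit_depth - 1)
    s = (s2 - offset2) / scale2
    pre = pq_eotf(s)
    return (pre - offset1) / scale1

# Usage: content occupying only the darkest 1% of the normalized linear range.
x = np.linspace(0.0, 0.01, 4096)
scale2, offset2 = fit_post_params(x)
codes = encode(x, 1.0, 0.0, scale2, offset2)
restored = decode(codes, 1.0, 0.0, scale2, offset2)
```

In this sketch only the scale and offset values depend on the content, so a decoder needs those parameters in addition to the codewords, while the transfer function stays fixed. Stretching the content over more codewords is what reduces the quantization error in the round trip.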

In one example, this disclosure describes a method of video processing, the method comprising: receiving a first plurality of codewords that represent compacted color values of video data, the compacted color values representing colors in a first dynamic range; uncompacting a second plurality of codewords based on the first plurality of codewords, using an inverse static transfer function that is non-adaptive to the video data, to generate uncompacted color values, the uncompacted color values representing colors in a second, different dynamic range, wherein the second plurality of codewords is one of the inverse post-processed first plurality of codewords or codewords from the first plurality of codewords; at least one of inverse post-processing the first plurality of codewords to generate the second plurality of codewords, or inverse pre-processing the uncompacted color values; and outputting the uncompacted color values or the inverse pre-processed uncompacted color values.

In one example, this disclosure describes a method of video processing, the method comprising: receiving a plurality of color values of video data representing colors in a first dynamic range; compacting the color values, using a static transfer function that is non-adaptive to the video data being compacted, to generate a plurality of codewords that represent compacted color values, the compacted color values representing colors in a second, different dynamic range; at least one of pre-processing the color values prior to compacting to generate the color values that are compacted, or post-processing the codewords resulting from compacting the color values; and outputting color values based on one of the compacted color values or the post-processed compacted color values.

In one example, this disclosure describes a device for video processing, the device comprising a video data memory configured to store video data, and a video postprocessor comprising at least one of fixed-function or programmable circuitry. The video postprocessor is configured to: receive, from the video data memory, a first plurality of codewords that represent compacted color values of video data, the compacted color values representing colors in a first dynamic range; uncompact a second plurality of codewords based on the first plurality of codewords, using an inverse static transfer function that is non-adaptive to the video data, to generate uncompacted color values, the uncompacted color values representing colors in a second, different dynamic range, wherein the second plurality of codewords is one of the inverse post-processed first plurality of codewords or codewords from the first plurality of codewords; at least one of inverse post-process the first plurality of codewords to generate the second plurality of codewords, or inverse pre-process the uncompacted color values; and output the uncompacted color values or the inverse pre-processed uncompacted color values.

In one example, this disclosure describes a device for video processing, the device comprising a video data memory configured to store video data, and a video preprocessor comprising at least one of fixed-function or programmable circuitry. The video preprocessor is configured to: receive, from the video data memory, a plurality of color values of video data representing colors in a first dynamic range; compact the color values, using a static transfer function that is non-adaptive to the video data being compacted, to generate a plurality of codewords that represent compacted color values, the compacted color values representing colors in a second, different dynamic range; at least one of pre-process the color values prior to compacting to generate the color values that are compacted, or post-process the codewords resulting from compacting the color values; and output color values based on one of the compacted color values or the post-processed compacted color values.

In one example, this disclosure describes a computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device for video processing to: receive a first plurality of codewords that represent compacted color values of video data, the compacted color values representing colors in a first dynamic range; uncompact a second plurality of codewords based on the first plurality of codewords, using an inverse static transfer function that is non-adaptive to the video data, to generate uncompacted color values, the uncompacted color values representing colors in a second, different dynamic range, wherein the second plurality of codewords is one of the inverse post-processed first plurality of codewords or codewords from the first plurality of codewords; at least one of inverse post-process the first plurality of codewords to generate the second plurality of codewords, or inverse pre-process the uncompacted color values; and output the uncompacted color values or the inverse pre-processed uncompacted color values.

In one example, this disclosure describes a device for video processing, the device comprising: means for receiving a first plurality of codewords that represent compacted color values of video data, the compacted color values representing colors in a first dynamic range; means for uncompacting a second plurality of codewords based on the first plurality of codewords, using an inverse static transfer function that is non-adaptive to the video data, to generate uncompacted color values, the uncompacted color values representing colors in a second, different dynamic range, wherein the second plurality of codewords is one of the inverse post-processed first plurality of codewords or codewords from the first plurality of codewords; at least one of means for inverse post-processing the first plurality of codewords to generate the second plurality of codewords, or means for inverse pre-processing the uncompacted color values; and means for outputting the uncompacted color values or the inverse pre-processed uncompacted color values.

The details of one or more examples are set forth in the accompanying drawings and the description below. Other features, objects, and advantages will be apparent from the description, drawings, and claims.

A block diagram illustrating an example video encoding and decoding system configured to implement the techniques of this disclosure.
A diagram illustrating the concept of high dynamic range (HDR) data.
A conceptual diagram comparing the color gamuts of video signals of high definition television (HDTV) (BT.709) and ultra high definition television (UHDTV) (BT.2020).
A conceptual diagram illustrating HDR/WCG representation conversion.
A conceptual diagram illustrating HDR/WCG inverse conversion.
A conceptual diagram illustrating example transfer functions.
A conceptual diagram illustrating a visualization of output versus input for the PQ TF (ST2084 EOTF).
A conceptual diagram illustrating a content-adaptive HDR processing pipeline (encoder side) with an adaptive-shape transfer function (TF).
A conceptual diagram illustrating a content-adaptive HDR processing pipeline (encoder side) with a fixed TF.
A conceptual diagram illustrating a content-adaptive HDR processing pipeline (decoder side) with a fixed TF.
A conceptual diagram illustrating a content-adaptive HDR pipeline with a static TF, encoder side.
A conceptual diagram illustrating a content-adaptive HDR pipeline with a static TF, decoder side.
A diagram illustrating an example histogram of a linear light signal (red color component) of an HDR signal with the PQ TF (ST2084) overlaid for visualization.
A diagram illustrating an example histogram of a nonlinear signal resulting from applying the PQ TF (ST2084) to a linear light signal (red color component).
A diagram illustrating the histogram output of a nonlinear signal produced by the PQ TF and processing in accordance with the techniques described in this disclosure.
A diagram illustrating the effect of the PQ TF on signal statistics after pre-processing.
A diagram illustrating a histogram of the output nonlinear signal produced by the PQ TF.
A diagram illustrating a histogram of a normalized nonlinear signal S.
A diagram illustrating a histogram of a normalized nonlinear signal S2 after post-processing.
A diagram illustrating the nonlinear PQ TF for HDR as defined in ST2084.
A diagram illustrating the linear transfer function y=x using the techniques described in this disclosure, modeled with Scale2=1 and Offset2=0.
A diagram illustrating a linear transfer function using the techniques described in this disclosure, modeled with Scale2=1.5 and Offset=-0.25.
A conceptual diagram illustrating an example in which color conversion is performed after the transfer function is applied and before the post-processing techniques described in this disclosure.
A conceptual diagram illustrating an example in which inverse color conversion is performed after the inverse post-processing techniques described in this disclosure and before the inverse transfer function is applied.
A diagram illustrating post-processing with clipping to a range, with respect to a histogram of S2.
Another diagram illustrating post-processing with clipping to a range, with respect to a histogram of S2.
Another diagram illustrating post-processing with clipping to a range, with respect to a histogram of S2.
A diagram illustrating a histogram of codewords after the post-processing techniques described in this disclosure with tail handling.
A diagram illustrating a histogram of codewords after the post-processing techniques described in this disclosure with tail handling.
A conceptual diagram illustrating another example of a content-adaptive HDR pipeline with a static TF, encoder side.
A conceptual diagram illustrating another example of a content-adaptive HDR pipeline with a static TF, decoder side.
A diagram illustrating a histogram with two reserved code values for handling color values.
A diagram illustrating a parametric adaptive function implemented by the techniques described in this disclosure.
A diagram illustrating a parametric adaptive function implemented by the techniques described in this disclosure.
A diagram illustrating post-processing with a piecewise linear transfer function, implemented by applying the techniques described in this disclosure to an input signal, and the effect of this post-processing on the histogram of the output signal.
A diagram illustrating post-processing with a piecewise linear transfer function, implemented by applying the techniques described in this disclosure to an input signal, and the effect of this post-processing on the histogram of the output signal.
A diagram illustrating post-processing with a piecewise linear transfer function, implemented by applying the techniques described in this disclosure to an input signal, and the effect of this post-processing on the histogram of the output signal.
A diagram illustrating, as one example, a hybrid log-gamma transfer function and a potential target range.
A diagram illustrating a transfer function in which the slope of the parabola around the knee point is adjustable.
A flowchart illustrating an example method of video processing in a content-adaptive high dynamic range (HDR) system.
A flowchart illustrating another example method of video processing in a content-adaptive high dynamic range (HDR) system.

The present disclosure relates to the field of coding video signals with high dynamic range (HDR) and wide color gamut (WCG) representations. More specifically, the techniques of this disclosure include signaling and operations applied to video data in certain color spaces to enable more efficient compression of HDR and WCG video data. The proposed techniques may improve the compression efficiency of hybrid-based video coding systems (e.g., HEVC-based video coders) used to code HDR and WCG video data.

Video coding standards, including hybrid-based video coding standards, include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) and Multiview Video Coding (MVC) extensions. The design of a new video coding standard, namely HEVC, has been finalized by the Joint Collaborative Team on Video Coding (JCT-VC) of the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Motion Picture Experts Group (MPEG). An HEVC draft specification referred to as HEVC Working Draft 10 (WD10), Bross et al., "High efficiency video coding (HEVC) text specification draft 10 (for FDIS & Last Call)", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 12th Meeting, Geneva, Switzerland, January 14-23, 2013, JCTVC-L1003v34, is available from http://phenix.int-evry.fr/jct/doc_end_user/documents/12_Geneva/wg11/JCTVC-L1003-v34.zip. The finalized HEVC standard is referred to as HEVC version 1.

A defect report, Wang et al., "High efficiency video coding (HEVC) Defect Report", Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11, 14th Meeting, Vienna, Austria, July 25-August 2, 2013, JCTVC-N1003v1, is available from http://phenix.int-evry.fr/jct/doc_end_user/documents/14_Vienna/wg11/JCTVC-N1003-v1.zip. The finalized HEVC standard document was published as ITU-T H.265, Series H: Audiovisual and Multimedia Systems, Infrastructure of audiovisual services - Coding of moving video, High efficiency video coding, Telecommunication Standardization Sector of the International Telecommunication Union (ITU), April 2013, and another version was published in October 2014.

FIG. 1 is a block diagram illustrating an example video encoding and decoding system 10 that may utilize the techniques of this disclosure. As shown in FIG. 1, system 10 includes a source device 12 that provides encoded video data to be decoded at a later time by a destination device 14. In particular, source device 12 provides the video data to destination device 14 via a computer-readable medium 16. Source device 12 and destination device 14 may comprise any of a wide range of devices, including desktop computers, notebook (i.e., laptop) computers, tablet computers, set-top boxes, telephone handsets such as so-called "smart" phones, so-called "smart" pads, televisions, cameras, display devices, digital media players, video game consoles, video streaming devices, and the like. In some cases, source device 12 and destination device 14 may be equipped for wireless communication.

Destination device 14 may receive the encoded video data to be decoded via computer-readable medium 16. Computer-readable medium 16 may comprise any type of medium or device capable of moving the encoded video data from source device 12 to destination device 14. In one example, computer-readable medium 16 may comprise a communication medium that enables source device 12 to transmit encoded video data directly to destination device 14 in real time. The encoded video data may be modulated according to a communication standard, such as a wireless communication protocol, and transmitted to destination device 14. The communication medium may comprise any wireless or wired communication medium, such as a radio frequency (RF) spectrum or one or more physical transmission lines. The communication medium may form part of a packet-based network, such as a local area network, a wide area network, or a global network such as the Internet. The communication medium may include routers, switches, base stations, or any other equipment that may be useful for facilitating communication from source device 12 to destination device 14.

In some examples, encoded data may be output from output interface 22 to a storage device. Similarly, encoded data may be accessed from the storage device by an input interface. The storage device may include any of a variety of distributed or locally accessed data storage media, such as a hard drive, Blu-ray® discs, DVDs, CD-ROMs, flash memory, volatile or non-volatile memory, or any other suitable digital storage media for storing encoded video data. In a further example, the storage device may correspond to a file server or another intermediate storage device that may store the encoded video generated by source device 12. Destination device 14 may access stored video data from the storage device via streaming or download. The file server may be any type of server capable of storing encoded video data and transmitting that encoded video data to destination device 14. Example file servers include a web server (e.g., for a website), an FTP server, network attached storage (NAS) devices, or a local disk drive. Destination device 14 may access the encoded video data through any standard data connection, including an Internet connection. This may include a wireless channel (e.g., a Wi-Fi connection), a wired connection (e.g., DSL, cable modem, etc.), or a combination of both that is suitable for accessing encoded video data stored on a file server. The transmission of encoded video data from the storage device may be a streaming transmission, a download transmission, or a combination thereof.

The techniques of this disclosure are not necessarily limited to wireless applications or settings. The techniques may be applied to video coding in support of any of a variety of multimedia applications, such as over-the-air television broadcasts, cable television transmissions, satellite television transmissions, Internet streaming video transmissions such as dynamic adaptive streaming over HTTP (DASH), digital video that is encoded onto a data storage medium, decoding of digital video stored on a data storage medium, or other applications. In some examples, system 10 may be configured to support one-way or two-way video transmission to support applications such as video streaming, video playback, video broadcasting, and/or video telephony.

In the example of FIG. 1, source device 12 includes video source 18, video encoding unit 21, which includes video preprocessor 19 and video encoder 20, and output interface 22. Destination device 14 includes input interface 28, video decoding unit 29, which includes video decoder 30 and video postprocessor 31, and display device 32. In accordance with this disclosure, video preprocessor 19 and video postprocessor 31 may be configured to apply the example techniques described in this disclosure. For example, video preprocessor 19 and video postprocessor 31 may include a static transfer function unit configured to apply a static transfer function, together with pre-processing and post-processing units that can adapt signal characteristics.

In other examples, a source device and a destination device may include other components or arrangements. For example, source device 12 may receive video data from an external video source 18, such as an external camera. Likewise, destination device 14 may interface with an external display device, rather than including an integrated display device.

The illustrated system 10 of FIG. 1 is merely one example. Techniques for processing video data may be performed by any digital video encoding and/or decoding device. Although the techniques of this disclosure are generally performed by a video encoding device, the techniques may also be performed by a video encoder/decoder, typically referred to as a "codec." For ease of description, this disclosure is described with respect to video preprocessor 19 and video postprocessor 31 performing the example techniques described in this disclosure in respective ones of source device 12 and destination device 14. Source device 12 and destination device 14 are merely examples of such coding devices, in which source device 12 generates coded video data for transmission to destination device 14. In some examples, devices 12, 14 may operate in a substantially symmetrical manner such that each of devices 12, 14 includes video encoding and decoding components. Hence, system 10 may support one-way or two-way video transmission between video devices 12, 14, e.g., for video streaming, video playback, video broadcasting, or video telephony.

Video source 18 of source device 12 may include a video capture device, such as a video camera, a video archive containing previously captured video, and/or a video feed interface to receive video from a video content provider. As a further alternative, video source 18 may generate computer graphics-based data as the source video, or a combination of live video, archived video, and computer-generated video. In some cases, if video source 18 is a video camera, source device 12 and destination device 14 may form so-called camera phones or video phones. As mentioned above, however, the techniques described in this disclosure may be applicable to video coding in general, and may be applied to wireless and/or wired applications. In each case, the captured, pre-captured, or computer-generated video may be encoded by video encoding unit 21. The encoded video information may then be output by output interface 22 onto computer-readable medium 16.

Computer-readable medium 16 may include transient media, such as a wireless broadcast or wired network transmission, or storage media (that is, non-transitory storage media), such as a hard disk, flash drive, compact disc, digital video disc, Blu-ray® disc, or other computer-readable media. In some examples, a network server (not shown) may receive encoded video data from source device 12 and provide the encoded video data to destination device 14, e.g., via network transmission. Similarly, a computing device of a medium production facility, such as a disc stamping facility, may receive encoded video data from source device 12 and produce a disc containing the encoded video data. Therefore, computer-readable medium 16 may be understood to include one or more computer-readable media of various forms, in various examples.

Input interface 28 of destination device 14 receives information from computer-readable medium 16. The information of computer-readable medium 16 may include syntax information defined by video encoder 20 of video encoding unit 21, which is also used by video decoder 30 of video decoding unit 29. The syntax information includes syntax elements that describe characteristics and/or processing of blocks and other coded units, e.g., groups of pictures (GOPs). Display device 32 displays the decoded video data to a user, and may comprise any of a variety of display devices such as a cathode ray tube (CRT), a liquid crystal display (LCD), a plasma display, an organic light emitting diode (OLED) display, or another type of display device.

As illustrated, video preprocessor 19 receives the video data from video source 18. Video preprocessor 19 may be configured to process the video data to convert it into a form that is suitable for encoding with video encoder 20. For example, video preprocessor 19 may perform dynamic range compaction (e.g., using a non-linear transfer function), color conversion to a more compact or robust color space, and/or floating-point-to-integer representation conversion. Video encoder 20 may perform video encoding on the video data output by video preprocessor 19. Video decoder 30 may perform the inverse of video encoder 20 to decode the video data, and video postprocessor 31 may perform the inverse of video preprocessor 19 to convert the video data into a form suitable for display. For instance, video postprocessor 31 may perform integer-to-floating-point conversion, color conversion from the compact or robust color space, and/or the inverse of the dynamic range compaction to generate video data suitable for display.
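The preprocessing and postprocessing chain described above can be sketched as follows. This is a minimal illustration only, under two assumptions that the passage does not make: the non-linear transfer function is taken to be the SMPTE ST 2084 perceptual quantizer (PQ), and the floating-point-to-integer step is taken to be the narrow-range luma quantization commonly used with 10-bit video. The function names are hypothetical.

```python
import math

# SMPTE ST 2084 (PQ) constants; PQ is an assumed example of a static
# non-linear transfer function, not one mandated by the text above.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0


def pq_encode(nits: float) -> float:
    """Dynamic range compaction: linear light (cd/m^2) -> signal in [0, 1]."""
    y = max(nits, 0.0) / PEAK_NITS
    yp = y ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2


def pq_decode(signal: float) -> float:
    """Inverse of pq_encode: signal in [0, 1] -> linear light (cd/m^2)."""
    sp = signal ** (1.0 / M2)
    y = (max(sp - C1, 0.0) / (C2 - C3 * sp)) ** (1.0 / M1)
    return y * PEAK_NITS


def to_codeword(signal: float, bit_depth: int = 10) -> int:
    """Floating-point-to-integer conversion (assumed narrow-range luma)."""
    scale = 1 << (bit_depth - 8)
    code = math.floor(scale * (219.0 * signal + 16.0) + 0.5)
    return min(max(code, 0), (1 << bit_depth) - 1)


def from_codeword(code: int, bit_depth: int = 10) -> float:
    """Integer-to-floating-point conversion, the inverse of to_codeword."""
    scale = 1 << (bit_depth - 8)
    return (code / scale - 16.0) / 219.0


if __name__ == "__main__":
    for nits in (0.1, 100.0, 1000.0, 10000.0):
        code = to_codeword(pq_encode(nits))
        restored = pq_decode(from_codeword(code))
        print(f"{nits:>8} cd/m^2 -> codeword {code} -> {restored:.3f} cd/m^2")
```

The postprocessor side simply applies the inverse steps in reverse order, which is the relationship the paragraph describes between video postprocessor 31 and video preprocessor 19. The color conversion step is omitted to keep the sketch short.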

Video encoding unit 21 and video decoding unit 29 each may be implemented as any of a variety of fixed-function and programmable circuitry, such as one or more microprocessors, digital signal processors (DSPs), application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), discrete logic, software, hardware, firmware, or any combinations thereof. When the techniques are implemented partially in software, a device may store instructions for the software in a suitable, non-transitory computer-readable medium and execute the instructions in hardware using one or more processors to perform the techniques of this disclosure. Each of video encoding unit 21 and video decoding unit 29 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined encoder/decoder (codec) in a respective device.

Although video preprocessor 19 and video encoder 20 are illustrated as separate units within video encoding unit 21, and video postprocessor 31 and video decoder 30 are illustrated as separate units within video decoding unit 29, the techniques described in this disclosure are not so limited. Video preprocessor 19 and video encoder 20 may be formed as a common device (e.g., the same integrated circuit, or housed within the same chip or chip package). Similarly, video postprocessor 31 and video decoder 30 may be formed as a common device (e.g., the same integrated circuit, or housed within the same chip or chip package).

In some examples, video encoder 20 and video decoder 30 operate according to a video compression standard, such as ISO/IEC MPEG-4 Visual and ITU-T H.264 (also known as ISO/IEC MPEG-4 AVC), including its Scalable Video Coding (SVC) extension, Multi-view Video Coding (MVC) extension, and MVC-based three-dimensional video (3DV) extension. In some instances, any bitstream conforming to MVC-based 3DV always contains a sub-bitstream that is compliant with an MVC profile, e.g., the stereo high profile. Furthermore, there is an ongoing effort to generate a 3DV coding extension to H.264/AVC, namely AVC-based 3DV. Other examples of video coding standards include ITU-T H.261, ISO/IEC MPEG-1 Visual, ITU-T H.262 or ISO/IEC MPEG-2 Visual, ITU-T H.263, ISO/IEC MPEG-4 Visual, and ITU-T H.264 (ISO/IEC MPEG-4 AVC). In other examples, video encoder 20 and video decoder 30 may be configured to operate according to the ITU-T H.265 (HEVC) standard.

In HEVC and other video coding standards, a video sequence typically includes a series of pictures. Pictures may also be referred to as "frames." A picture may include three sample arrays, denoted S_L, S_Cb, and S_Cr. S_L is a two-dimensional array (i.e., a block) of luma samples. S_Cb is a two-dimensional array of Cb chrominance samples. S_Cr is a two-dimensional array of Cr chrominance samples. Chrominance samples may also be referred to herein as "chroma" samples. In other instances, a picture may be monochrome and may include only an array of luma samples.

Video encoder 20 may generate a set of coding tree units (CTUs). Each of the CTUs may comprise a coding tree block of luma samples, two corresponding coding tree blocks of chroma samples, and syntax structures used to code the samples of the coding tree blocks. In a monochrome picture, or a picture that has three separate color planes, a CTU may comprise a single coding tree block and syntax structures used to code the samples of the coding tree block. A coding tree block may be an N×N block of samples. A CTU may also be referred to as a "tree block" or a "largest coding unit" (LCU). The CTUs of HEVC may be broadly analogous to the macroblocks of other video coding standards, such as H.264/AVC. However, a CTU is not necessarily limited to a particular size and may include one or more coding units (CUs). A slice may include an integer number of CTUs ordered consecutively in raster scan order.
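As a rough illustration of how a picture is covered by N×N coding tree blocks and how CTUs are ordered in raster scan, consider the sketch below. The CTU size of 64 and the helper names are assumptions chosen for the example; the passage above does not fix a size.

```python
def ctu_grid(width: int, height: int, ctu_size: int = 64) -> tuple:
    """Return (columns, rows) of CTUs needed to cover a picture.

    Partial CTUs at the right and bottom edges still count as one CTU,
    hence the ceiling division.
    """
    cols = (width + ctu_size - 1) // ctu_size
    rows = (height + ctu_size - 1) // ctu_size
    return cols, rows


def ctu_raster_address(x: int, y: int, width: int, ctu_size: int = 64) -> int:
    """Raster-scan index of the CTU containing luma sample (x, y)."""
    cols = (width + ctu_size - 1) // ctu_size
    return (y // ctu_size) * cols + (x // ctu_size)


if __name__ == "__main__":
    cols, rows = ctu_grid(1920, 1080)
    print(cols, rows, cols * rows)
    print(ctu_raster_address(64, 64, 1920))
```

A slice, as described above, would then correspond to a run of consecutive raster addresses.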

This disclosure may use the term "video unit" or "video block" to refer to one or more blocks of samples and syntax structures used to code samples of the one or more blocks of samples. Example types of video units may include CTUs, CUs, PUs, and transform units (TUs) in HEVC, or macroblocks, macroblock partitions, and so on in other video coding standards.

Video encoder 20 may partition a coding block of a CU into one or more prediction blocks. A prediction block may be a rectangular (i.e., square or non-square) block of samples to which the same prediction is applied. A prediction unit (PU) of a CU may comprise a prediction block of luma samples, two corresponding prediction blocks of chroma samples of a picture, and syntax structures used to predict the prediction block samples. In a monochrome picture, or a picture that has three separate color planes, a PU may comprise a single prediction block and syntax structures used to predict the prediction block samples. Video encoder 20 may generate predictive luma, Cb, and Cr blocks for the luma, Cb, and Cr prediction blocks of each PU of the CU.

Video encoder 20 may use intra prediction or inter prediction to generate the predictive blocks for a PU. If video encoder 20 uses intra prediction to generate the predictive blocks of a PU, video encoder 20 may generate the predictive blocks of the PU based on decoded samples of the picture associated with the PU.

After video encoder 20 generates predictive luma, Cb, and Cr blocks for one or more PUs of a CU, video encoder 20 may generate a luma residual block for the CU. Each sample in the CU's luma residual block indicates a difference between a luma sample in one of the CU's predictive luma blocks and a corresponding sample in the CU's original luma coding block. In addition, video encoder 20 may generate a Cb residual block for the CU. Each sample in the CU's Cb residual block may indicate a difference between a Cb sample in one of the CU's predictive Cb blocks and a corresponding sample in the CU's original Cb coding block. Video encoder 20 may also generate a Cr residual block for the CU. Each sample in the CU's Cr residual block may indicate a difference between a Cr sample in one of the CU's predictive Cr blocks and a corresponding sample in the CU's original Cr coding block.
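The residual computation described above amounts to a sample-wise difference. A minimal sketch, assuming blocks are represented as nested lists of integers and using the conventional sign (original minus prediction), which the passage does not spell out:

```python
def residual_block(original, predicted):
    """Sample-wise difference between an original block and its prediction.

    Both arguments are 2-D lists of equal shape; the same routine would be
    applied separately to the luma, Cb, and Cr blocks of a CU.
    """
    return [
        [o - p for o, p in zip(orig_row, pred_row)]
        for orig_row, pred_row in zip(original, predicted)
    ]


if __name__ == "__main__":
    original = [[52, 55], [61, 59]]
    predicted = [[50, 56], [60, 60]]
    print(residual_block(original, predicted))
```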

Furthermore, video encoder 20 may use quad-tree partitioning to decompose the luma, Cb, and Cr residual blocks of a CU into one or more luma, Cb, and Cr transform blocks. A transform block may be a rectangular block of samples to which the same transform is applied. A transform unit (TU) of a CU may comprise a transform block of luma samples, two corresponding transform blocks of chroma samples, and syntax structures used to transform the transform block samples. In a monochrome picture, or a picture that has three separate color planes, a TU may comprise a single transform block and syntax structures used to transform the transform block samples. Thus, each TU of a CU may be associated with a luma transform block, a Cb transform block, and a Cr transform block. The luma transform block associated with the TU may be a sub-block of the CU's luma residual block. The Cb transform block may be a sub-block of the CU's Cb residual block. The Cr transform block may be a sub-block of the CU's Cr residual block.

Video encoder 20 may apply one or more transforms to a luma transform block of a TU to generate a luma coefficient block for the TU. A coefficient block may be a two-dimensional array of transform coefficients. A transform coefficient may be a scalar quantity. Video encoder 20 may apply one or more transforms to a Cb transform block of a TU to generate a Cb coefficient block for the TU. Video encoder 20 may apply one or more transforms to a Cr transform block of a TU to generate a Cr coefficient block for the TU.

After generating a coefficient block (e.g., a luma coefficient block, a Cb coefficient block, or a Cr coefficient block), video encoder 20 may quantize the coefficient block. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the transform coefficients, providing further compression. Furthermore, video encoder 20 may inverse quantize the transform coefficients and apply an inverse transform to the transform coefficients in order to reconstruct the transform blocks of the TUs of the CUs of a picture. Video encoder 20 may use the reconstructed transform blocks of the TUs of a CU and the predictive blocks of the PUs of the CU to reconstruct the coding blocks of the CU. By reconstructing the coding blocks of each CU of a picture, video encoder 20 may reconstruct the picture. Video encoder 20 may store reconstructed pictures in a decoded picture buffer (DPB). Video encoder 20 may use the reconstructed pictures in the DPB for inter prediction and intra prediction.
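To make the quantization and inverse quantization steps concrete, the sketch below uses a plain uniform scalar quantizer with a single step size. This is a deliberate simplification and an assumption of this example: practical codecs such as HEVC derive the step from a quantization parameter and use integer scaling, none of which is modeled here.

```python
def quantize(coeffs, step):
    """Map each transform coefficient to an integer level (round half away from zero)."""
    def level(c):
        magnitude = int(abs(c) / step + 0.5)
        return magnitude if c >= 0 else -magnitude
    return [[level(c) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Inverse quantization: scale integer levels back to coefficient values."""
    return [[lv * step for lv in row] for row in levels]


if __name__ == "__main__":
    coeffs = [[100, -37], [5, 0]]
    levels = quantize(coeffs, step=10)
    print(levels)                     # fewer distinct values to entropy code
    print(dequantize(levels, 10))     # approximation used for reconstruction
```

The difference between the dequantized values and the original coefficients is the quantization error; the encoder reconstructs pictures from the dequantized values so that its reference pictures match what a decoder will produce.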

After video encoder 20 quantizes a coefficient block, video encoder 20 may entropy encode syntax elements that indicate the quantized transform coefficients. For example, video encoder 20 may perform context-adaptive binary arithmetic coding (CABAC) on the syntax elements indicating the quantized transform coefficients. Video encoder 20 may output the entropy-encoded syntax elements in a bitstream.

Video encoder 20 may output a bitstream that includes a sequence of bits that forms a representation of coded pictures and associated data. The bitstream may comprise a sequence of network abstraction layer (NAL) units. Each of the NAL units includes a NAL unit header and encapsulates a raw byte sequence payload (RBSP). The NAL unit header may include a syntax element that indicates a NAL unit type code. The NAL unit type code specified by the NAL unit header of a NAL unit indicates the type of the NAL unit. An RBSP may be a syntax structure containing an integer number of bytes that is encapsulated within a NAL unit. In some instances, an RBSP includes zero bits.
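As an illustration of a NAL unit header carrying a NAL unit type code, the sketch below parses the two-byte header layout used by HEVC (ITU-T H.265). Choosing HEVC here is an assumption of the example, since the disclosure also contemplates other standards, and the function names are hypothetical.

```python
def parse_hevc_nal_header(header: bytes) -> dict:
    """Split a two-byte HEVC NAL unit header into its four fields."""
    if len(header) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    b0, b1 = header[0], header[1]
    return {
        "forbidden_zero_bit": b0 >> 7,                   # 1 bit
        "nal_unit_type": (b0 >> 1) & 0x3F,               # 6 bits
        "nuh_layer_id": ((b0 & 0x01) << 5) | (b1 >> 3),  # 6 bits
        "nuh_temporal_id_plus1": b1 & 0x07,              # 3 bits
    }


def is_vcl(nal_unit_type: int) -> bool:
    """In HEVC, NAL unit types 0..31 are video coding layer (VCL) types."""
    return nal_unit_type < 32


if __name__ == "__main__":
    # 0x44 0x01 is the header of a picture parameter set (type 34) in HEVC.
    fields = parse_hevc_nal_header(bytes([0x44, 0x01]))
    print(fields, is_vcl(fields["nal_unit_type"]))
```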

Different types of NAL units may encapsulate different types of RBSPs. For example, a first type of NAL unit may encapsulate an RBSP for a picture parameter set (PPS), a second type of NAL unit may encapsulate an RBSP for a coded slice, a third type of NAL unit may encapsulate an RBSP for supplemental enhancement information (SEI), and so on. A PPS is a syntax structure that may contain syntax elements that apply to zero or more entire coded pictures. NAL units that encapsulate RBSPs for video coding data (as opposed to RBSPs for parameter sets and SEI messages) may be referred to as video coding layer (VCL) NAL units. A NAL unit that encapsulates a coded slice may be referred to herein as a coded slice NAL unit. An RBSP for a coded slice may include a slice header and slice data.

Video decoder 30 may receive a bitstream. In addition, video decoder 30 may parse the bitstream to decode syntax elements from the bitstream. Video decoder 30 may reconstruct the pictures of the video data based at least in part on the syntax elements decoded from the bitstream. The process to reconstruct the video data may be generally reciprocal to the process performed by video encoder 20. For instance, video decoder 30 may use motion vectors of PUs to determine predictive blocks for the PUs of a current CU. Video decoder 30 may use the motion vector or motion vectors of a PU to generate the predictive blocks for the PU.

In addition, video decoder 30 may inverse quantize coefficient blocks associated with TUs of the current CU. Video decoder 30 may perform inverse transforms on the coefficient blocks to reconstruct the transform blocks associated with the TUs of the current CU. Video decoder 30 may reconstruct the coding blocks of the current CU by adding the samples of the predictive sample blocks for PUs of the current CU to corresponding samples of the transform blocks of the TUs of the current CU. By reconstructing the coding blocks for each CU of a picture, video decoder 30 may reconstruct the picture. Video decoder 30 may store decoded pictures in a decoded picture buffer for output and/or for use in decoding other pictures.
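The final reconstruction step described above, adding predictive samples to the corresponding samples of the reconstructed transform (residual) blocks, can be sketched as below. Clipping the result to the valid sample range for an assumed 10-bit depth is an addition of this example and is not stated in the passage.

```python
def reconstruct_block(predicted, residual, bit_depth: int = 10):
    """Add residual samples to predicted samples and clip to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(predicted, residual)
    ]


if __name__ == "__main__":
    predicted = [[500, 1020], [3, 64]]
    residual = [[-4, 10], [-8, 0]]
    print(reconstruct_block(predicted, residual))
```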

Next-generation video applications are anticipated to operate with video data representing captured scenery with HDR (high dynamic range) and WCG (wide color gamut). The parameters of the utilized dynamic range and color gamut are two independent attributes of video content, and their specifications for purposes of digital television and multimedia services are defined by several international standards. For example, ITU-R Rec. 709 defines parameters for HDTV (high definition television), such as standard dynamic range (SDR) and standard color gamut, and ITU-R Rec. 2020 specifies UHDTV (ultra-high definition television) parameters such as HDR and WCG. There are also other standards developing organization (SDO) documents that specify dynamic range and color gamut attributes in other systems; e.g., the P3 color gamut is defined in SMPTE-231-2 (SMPTE being the Society of Motion Picture and Television Engineers), and some parameters of HDR are defined in SMPTE-2084. A brief description of dynamic range and color gamut for video data is provided below.

Dynamic range is typically defined as the ratio between the minimum and maximum brightness of the video signal. Dynamic range may also be measured in terms of "f-stops," where one f-stop corresponds to a doubling of the signal dynamic range. In MPEG's definition, HDR content is content that features brightness variation of more than 16 f-stops. In some terminology, levels between 10 and 16 f-stops are considered an intermediate dynamic range, although under other definitions they may be considered HDR. In some examples, HDR video content may be any video content that has a higher dynamic range than traditionally used video content with a standard dynamic range (e.g., video content as specified by ITU-R Rec. BT.709). At the same time, the human visual system (HVS) is capable of perceiving a much larger dynamic range. However, the HVS includes an adaptation mechanism that narrows the so-called simultaneous range. A visualization of the dynamic range provided by the SDR of HDTV, the expected HDR of UHDTV, and the HVS dynamic range is shown in FIG. 2.

Current video applications and services are regulated by Rec. 709 and provide SDR, typically supporting a range of brightness (i.e., luminance) of around 0.1 to 100 candelas (cd) per m² (often referred to as "nits"), leading to fewer than 10 f-stops. Next-generation video services are expected to provide a dynamic range of up to 16 f-stops. Although detailed specifications are currently under development, some initial parameters have been specified in SMPTE-2084 and Rec. 2020.
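The f-stop figures above can be checked with a short calculation. The following is a minimal Python sketch, assuming only the definition given earlier (one f-stop per doubling of the luminance ratio); the function name `f_stops` and the 10000 cd/m² peak used for the second example are illustrative assumptions, not part of the disclosure.

```python
import math

def f_stops(l_min: float, l_max: float) -> float:
    """Dynamic range in f-stops: one stop per doubling of the luminance ratio."""
    if l_min <= 0 or l_max <= l_min:
        raise ValueError("require 0 < l_min < l_max")
    return math.log2(l_max / l_min)

# SDR range quoted in the text: about 0.1 to 100 cd/m^2.
sdr_stops = f_stops(0.1, 100.0)

# Illustrative assumption: black level needed for 16 stops under a 10000 cd/m^2 peak.
hdr_min = 10000.0 / 2 ** 16

print(f"SDR: {sdr_stops:.2f} f-stops; 16 stops below 10000 cd/m^2 starts at {hdr_min:.4f} cd/m^2")
```

The SDR result falls just under 10 f-stops, which is consistent with the "fewer than 10 f-stops" statement above.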

Another aspect of a more realistic video experience, besides HDR, is the color dimension, which is conventionally defined by the color gamut. FIG. 3 is a conceptual diagram showing an SDR color gamut (the triangle based on the BT.709 red, green, and blue color primaries) and the wider color gamut for UHDTV (the triangle based on the BT.2020 red, green, and blue color primaries). FIG. 3 also depicts the so-called spectrum locus (delimited by the tongue-shaped area), representing the limits of natural colors. As illustrated by FIG. 3, moving from BT.709 to BT.2020 color primaries aims to provide UHDTV services with about 70% more colors. D65 specifies the white color for given specifications (e.g., the BT.709 and/or BT.2020 specifications).

A few examples of color gamut specifications are shown in Table 1.

As can be seen in Table 1, a color gamut may be defined by the X and Y values of a white point and by the X and Y values of the primary colors (e.g., red (R), green (G), and blue (B)). The X and Y values represent the chromaticity (X) and the brightness (Y) of the colors, as defined by the CIE 1931 color space. The CIE 1931 color space defines the links between pure colors (e.g., in terms of wavelengths) and how the human eye perceives such colors.

HDR/WCG video data is typically acquired and stored at a very high precision per component (even floating point), with the 4:4:4 chroma format and a very wide color space (e.g., the CIE 1931 XYZ color space). This representation targets high precision and is (almost) mathematically lossless. However, this format may contain a great deal of redundancy and is not optimal for compression purposes. A lower-precision format with HVS-based assumptions is typically utilized for state-of-the-art video applications.

A typical video data format conversion for purposes of compression consists of three major processes, as shown in FIG. 4. The techniques of FIG. 4 may be performed by video preprocessor 19. Linear RGB data 110 may be HDR/WCG video data and may be stored in a floating-point representation. Linear RGB data 110 may be compacted using a non-linear transfer function (TF) for dynamic range compaction. For example, video preprocessor 19 may include a transfer function (TF) unit 112 configured to use a non-linear transfer function for dynamic range compaction.

The output of TF unit 112 may be a set of codewords, where each codeword represents a range of color values (e.g., illumination levels). Dynamic range compaction means that the dynamic range of linear RGB data 110 may be a first dynamic range (e.g., the human vision range as illustrated in FIG. 2), and the dynamic range of the resulting codewords may be a second dynamic range (e.g., the HDR display range as illustrated in FIG. 2). Therefore, the codewords capture a smaller dynamic range than linear RGB data 110, and hence TF unit 112 performs dynamic range compaction.

TF unit 112 performs a non-linear function in the sense that the mapping between the codewords and the input color values is not equally spaced (e.g., the codewords are non-linear codewords). Non-linear codewords means that a change in the input color values does not appear as a linearly proportional change in the output codewords, but as a non-linear change in the codewords. For example, if the color values represent low illumination, small changes in the input color values result in small changes in the codewords output by TF unit 112. However, if the color values represent high illumination, relatively large changes in the input color values are needed for small changes in the codewords. The range of illumination represented by each codeword is not constant (e.g., a first codeword is the same for a first range of illuminations, a second codeword is the same for a second range of illuminations, and the first and second ranges are different). FIG. 7, described below, illustrates the characteristic of the transfer function applied by TF unit 112.

As described in more detail below, the techniques may scale and offset the linear RGB data 110 that TF unit 112 receives, and/or scale and offset the codewords that TF unit 112 outputs, to make better use of the codeword space. TF unit 112 may compact linear RGB data 110 (or scaled and offset RGB data) using any number of non-linear transfer functions (e.g., the PQ (perceptual quantizer) TF as defined in SMPTE-2084).

In some examples, color conversion unit 114 converts the compacted data into a more compact or robust color space (e.g., a YUV or YCbCr color space) that is more suitable for compression by video encoder 20. As described in more detail below, in some examples, before color conversion unit 114 performs color conversion, the techniques may scale and offset the codewords output by the application of the TF by TF unit 112. Color conversion unit 114 may receive these scaled and offset codewords. In some examples, some of the scaled and offset codewords may be greater than or less than respective thresholds; for these, the techniques may assign respective set codewords.

This data is then quantized using a floating-to-integer representation conversion (e.g., via quantization unit 116) to produce the video data (e.g., HDR data 118) that is transmitted to video encoder 20 to be encoded. In this example, HDR data 118 is in an integer representation. HDR data 118 may now be in a format more suitable for compression by video encoder 20. It should be understood that the order of the processes depicted in FIG. 4 is given as an example and may vary in other applications. For example, color conversion may precede the TF process. In addition, video preprocessor 19 may apply more processing (e.g., spatial subsampling) to the color components.

Accordingly, in FIG. 4, the high dynamic range of input RGB data 110 in linear and floating-point representation is compacted with the non-linear transfer function utilized by TF unit 112 (e.g., the PQ TF as defined in SMPTE-2084), then converted to a target color space more suitable for compression, e.g., YCbCr (e.g., by color conversion unit 114), and then quantized to achieve an integer representation (e.g., by quantization unit 116). The order of these elements is given as an example and may vary in real-world applications; e.g., color conversion may precede the TF module (e.g., TF unit 112). Additional processing, such as spatial subsampling, may be applied to the color components before TF unit 112 applies the transfer function.

The inverse conversion at the decoder side is depicted in FIG. 5. The techniques of FIG. 5 may be performed by video postprocessor 31. For example, video postprocessor 31 receives video data (e.g., HDR data 120) from video decoder 30; inverse quantization unit 122 may inverse quantize the data, inverse color conversion unit 124 performs inverse color conversion, and inverse non-linear transfer function unit 126 performs the inverse non-linear transfer to produce linear RGB data 128.

The inverse color conversion process that inverse color conversion unit 124 performs may be the inverse of the color conversion process that color conversion unit 114 performed. For example, inverse color conversion unit 124 may convert the HDR data from a YCbCr format back to an RGB format. Inverse transfer function unit 126 may apply the inverse transfer function to the data to restore the dynamic range that was compacted by TF unit 112 and recreate linear RGB data 128.

In the example techniques described in this disclosure, video postprocessor 31 may apply inverse post-processing before inverse transfer function unit 126 performs the inverse transfer function, and may apply inverse pre-processing after inverse transfer function unit 126 performs the inverse transfer function. For example, as described above, in some examples video preprocessor 19 may apply pre-processing (e.g., scaling and offsetting) before TF unit 112 and may apply post-processing (e.g., scaling and offsetting) after TF unit 112. To compensate for the pre- and post-processing, video postprocessor 31 may apply the inverse post-processing before inverse TF unit 126 performs the inverse transfer function and the inverse pre-processing after inverse TF unit 126 performs the inverse transfer function. Applying both pre- and post-processing (and both inverse post- and inverse pre-processing) is optional. In some examples, video preprocessor 19 may apply one, but not both, of pre- and post-processing, and for such examples video postprocessor 31 may apply the inverse of the processing applied by video preprocessor 19.

The example video preprocessor 19 illustrated in FIG. 4 is described in further detail below, with the understanding that the example video postprocessor 31 illustrated in FIG. 5 generally performs the reciprocal operations. A transfer function is applied to the data (e.g., HDR/WCG RGB video data) to compact its dynamic range and make it possible to represent the data with a limited number of bits. The limited number of bits that represent the data are referred to as codewords. This function is typically a one-dimensional (1D) non-linear function that either reflects the inverse of the electro-optical transfer function (EOTF) of the end-user display, as specified for SDR in Rec. 709, or approximates the HVS perception of brightness changes, as for the PQ TF specified in SMPTE-2084 for HDR. The inverse process of the OETF (opto-electronic transfer function), e.g., as performed by video postprocessor 31, is the EOTF, which maps the code levels back to luminance. FIG. 6 shows several examples of non-linear TFs. These transfer functions may also be applied to each R, G, and B component separately.

In the context of this disclosure, the terms "signal value" or "color value" may be used to describe a luminance level corresponding to the value of a specific color component (such as R, G, B, or Y) for an image element. The signal value is typically representative of a linear light level (luminance value). The terms "code level," "digital code value," or "codeword" may refer to a digital representation of an image signal value. Typically, such a digital representation is representative of a non-linear signal value. An EOTF represents the relationship between the non-linear signal values provided to a display device (e.g., display device 32) and the linear color values produced by the display device.

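The ST2084 specification defines the EOTF application as follows. The TF is applied to normalized linear R, G, B values, which results in the non-linear representation R'G'B'. ST2084 defines normalization by NORM = 10000, which is associated with a peak brightness of 10000 nits (cd/m²):

$$R' = \mathrm{PQ\_TF}(\max(0, \min(R/\mathrm{NORM}, 1)))$$
$$G' = \mathrm{PQ\_TF}(\max(0, \min(G/\mathrm{NORM}, 1))) \qquad (1)$$
$$B' = \mathrm{PQ\_TF}(\max(0, \min(B/\mathrm{NORM}, 1)))$$

with

$$\mathrm{PQ\_TF}(L) = \left(\frac{c_1 + c_2 L^{m_1}}{1 + c_3 L^{m_1}}\right)^{m_2}$$

where $m_1 = \frac{2610}{4096} \times \frac{1}{4} = 0.1593017578125$, $m_2 = \frac{2523}{4096} \times 128 = 78.84375$, $c_1 = c_3 - c_2 + 1 = \frac{3424}{4096} = 0.8359375$, $c_2 = \frac{2413}{4096} \times 32 = 18.8515625$, and $c_3 = \frac{2392}{4096} \times 32 = 18.6875$.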

The PQ EOTF is visualized in FIG. 7, with input values (linear color values) normalized to the range 0 to 1 on the x-axis and normalized output values (non-linear color values) on the y-axis. As can be seen from the curve in FIG. 7, 1 percent (low illumination) of the dynamic range of the input signal is converted to 50% of the dynamic range of the output signal.

There is a finite number of codewords that can be used to represent the input linear color values. FIG. 7 illustrates that, for the PQ EOTF, approximately 50% of the available codewords are dedicated to low-illumination input signals, leaving fewer available codewords for higher-illumination input signals. Therefore, slight changes in relatively high-illumination input signals may not be captured, because there are insufficient codewords to represent these slight changes. Conversely, there may be an unnecessarily large number of available codewords for low-illumination input signals, meaning that even the very slightest change in relatively low-illumination input signals can be represented by different codewords. Accordingly, there may be a large distribution of available codewords for low-illumination input signals and a relatively small distribution of available codewords for high-illumination input signals.
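The codeword distribution described above can be reproduced numerically. The following Python sketch assumes the commonly published SMPTE ST 2084 constants and a full-range 10-bit codeword space; the names `pq_tf`, `codes_low`, and `codes_high` are illustrative and not taken from the disclosure.

```python
# SMPTE ST 2084 (PQ) constants as commonly published.
M1 = 2610 / 16384          # 0.1593017578125
M2 = 2523 / 4096 * 128     # 78.84375
C1 = 3424 / 4096           # 0.8359375
C2 = 2413 / 4096 * 32      # 18.8515625
C3 = 2392 / 4096 * 32      # 18.6875

def pq_tf(l: float) -> float:
    """Map a normalized linear value in [0, 1] (1.0 = 10000 cd/m^2) to a non-linear value in [0, 1]."""
    l = min(max(l, 0.0), 1.0)
    lm = l ** M1
    return ((C1 + C2 * lm) / (1.0 + C3 * lm)) ** M2

# Output level reached by the lowest 1% of the normalized input range.
low_light_share = pq_tf(0.01)

# Illustrative assumption: full-range 10-bit codewords (0..1023).
codes_low = round(low_light_share * 1023)   # codewords covering inputs up to 1%
codes_high = 1023 - codes_low               # codewords left for the remaining 99%

print(f"share={low_light_share:.3f} low={codes_low} high={codes_high}")
```

Under these assumptions, roughly half of the codewords are spent on the lowest 1% of the input range, which matches the reading of FIG. 7 given above.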

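Typically, an EOTF is defined as a function with floating-point accuracy. Thus, no error is introduced into a signal with this non-linearity if the inverse TF, i.e., the so-called OETF, is applied. The inverse TF (OETF) specified in ST2084 is defined as the inverse PQ function:

$$R = 10000 \cdot \mathrm{inversePQ\_TF}(R')$$
$$G = 10000 \cdot \mathrm{inversePQ\_TF}(G') \qquad (2)$$
$$B = 10000 \cdot \mathrm{inversePQ\_TF}(B')$$

with

$$\mathrm{inversePQ\_TF}(N) = \left(\frac{\max\left[\left(N^{1/m_2} - c_1\right), 0\right]}{c_2 - c_3 N^{1/m_2}}\right)^{1/m_1}$$

where $m_1$, $m_2$, $c_1$, $c_2$, and $c_3$ are the same constants as for the PQ TF.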

With floating-point accuracy, sequential application of the EOTF and OETF provides a perfect reconstruction without errors. However, this representation is not optimal for streaming or broadcasting services. A more compact representation of the non-linear R'G'B' data, with fixed bit accuracy, is described below. Note that EOTFs and OETFs are currently a subject of very active research, and the TF utilized in some HDR video coding systems may differ from ST2084.
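The contrast between floating-point round trips and fixed bit accuracy can be illustrated with a small Python sketch. It assumes the commonly published ST 2084 constants and, purely for illustration, a full-range 10-bit quantizer (`quantize10`); the sample values and helper names are hypothetical.

```python
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_tf(l: float) -> float:
    """Forward PQ: normalized linear value in [0, 1] to non-linear value in [0, 1]."""
    lm = min(max(l, 0.0), 1.0) ** M1
    return ((C1 + C2 * lm) / (1.0 + C3 * lm)) ** M2

def inverse_pq_tf(n: float) -> float:
    """Inverse PQ: non-linear value in [0, 1] back to a normalized linear value."""
    np_ = min(max(n, 0.0), 1.0) ** (1.0 / M2)
    return (max(np_ - C1, 0.0) / (C2 - C3 * np_)) ** (1.0 / M1)

def quantize10(n: float) -> float:
    """Illustrative full-range 10-bit quantizer (an assumption, not the scheme of the disclosure)."""
    return round(n * 1023) / 1023

samples = [0.0001, 0.01, 0.1, 0.5, 1.0]  # normalized linear values

# Floating-point round trip: error stays at the level of rounding noise.
float_err = max(abs(inverse_pq_tf(pq_tf(x)) - x) for x in samples)

# Round trip through a 10-bit codeword: a measurable error appears.
quant_err = max(abs(inverse_pq_tf(quantize10(pq_tf(x))) - x) for x in samples)

print(f"float round-trip error: {float_err:.2e}, 10-bit round-trip error: {quant_err:.2e}")
```

The floating-point path is lossless up to numerical noise, while inserting a fixed-bit-depth step between the two functions introduces the quantization error that the rest of the description is concerned with.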

Other transfer functions are also under consideration for HDR. Examples include the Philips TF and the BBC hybrid "log-gamma" TF. The BBC hybrid "log-gamma" TF is based on the Rec. 709 TF for SDR backward compatibility. To extend the dynamic range, the BBC hybrid adds a third part to the Rec. 709 curve at higher input luminance. The new portion of the curve is a logarithmic function.

The OETF defined in Rec. 709 is

$$V = \begin{cases} 1.099\,L^{0.45} - 0.099, & 1 \ge L \ge 0.018 \\ 4.500\,L, & 0.018 > L \ge 0 \end{cases}$$

where L is the luminance of the image, 0 ≤ L ≤ 1, and V is the corresponding electrical signal. In Rec. 2020, the same equation is specified as

$$E' = \begin{cases} 4.5\,E, & 0 \le E < \beta \\ \alpha E^{0.45} - (\alpha - 1), & \beta \le E \le 1 \end{cases}$$

where E is the voltage normalized by the reference white level and proportional to the implicit light intensity that would be detected with a reference camera color channel R, G, B, and E' is the resulting non-linear signal.
For a 10-bit system, α = 1.099 and β = 0.018.
For a 12-bit system, α = 1.0993 and β = 0.0181.
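As a rough illustration, the two-segment OETF can be written as a short Python function. The sketch assumes the piecewise form of a linear segment 4.5·E below β and a power-law segment α·E^0.45 − (α − 1) from β upward; the function name `rec2020_oetf` and its default arguments are illustrative choices.

```python
def rec2020_oetf(e: float, alpha: float = 1.099, beta: float = 0.018) -> float:
    """Piecewise OETF: linear near black, power law elsewhere.

    Defaults are the 10-bit constants quoted in the text; pass
    alpha=1.0993, beta=0.0181 for the 12-bit variant.
    """
    if not 0.0 <= e <= 1.0:
        raise ValueError("E must be normalized to [0, 1]")
    if e < beta:
        return 4.5 * e                          # linear segment
    return alpha * e ** 0.45 - (alpha - 1.0)    # power-law (gamma) segment

# The two segments should nearly agree at the junction E = beta.
junction_gap_10bit = abs(4.5 * 0.018 - rec2020_oetf(0.018))
junction_gap_12bit = abs(4.5 * 0.0181 - rec2020_oetf(0.0181, 1.0993, 0.0181))

print(rec2020_oetf(0.0), rec2020_oetf(0.5), rec2020_oetf(1.0))
print(junction_gap_10bit, junction_gap_12bit)
```

The small residual gap at the junction reflects the rounding of α and β to three or four decimal places.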

Although not explicitly stated, α and β are the solution to the following simultaneous equations:

$$4.5\beta = \alpha\beta^{0.45} - (\alpha - 1)$$
$$4.5 = 0.45\,\alpha\beta^{-0.55}$$

The first equation equates the values of the linear function and the gamma function at E = β, and the second equates the slopes of the two functions at E = β.
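The value-matching and slope-matching conditions can be solved numerically. In the Python sketch below, the slope condition 4.5 = 0.45·α·β^(−0.55) is rearranged to α = 10·β^0.55 and substituted into the value condition, which reduces to 5.5·β − 10·β^0.55 + 1 = 0; this is then solved by bisection. The bracket [0.01, 0.05] and the function name are assumptions made for illustration.

```python
def solve_alpha_beta(tol: float = 1e-12):
    """Solve 4.5*b = a*b**0.45 - (a - 1) and 4.5 = 0.45*a*b**-0.55 for (a, b)."""
    # After substituting a = 10 * b**0.55 (from the slope condition):
    def f(b: float) -> float:
        return 5.5 * b - 10.0 * b ** 0.55 + 1.0

    lo, hi = 0.01, 0.05          # assumed bracket: f(lo) > 0 > f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    alpha = 10.0 * beta ** 0.55
    return alpha, beta

alpha, beta = solve_alpha_beta()
print(f"alpha = {alpha:.5f}, beta = {beta:.5f}")
```

The solution rounds to the 12-bit constants quoted above (α = 1.0993, β = 0.0181), and rounding further gives the 10-bit constants.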

The additional part of the hybrid "log-gamma" TF is shown below, and FIG. 27 illustrates the OETF. For example, FIG. 27 shows a hybrid log-gamma transfer function and the potential target range as an example.

In FIG. 27, the respective curves are marked with A, B, C, D, and E to match the respective legends in the lower right box of the graph. In particular, A corresponds to the 10-bit PQ-EOTF, B corresponds to the 8-bit BT.709 EOTF, C corresponds to the 10-bit BT.709 EOTF, D corresponds to the 10-bit BBC hybrid log-gamma EOTF, and E corresponds to the 12-bit PQ-EOTF.

FIG. 28 illustrates a transfer function in which the slope of the parabola around the knee point is adjustable. FIG. 28 is illustrated merely as one example, with a slope adjustment example of 5000 cd/m². However, other examples of transfer functions with an adjustable slope are possible.

RGB data is typically utilized as input, since it is produced by image capturing sensors. However, this color space has high redundancy among its components and is not optimal for a compact representation. To achieve a more compact and more robust representation, RGB components are typically converted to a less correlated color space that is more suitable for compression, e.g., YCbCr (i.e., a color transform is performed). This color space separates the brightness, in the form of luminance, and the color information into different uncorrelated components.

For modern video coding systems, a typically used color space is YCbCr, as specified in ITU-R BT.709 or ITU-R BT.2020. The YCbCr color space in the BT.709 standard specifies the following conversion process from R'G'B' to Y'CbCr (non-constant luminance representation):

$$Y' = 0.2126\,R' + 0.7152\,G' + 0.0722\,B'$$
$$Cb = \frac{B' - Y'}{1.8556} \qquad (3)$$
$$Cr = \frac{R' - Y'}{1.5748}$$

The above can also be implemented using the following approximate conversion that avoids the division for the Cb and Cr components:

$$Y' = 0.212600\,R' + 0.715200\,G' + 0.072200\,B'$$
$$Cb = -0.114572\,R' - 0.385428\,G' + 0.500000\,B' \qquad (4)$$
$$Cr = 0.500000\,R' - 0.454153\,G' - 0.045847\,B'$$

The ITU-R BT.2020 standard specifies the following conversion process from R'G'B' to Y'CbCr (non-constant luminance representation):

$$Y' = 0.2627\,R' + 0.6780\,G' + 0.0593\,B'$$
$$Cb = \frac{B' - Y'}{1.8814} \qquad (5)$$
$$Cr = \frac{R' - Y'}{1.4746}$$

The above can also be implemented using the following approximate conversion that avoids the division for the Cb and Cr components:

$$Y' = 0.262700\,R' + 0.678000\,G' + 0.059300\,B'$$
$$Cb = -0.139630\,R' - 0.360370\,G' + 0.500000\,B' \qquad (6)$$
$$Cr = 0.500000\,R' - 0.459786\,G' - 0.040214\,B'$$

It should be noted that both color spaces remain normalized. Therefore, for input values normalized to the range 0 to 1, the resulting values are mapped to the range 0 to 1. Generally, color transforms implemented with floating-point accuracy provide perfect reconstruction; thus, this process is lossless.
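The relation between the division-based and the approximate matrix forms can be sketched in Python. The sketch assumes the usual non-constant-luminance construction from two luma weights, here called `kr` and `kb` (hypothetical parameter names), where the Cb and Cr divisors equal 2·(1 − kb) and 2·(1 − kr). With this construction Cb and Cr are centered on zero and span −0.5 to 0.5.

```python
def rgb_to_ycbcr(r, g, b, kr, kb):
    """Non-constant-luminance R'G'B' -> Y'CbCr using luma weights kr, kb (kg = 1 - kr - kb)."""
    kg = 1.0 - kr - kb
    y = kr * r + kg * g + kb * b
    cb = (b - y) / (2.0 * (1.0 - kb))   # 1.8556 for BT.709, 1.8814 for BT.2020
    cr = (r - y) / (2.0 * (1.0 - kr))   # 1.5748 for BT.709, 1.4746 for BT.2020
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr, kr, kb):
    """Inverse transform; exact up to floating-point rounding."""
    kg = 1.0 - kr - kb
    r = y + 2.0 * (1.0 - kr) * cr
    b = y + 2.0 * (1.0 - kb) * cb
    g = (y - kr * r - kb * b) / kg
    return r, g, b

BT709 = (0.2126, 0.0722)    # (kr, kb)
BT2020 = (0.2627, 0.0593)

# Pure red shows the first column of the approximate matrices directly.
y709, cb709, cr709 = rgb_to_ycbcr(1.0, 0.0, 0.0, *BT709)
y2020, cb2020, cr2020 = rgb_to_ycbcr(1.0, 0.0, 0.0, *BT2020)

# Round trip on an arbitrary sample.
rgb = (0.25, 0.5, 0.75)
rgb_rec = ycbcr_to_rgb(*rgb_to_ycbcr(*rgb, *BT2020), *BT2020)
print(cb709, cb2020, rgb_rec)
```

Feeding pure red reproduces the R' coefficients of the approximate conversions (−0.114572 and −0.139630 for Cb, 0.5 for Cr), and the round trip returns the input to within rounding error, in line with the lossless property noted above.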

With regard to quantization and/or fixed-point conversion, the processing stages described above are typically implemented in a floating-point accuracy representation, and thus they may be considered lossless. However, this type of accuracy can be considered redundant and expensive for most consumer electronics applications. For such services, input data in the target color space is converted to a target bit depth with fixed-point accuracy. Certain studies show that 10 to 12 bits of accuracy in combination with the PQ TF is sufficient to provide HDR data of 16 f-stops with distortion below the just-noticeable difference. Data represented with 10-bit accuracy can be further coded with most state-of-the-art video coding solutions. This conversion process includes signal quantization, is an element of lossy coding, and is a source of inaccuracy introduced into the converted data.

An example of such quantization applied to codewords in the target color space, YCbCr in this example, is shown below. Input values YCbCr represented in floating-point accuracy are converted into a signal of fixed bit depth, BitDepth_Y for the Y value and BitDepth_C for the chroma values (Cb, Cr):

$$D_{Y'} = \mathrm{Clip1}_Y(\mathrm{Round}((1 \ll (\mathrm{BitDepth}_Y - 8)) \cdot (219 \cdot Y' + 16)))$$
$$D_{Cb} = \mathrm{Clip1}_C(\mathrm{Round}((1 \ll (\mathrm{BitDepth}_C - 8)) \cdot (224 \cdot Cb + 128))) \qquad (7)$$
$$D_{Cr} = \mathrm{Clip1}_C(\mathrm{Round}((1 \ll (\mathrm{BitDepth}_C - 8)) \cdot (224 \cdot Cr + 128)))$$

with
Round(x) = Sign(x) * Floor(Abs(x) + 0.5),
Sign(x) = -1 if x < 0, 0 if x = 0, 1 if x > 0,
Floor(x) = the largest integer less than or equal to x,
Abs(x) = x if x >= 0, -x if x < 0,
Clip1_Y(x) = Clip3(0, (1 << BitDepth_Y) - 1, x),
Clip1_C(x) = Clip3(0, (1 << BitDepth_C) - 1, x),
Clip3(x, y, z) = x if z < x, y if z > y, z otherwise.
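The quantization of equation (7) translates almost directly into code. The following Python sketch follows the definitions listed above, taking Clip3 to return its third argument when that argument lies within the bounds (the usual clamp behavior); the function names and the default 10-bit depths are illustrative assumptions.

```python
import math

def sign(x: float) -> int:
    return (x > 0) - (x < 0)

def round_half_away(x: float) -> int:
    """Round(x) = Sign(x) * Floor(Abs(x) + 0.5)."""
    return sign(x) * math.floor(abs(x) + 0.5)

def clip3(lo: int, hi: int, z: int) -> int:
    """Clamp z to [lo, hi]."""
    return lo if z < lo else hi if z > hi else z

def quantize_ycbcr(y: float, cb: float, cr: float,
                   bit_depth_y: int = 10, bit_depth_c: int = 10):
    """Equation (7): Y' in [0, 1], Cb/Cr in [-0.5, 0.5] -> fixed bit-depth integers."""
    max_y = (1 << bit_depth_y) - 1
    max_c = (1 << bit_depth_c) - 1
    scale_y = 1 << (bit_depth_y - 8)   # bit depths of at least 8 are assumed
    scale_c = 1 << (bit_depth_c - 8)
    d_y = clip3(0, max_y, round_half_away(scale_y * (219 * y + 16)))
    d_cb = clip3(0, max_c, round_half_away(scale_c * (224 * cb + 128)))
    d_cr = clip3(0, max_c, round_half_away(scale_c * (224 * cr + 128)))
    return d_y, d_cb, d_cr

white = quantize_ycbcr(1.0, 0.0, 0.0)     # peak luma, neutral chroma
black = quantize_ycbcr(0.0, 0.0, 0.0)     # minimum luma, neutral chroma
extremes = quantize_ycbcr(0.5, -0.5, 0.5) # chroma at both ends of its range
print(white, black, extremes)
```

At 10 bits the luma codes run from 64 to 940 and the chroma codes from 64 to 960, so part of the nominal 0 to 1023 codeword space is never used by in-range input.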

Some techniques may not be well suited for HDR and WCG video data. Most of the currently published EOTFs for HDR video systems, such as PQ, Philips, and BBC, are static, content-independent 1D transfer functions that are applied to the R, G, and B components independently and do not take into account the spatio-temporal statistics or local brightness level of the HDR video. In terms of the use of such an EOTF in an HDR video coding system, such an approach leads to bit allocation that is not optimal for the provided visual quality of the HDR video content.

A second problem of the currently utilized HDR processing pipeline is the static conversion of non-linear codewords in the target color space from a floating-point accuracy representation to a fixed bit-depth representation. Typically, the codeword space is fixed, and temporally and spatially varying HDR content does not obtain an optimal representation. The following provides more details on these two problems.

With respect to the non-optimality of a static 1D EOTF, the EOTF defined in ST2084 specifies a static, content-independent 1D transfer function, denoted the PQ TF, which is presumably based on the perceptual sensitivity of the human visual system (HVS) at a particular level of brightness (a number of cd/m²). Despite the fact that most studies define the HVS perceptual sensitivity to brightness in cd/m², the PQ TF is applied to each of the color values R, G, and B independently, as in Equation 1 (e.g., to determine R', G', and B'), and does not utilize the brightness intensity of the RGB pixel. This results in a possible inaccuracy of the PQ TF in approximating HVS sensitivity in the non-linearity of the resulting R'G'B'; e.g., a combination of R'G'B' color values may result in a different level of brightness, which should have been associated with another PQ TF value compared to what was applied to each of the R, G, and B components independently.

Moreover, due to this design intent, the PQ TF combines the sensitivity of the HVS in two modes: so-called photopic (daytime) vision and scotopic (so-called night) vision. The latter comes into play when the brightness of the scene is below 0.03 cd/m² and features much higher sensitivity at the cost of reduced color perception. To enable high sensitivity, the utilized PQ TF provides a larger number of codewords to low illumination values, as reflected in FIG. 7. Such a codeword distribution may be optimal if the brightness of the picture is low and the HVS would operate in night-vision mode; for example, it may be optimal if the codewords need to be sensitive to minor changes in brightness. However, typical HDR pictures may feature bright scenery together with dark, noisy fragments. These fragments would not affect visual quality, because they are masked by the nearest bright samples, but they would greatly contribute to the bitrate with the current static TF.

Regarding inefficient utilization of the codeword space, quantization of the nonlinear codewords (Y', Cb, Cr), represented in floating-point accuracy, and their representation with a fixed number of bits (D_Y', D_Cb, D_Cr), as shown in Equation (7), is the major compression tool of the HDR pipeline. Typically, the dynamic range of the input signal before quantization is at most 1.0: within 0 to 1 for the Y component and within -0.5 to 0.5 for the Cb and Cr components.

However, the actual distribution of the HDR signal varies from frame to frame. Quantization as shown in Equation (7) therefore does not provide the minimal quantization error, and it can be improved by adjusting the dynamic range of the HDR signal to match the expected range equal to 1.0.

To address the problems described above, the following solution may be considered. The EOTF may be defined as a dynamic, content-adaptive transfer function whose shape changes according to content properties, e.g., at the frame level, as illustrated in FIG. 8. FIG. 8 illustrates a content-adaptive HDR processing pipeline (encoder side) with an adaptive-shape TF unit 112'. That is, FIG. 8 illustrates another example of video preprocessor 19, one that includes an adaptive-shape TF function used by the TF unit.

The components in FIG. 8 generally conform to the components in FIG. 4. In the example of FIG. 8, components having the same reference numerals as in the example of FIG. 4 are the same. However, whereas TF unit 112 applies a static TF in FIG. 4, adaptive TF unit 112' applies an adaptive TF in FIG. 8. In this example, the manner in which adaptive TF unit 112' performs data compaction is adaptive and can change based on the video content. Although the corresponding video postprocessor 31 is not illustrated, such an example video postprocessor 31 would perform an adaptive inverse transfer function rather than the static inverse transfer function of inverse transfer function unit 126 of FIG. 5. Video preprocessor 19 would output information indicating the parameters that TF unit 112' uses to adapt the TF. Video postprocessor 31 would receive this information indicating the parameters used by TF unit 112' and adapt the adaptive inverse transfer function accordingly.

However, this technique may require extensive signaling of the transfer function adaptation parameters, as well as an implementation that supports this adaptivity, e.g., storing multiple lookup tables or implementation branches. Moreover, certain aspects of the non-optimality of the PQ EOTF may be resolved through a 3D transfer function. Such approaches may be too expensive for certain implementations and in terms of signaling cost.

In this disclosure, signaling refers to the output of syntax elements or other video data that is used to decode or otherwise reconstruct the video data. The signaled parameters may be stored for later retrieval by destination device 14 or may be transmitted directly to destination device 14.

This disclosure describes a content-adaptive HDR video system that employs a static, fixed transfer function (TF). It describes keeping a static TF in the pipeline while adapting the signal characteristics to the fixed processing flow. This can be achieved by adaptive processing of either the signal that is to be processed by the TF or the signal that results from application of the TF. Certain techniques may combine both of these adaptation mechanisms. At the decoder (e.g., video postprocessor 31), an adaptive process inverse to the one applied at the encoder side (e.g., video preprocessor 19) would be applied.

FIG. 9 is a conceptual diagram showing a content-adaptive HDR processing pipeline (encoder side) with a fixed TF. As illustrated, video preprocessor 19 includes pre-processing unit 134, also referred to as dynamic range adjustment (DRA1), TF unit 112, post-processing unit 138, also referred to as DRA2, color conversion unit 114, and quantization unit 116.

In the illustrated example, video preprocessor 19 may be configured as fixed-function and programmable circuitry. For example, video preprocessor 19 may include transistors, capacitors, inductors, passive and active components, arithmetic logic units (ALUs), elementary function units (EFUs), and the like that together or separately form pre-processing unit 134, TF unit 112, post-processing unit 138, color conversion unit 114, and quantization unit 116. In some examples, video preprocessor 19 includes a programmable core that executes instructions that cause pre-processing unit 134, TF unit 112, post-processing unit 138, color conversion unit 114, and quantization unit 116 to perform their respective functions. In such examples, video data memory 132, or some other memory, may store the instructions that are executed by video preprocessor 19.

Video data memory 132 is also illustrated in FIG. 9 for ease of understanding. For instance, video data memory 132 may temporarily store video data before video preprocessor 19 receives it. As another example, any video data that video preprocessor 19 outputs may be temporarily stored in video data memory 132 (e.g., stored in the video data memory before being output to video encoder 20). Video data memory 132 may be part of video preprocessor 19 or may be external to video preprocessor 19.

The video data stored in video data memory 132 may be obtained, for example, from video source 18. Video data memory 132 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. In various examples, video data memory 132 may be on-chip with other components of video preprocessor 19 or off-chip relative to those components.

As described in more detail below, pre-processing unit 134 (e.g., DRA1) pre-processes linear RGB data 110 before static transfer function unit 112 applies the static transfer function. Part of the pre-processing includes scaling and offsetting (e.g., multiplying the input value by a factor to scale it and adding a value to offset it). In some examples, pre-processing unit 134 pre-processes the video data based on the video content (e.g., the scaling factor and the offset factor are based on the video content). TF unit 112 then applies the transfer function to the scaled and offset input values to generate a plurality of codewords. In this example, TF unit 112 applies a static transfer function that compacts the dynamic range of the input values so that the generated codewords represent colors in a smaller dynamic range than the input color values (e.g., linear RGB data 110).

Post-processing unit 138 (e.g., DRA2) performs post-processing functions on the codewords generated by TF unit 112. For example, the output of TF unit 112 may be nonlinear codewords that represent color values. Post-processing unit 138 may also scale and offset the codewords output by TF unit 112. Color conversion unit 114 performs color conversion on the output of post-processing unit 138 (e.g., conversion from RGB to YCrCb), and quantization unit 116 performs quantization on the output of color conversion unit 114.

By pre-processing linear RGB data 110, pre-processing unit 134 may be configured to scale and offset linear RGB data 110 such that, although TF unit 112 applies a static transfer function, the input to TF unit 112 is adjusted so that there is more linearity in the output codewords of TF unit 112. Furthermore, in pre-processing linear RGB data 110, pre-processing unit 134 may use the video content to select scaling and offset parameters that account for spatio-temporal statistics or the local brightness level.

There may be various ways in which pre-processing unit 134 may determine the scaling and offset parameters. As one example, pre-processing unit 134 may determine a histogram of each of the colors in a picture (e.g., a histogram for red, a histogram for green, and a histogram for blue). A histogram indicates how many pixels of a particular color are at a particular illumination level. Pre-processing unit 134 may be pre-programmed with a mathematical representation of the transfer function that TF unit 112 is to apply, so that pre-processing unit 134 can scale and offset the color values to increase the linearity of the codewords that TF unit 112 outputs.

Pre-processing unit 134 may normalize the histogram of the input signal, which spans a limited range [hist_min to hist_max], over the full dynamic range [0 to 1] of the allowed codeword space of the input signal. Pre-processing unit 134 may determine the scaling and offset parameters to be applied to the input signal by stretching the histogram of each of the color values horizontally and determining the offset and scale based on the stretched histogram. For example, the equations for determining the offset and scale may be:
Offset1 = -hist_min
Scale1 = 1 / (hist_max - hist_min)

Pre-processing unit 134 may determine the offset and scaling parameters for each of the colors using the equations for Offset1 and Scale1, and may apply the offset1 and scale1 parameters to scale and offset the color values, as described in more detail. There may be various other ways in which pre-processing unit 134 determines the offset and scaling parameters for pre-processing; the equations above are provided as one example.
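The Offset1 and Scale1 equations can be illustrated with a minimal Python sketch. The function names `dra_params` and `apply_dra`, the sample values, and the order of operations (offset added first, then multiplication by the scale, which is what maps [hist_min, hist_max] onto [0, 1]) are assumptions made for illustration and are not taken from the disclosure.

```python
def dra_params(samples):
    """Return (offset, scale) that stretch [min, max] of the samples onto [0, 1]."""
    hist_min = min(samples)
    hist_max = max(samples)
    if hist_max == hist_min:
        # Flat signal: nothing to stretch, so leave the scale at 1.
        return -hist_min, 1.0
    offset = -hist_min                    # Offset1 = -hist_min
    scale = 1.0 / (hist_max - hist_min)   # Scale1 = 1 / (hist_max - hist_min)
    return offset, scale


def apply_dra(samples, offset, scale):
    # Assumed order of operations: add the offset first, then multiply by the scale.
    return [(s + offset) * scale for s in samples]


# Hypothetical normalized linear values of one color component of a picture.
red = [0.10, 0.15, 0.20, 0.30]
offset1, scale1 = dra_params(red)
stretched = apply_dra(red, offset1, scale1)
```

In this sketch the parameters are derived once per color component, so a picture would yield three (offset, scale) pairs, one each for R, G, and B.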

Post-processing unit 138 may determine offset and scaling parameters in a similar manner, except that post-processing unit 138 determines them for, and applies them to, the color codewords. For example, post-processing unit 138 may determine a histogram of each of the codewords that TF unit 112 outputs. Post-processing unit 138 may normalize the histogram of the codewords, which spans a limited range [hist_min to hist_max], over the full dynamic range [0 to 1] of the allowed codeword space. Post-processing unit 138 may determine the scaling and offset parameters to be applied to the codewords by stretching the histogram of the codewords for each of the color values and determining the offset and scale based on the stretched histogram. For example, the equations for determining the offset and scale may be:
Offset2 = -hist_min
Scale2 = 1 / (hist_max - hist_min)

Post-processing unit 138 may determine the offset and scaling parameters for each of the colors using the equations for Offset2 and Scale2, and may apply the offset2 and scale2 parameters to scale and offset the codewords, as described in more detail. There may be various other ways in which post-processing unit 138 determines the offset and scaling parameters for post-processing; the equations above are provided as one example.

Although color conversion unit 114 is illustrated as following post-processing unit 138, in some examples color conversion unit 114 may first convert the colors from RGB to YCrCb. Post-processing unit 138 may then operate on the YCrCb codewords. For the luma (Y) component, post-processing unit 138 may determine the scale and offset values using techniques similar to those described above. The following describes techniques for determining the scale and offset for the chroma components.

Post-processing unit 138 may determine the scaling and offset parameters for the Cb and Cr color components from the colorimetry of the input video signal and the target colorimetry of the output video signal. For example, consider a target (T) color container specified by primary color coordinates (xXt, yXt), where X stands for the R, G, and B color components,
primeT = [xRt yRt; xGt yGt; xBt yBt]
and a native (N) color gamut specified by primary color coordinates (xXn, yXn), where X stands for the R, G, and B color components,
primeN = [xRn yRn; xGn yGn; xBn yBn]

The white point coordinate for both gamuts is whiteP = (xW, yW). A DRA parameter estimation unit (e.g., post-processing unit 138) may derive scale Cb and scale Cr for the Cb and Cr color components as a function of the distances between the primary color coordinates and the white point. One example of such an estimation is given below:
rdT = sqrt((primeT(1,1) - whiteP(1,1))^2 + (primeT(1,2) - whiteP(1,2))^2)
gdT = sqrt((primeT(2,1) - whiteP(1,1))^2 + (primeT(2,2) - whiteP(1,2))^2)
bdT = sqrt((primeT(3,1) - whiteP(1,1))^2 + (primeT(3,2) - whiteP(1,2))^2)
rdN = sqrt((primeN(1,1) - whiteP(1,1))^2 + (primeN(1,2) - whiteP(1,2))^2)
gdN = sqrt((primeN(2,1) - whiteP(1,1))^2 + (primeN(2,2) - whiteP(1,2))^2)
bdN = sqrt((primeN(3,1) - whiteP(1,1))^2 + (primeN(3,2) - whiteP(1,2))^2)
scale Cb = bdT / bdN
scale Cr = sqrt((rdT / rdN)^2 + (gdT / gdN)^2)
The Cb and Cr offset parameters for such an embodiment may be set equal to 0, i.e., offset Cb = offset Cr = 0.
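The chroma scale estimation above can be sketched in Python as follows. The choice of BT.709 as the native gamut, BT.2020 as the target container, and D65 as the shared white point is an assumption made only for illustration, as are the helper names `dist_to_white` and `chroma_scales`. The sketch also assumes that each distance is the Euclidean distance between both coordinates of one primary and the white point.

```python
import math

# CIE 1931 xy chromaticities, rows ordered R, G, B (illustrative choice of gamuts).
PRIME_N = [(0.640, 0.330), (0.300, 0.600), (0.150, 0.060)]  # BT.709 as native gamut
PRIME_T = [(0.708, 0.292), (0.170, 0.797), (0.131, 0.046)]  # BT.2020 as target container
WHITE_P = (0.3127, 0.3290)                                  # D65 white point


def dist_to_white(primary, white):
    """Euclidean distance in the xy plane between a primary and the white point."""
    return math.hypot(primary[0] - white[0], primary[1] - white[1])


def chroma_scales(prime_t, prime_n, white):
    """Return (scale_cb, scale_cr) from the target and native primaries."""
    rd_t, gd_t, bd_t = (dist_to_white(p, white) for p in prime_t)
    rd_n, gd_n, bd_n = (dist_to_white(p, white) for p in prime_n)
    scale_cb = bd_t / bd_n
    scale_cr = math.sqrt((rd_t / rd_n) ** 2 + (gd_t / gd_n) ** 2)
    return scale_cb, scale_cr


scale_cb, scale_cr = chroma_scales(PRIME_T, PRIME_N, WHITE_P)
offset_cb = offset_cr = 0.0  # offsets set to zero, as in the example above
```

Note that, with these formulas, identical native and target gamuts give scale Cb = 1 but scale Cr = sqrt(2), because scale Cr combines the red and green ratios.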

In some examples, color conversion unit 114 may convert RGB to YCrCb before pre-processing unit 134 applies the pre-processing. In such examples, pre-processing unit 134 may perform, on the YCrCb values, operations similar to those described above for post-processing unit 138, except that it performs them on the input color values before TF unit 112 applies the transfer function.

Pre-processing unit 134 may determine the scaling and offset factors based on the histogram so as to find factors that, when applied to the input color values, cause the output of TF unit 112 to be linear codewords when TF unit 112 applies the transfer function (e.g., the range of color values represented by a codeword is relatively the same across the color and codeword space). In some examples, post-processing unit 138 may apply scaling and offsetting to the codewords that TF unit 112 outputs. Post-processing unit 138 may modify the codewords to make better use of the range of available codewords. For example, the output of TF unit 112 may be codewords that do not use the entire codeword space. Spreading the codewords across the codeword space may improve the signal-to-quantization-noise ratio at quantization unit 116.

The signal-to-quantization-noise ratio typically reflects the relationship between the maximum nominal signal strength and the quantization error (also known as quantization noise): SNR = E(x^2) / E(n^2), where E(x^2) is the power of the signal and E(n^2) is the power of the quantization noise. Here ^ denotes exponentiation, E denotes energy, and n is the noise.

Multiplying the signal x by a scaling parameter greater than 1.0 increases the power of the signal and therefore leads to an improved signal-to-quantization-noise ratio: SNR2 = E((scale * x)^2) / E(n^2) > SNR.
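The effect of scaling on the signal-to-quantization-noise ratio can be checked numerically. The following Python sketch assumes a uniform 10-bit quantizer over [0, 1] and a synthetic signal that fills only the lowest quarter of that range; the quantizer, the test signal, and the function names are illustrative assumptions rather than part of the disclosure.

```python
def quantize(x, bits=10):
    """Uniformly quantize x in [0, 1] to a fixed number of bits and map back to [0, 1]."""
    levels = (1 << bits) - 1
    return round(x * levels) / levels


def snr(signal, bits=10):
    """Ratio of mean signal power E(x^2) to mean quantization-noise power E(n^2)."""
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum((x - quantize(x, bits)) ** 2 for x in signal) / len(signal)
    return signal_power / noise_power


# Hypothetical codewords that occupy only the lowest quarter of [0, 1].
x = [0.25 * ((i * 0.6180339887) % 1.0) for i in range(1000)]
scale = 4.0
scaled = [scale * v for v in x]

snr_before = snr(x)       # SNR of the unscaled signal
snr_after = snr(scaled)   # SNR2 of the scaled signal
```

Because the quantization step is unchanged while the signal power grows with the square of the scale, `snr_after` comes out larger than `snr_before` in this sketch.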

Video encoder 20 receives the output of video preprocessor 19, encodes the video data that video preprocessor 19 outputs, and outputs the encoded video data for later decoding by video decoder 30 and processing by video postprocessor 31. In some examples, video encoder 20 may encode and signal information indicative of the scaling and offset factors for one or both of pre-processing unit 134 and post-processing unit 138. In some examples, rather than encoding and signaling information indicative of the scaling and offset factors, video preprocessor 19 may output information, such as the histogram, from which video postprocessor 31 determines the scaling and offset factors. The information that video preprocessor 19 outputs to indicate the manner in which the video data was pre-processed by pre-processing unit 134 or post-processed by post-processing unit 138 may be referred to as adaptive transfer function (ATF) parameters.

It should be understood that, in the various examples, the transfer function applied by TF unit 112 is static (e.g., not content adaptive). However, the data that is input to TF unit 112 may be adapted (e.g., by pre-processing unit 134) and/or the data that TF unit 112 outputs may be adapted (e.g., by post-processing unit 138). In this way, video preprocessor 19 outputs codewords representing colors in a dynamic range that is reduced compared to the dynamic range of linear RGB data 110, where the output codewords are based on the video content. Accordingly, by adapting the input to TF unit 112 and the output of TF unit 112, the combination of pre-processing unit 134, TF unit 112, and post-processing unit 138 functions as applying an adaptive transfer function.
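As a rough illustration of how a fixed TF combined with content-dependent scaling and offsetting behaves like an adaptive TF, the following Python sketch chains an optional DRA1 stage, the ST 2084 (PQ) inverse EOTF as the static TF, and a DRA2 stage derived from the codeword range. The function names, the (offset, scale) tuple layout, the dictionary of ATF parameters, and the normalization of linear light to a 10000 cd/m2 peak are all assumptions of this sketch.

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

IDENTITY = (0.0, 1.0)  # (offset, scale) pair that leaves values unchanged


def pq_inverse_eotf(y):
    """Static TF: normalized linear light y in [0, 1] -> PQ codeword in [0, 1]."""
    p = y ** M1
    return ((C1 + C2 * p) / (1.0 + C3 * p)) ** M2


def stretch_params(values):
    """(offset, scale) that map [min, max] of the values onto [0, 1]."""
    lo, hi = min(values), max(values)
    return (-lo, 1.0 / (hi - lo)) if hi > lo else (-lo, 1.0)


def dra(values, params):
    # Assumed order of operations: add the offset, then multiply by the scale.
    offset, scale = params
    return [(v + offset) * scale for v in values]


def encode_component(linear, dra1=IDENTITY):
    pre = dra(linear, dra1)                     # DRA1: pre-processing
    codes = [pq_inverse_eotf(v) for v in pre]   # fixed TF, never modified
    dra2 = stretch_params(codes)                # DRA2 parameters from the codewords
    return dra(codes, dra2), {"dra1": dra1, "dra2": dra2}


# Hypothetical samples at 1, 10, and 100 cd/m2, normalized by 10000 cd/m2.
linear = [0.0001, 0.001, 0.01]
codewords, atf_params = encode_component(linear)
```

The returned dictionary stands in for the ATF parameters that would be signaled so that the decoder side can undo both adjustment stages.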

FIG. 10 is a conceptual diagram showing a content-adaptive HDR processing pipeline (decoder side) with a fixed TF. As illustrated, video postprocessor 31 includes inverse quantization unit 122, inverse color conversion unit 124, inverse post-processing unit 144, inverse TF unit 126, and inverse pre-processing unit 142.

In the illustrated example, video postprocessor 31 may be configured as fixed-function and programmable circuitry. For example, video postprocessor 31 may include transistors, capacitors, inductors, passive and active components, arithmetic logic units (ALUs), elementary function units (EFUs), and the like that together or separately form inverse quantization unit 122, inverse color conversion unit 124, inverse post-processing unit 144, inverse TF unit 126, and inverse pre-processing unit 142. In some examples, video postprocessor 31 includes a programmable core that executes instructions that cause inverse quantization unit 122, inverse color conversion unit 124, inverse post-processing unit 144, inverse TF unit 126, and inverse pre-processing unit 142 to perform their respective functions. In such examples, video data memory 140, or some other memory, may store the instructions that are executed by video postprocessor 31.

Video data memory 140 is also illustrated in FIG. 10 for ease of understanding. For instance, video data memory 140 may temporarily store video data output by video postprocessor 31. As another example, any video data that video decoder 30 outputs to video postprocessor 31 may be temporarily stored in video data memory 140 (e.g., stored in the video data memory before being received by video postprocessor 31). Video data memory 140 may be part of video postprocessor 31 or may be external to video postprocessor 31.

The video data stored in video data memory 140 may be output, for example, to display device 32. Video data memory 140 may be formed by any of a variety of memory devices, such as dynamic random access memory (DRAM), including synchronous DRAM (SDRAM), magnetoresistive RAM (MRAM), resistive RAM (RRAM), or other types of memory devices. In various examples, video data memory 140 may be on-chip with other components of video postprocessor 31 or off-chip relative to those components.

Video postprocessor 31 may be configured to perform the inverse of the process of video preprocessor 19. For example, video decoder 30 outputs decoded video data to video postprocessor 31, and the decoded video data may be substantially similar to the video data that video preprocessor 19 output to video encoder 20. In addition, video decoder 30 may output the adaptive transfer function (ATF) parameters to video postprocessor 31 so that it can perform the inverse of the pre-processing and post-processing performed by video preprocessor 19.

Inverse quantization unit 122 receives the video data from video decoder 30 and performs the inverse operation of quantization unit 116. The output of inverse quantization unit 122 is un-quantized video data. Inverse color conversion unit 124 performs the inverse of the operation of color conversion unit 114. For example, if color conversion unit 114 converted RGB color to YCrCb color, inverse color conversion unit 124 converts YCrCb color to RGB color.

Inverse post-processing unit 144 may perform the inverse operations of post-processing unit 138. Inverse post-processing unit 144 receives codewords representing colors in a first dynamic range, which in this case is the same dynamic range as the output of post-processing unit 138.

Rather than multiplying by a scaling factor, as post-processing unit 138 does, inverse post-processing unit 144 may divide by a scaling factor substantially similar to the one by which post-processing unit 138 multiplied. If post-processing unit 138 subtracted an offset from the scaled codewords, inverse post-processing unit 144 may add the offset to the inverse-scaled codewords. The output of inverse post-processing unit 144 may be codewords that are not spread across the entire codeword space. In some examples, inverse post-processing unit 144 may receive the scaling and offset parameters (e.g., the scale and offset factors) from video preprocessor 19. In some examples, inverse post-processing unit 144 may receive information from which it determines the scaling and offset factors.

Inverse TF unit 126 performs the inverse operation of TF unit 112. For example, TF unit 112 compacted the color values into codewords that represent a smaller dynamic range than the dynamic range of the color values. Inverse TF unit 126 expands the color values from the smaller dynamic range back into the larger dynamic range. The output of inverse TF unit 126 may be linear RGB data; however, some pre-processing may still need to be removed to obtain the original RGB data.

Inverse pre-processing unit 142 receives the output of inverse TF unit 126 and performs the inverse of the pre-processing applied by pre-processing unit 134. Inverse pre-processing unit 142 receives color values representing colors in a second dynamic range, which in this case is the same dynamic range as the output of pre-processing unit 134.

Rather than multiplying by a scaling factor as pre-processing unit 134 does, inverse pre-processing unit 142 may divide by a scaling factor substantially similar to the one by which pre-processing unit 134 multiplied. If pre-processing unit 134 subtracted an offset from the scaled color values, inverse pre-processing unit 142 may add the offset to the inverse-scaled color values. In some examples, inverse pre-processing unit 142 may receive the scaling and offset parameters (e.g., scale and offset factors) from video preprocessor 19. In other examples, inverse pre-processing unit 142 may receive information (e.g., histogram information) from which it determines the scale and offset factors.

The output of inverse pre-processing unit 142 may be linear RGB data 128. Linear RGB data 128 and linear RGB data 110 should be substantially similar. Display device 32 may display linear RGB data 128.

As with TF unit 112, the inverse transfer function that inverse TF unit 126 applies is a static inverse transfer function (e.g., not adaptive to the video content). However, the data that is input to inverse TF unit 126 may be adapted (e.g., by inverse post-processing unit 144), and/or the data that inverse TF unit 126 outputs may be adapted (e.g., by inverse pre-processing unit 142). In this way, video postprocessor 31 outputs color values that represent colors in a dynamic range that is not reduced, as compared to the reduced dynamic range of the output of post-processing unit 138. Accordingly, by adapting the input to and the output of inverse TF unit 126, the combination of inverse post-processing unit 144, inverse TF unit 126, and inverse pre-processing unit 142 functions as an adaptive inverse transfer function.

Although both pre-processing unit 134 and post-processing unit 138 are shown in FIG. 9, the example techniques described in this disclosure are not so limited. In some examples, video preprocessor 19 may include pre-processing unit 134 but not post-processing unit 138. In other examples, video preprocessor 19 may include post-processing unit 138 but not pre-processing unit 134.

Accordingly, FIG. 9 illustrates an example of a device (e.g., source device 12) for video processing in a content-adaptive HDR system, where the device includes video data memory 132 and video preprocessor 19. Video preprocessor 19 comprises at least one of fixed-function or programmable circuitry, or a combination of both, and is configured to receive a plurality of color values of video data representing colors in a first dynamic range (e.g., linear RGB data 110). Video preprocessor 19 includes TF unit 112, which is configured to compact the color values using a static transfer function that is not adaptive to the video data being compacted, to generate a plurality of codewords that represent compacted color values in a second dynamic range. The second dynamic range is more compact than the first dynamic range.

Video preprocessor 19 may include at least one of pre-processing unit 134 or post-processing unit 138. Pre-processing unit 134 is configured to pre-process the color values prior to compacting, to generate the color values that are compacted. Post-processing unit 138 is configured to post-process the codewords resulting from the compacting of the color values. Video preprocessor 19 may be configured to output color values (e.g., the output of quantization unit 116) based on one of the compacted color values (e.g., in examples without post-processing unit 138) or the post-processed compacted color values (e.g., in examples where post-processing unit 138 is part of video preprocessor 19).

Likewise, although both inverse post-processing unit 144 and inverse pre-processing unit 142 are shown in FIG. 10, the example techniques described in this disclosure are not so limited. In some examples, video postprocessor 31 may include inverse post-processing unit 144 but not inverse pre-processing unit 142 (e.g., in examples where video preprocessor 19 does not include pre-processing unit 134, although the techniques are not so limited). In other examples, video postprocessor 31 may include inverse pre-processing unit 142 but not inverse post-processing unit 144 (e.g., in examples where video preprocessor 19 does not include post-processing unit 138, although the techniques are not so limited).

Accordingly, FIG. 10 illustrates an example of a device (e.g., destination device 14) for video processing in a content-adaptive HDR system, where the device includes video data memory 140 and video postprocessor 31. Video postprocessor 31 comprises at least one of fixed-function or programmable circuitry, or a combination of both, and is configured to receive a first plurality of codewords that represent compacted color values of video data. The compacted color values represent colors in a first dynamic range (e.g., a dynamic range substantially similar to the compacted dynamic range of the codewords that video preprocessor 19 outputs). Video postprocessor 31 includes inverse TF unit 126, which is configured to uncompact a second plurality of codewords, based on the first plurality of codewords, using an inverse static transfer function that is not adaptive to the video data, to generate uncompacted color values. The uncompacted color values represent colors in a second dynamic range (e.g., a dynamic range substantially similar to that of linear RGB data 110). The second plurality of codewords is one of the first plurality of codewords after inverse post-processing (e.g., in examples that include inverse post-processing unit 144) or the first plurality of codewords themselves (e.g., in examples that do not include inverse post-processing unit 144).

Video postprocessor 31 may include at least one of inverse post-processing unit 144 or inverse pre-processing unit 142. Inverse post-processing unit 144 is configured to inverse post-process the first plurality of codewords to generate the second plurality of codewords that are uncompacted. Inverse pre-processing unit 142 is configured to inverse pre-process the uncompacted color values resulting from the uncompacting of the second plurality of codewords. Video postprocessor 31 may be configured to output, for display, the uncompacted color values (e.g., the output of inverse TF unit 126 in examples that do not include inverse pre-processing unit 142) or the inverse pre-processed uncompacted color values (e.g., the output of inverse pre-processing unit 142).

FIGS. 9 and 10 illustrate example techniques for content-adaptive dynamic range adjustment (DRA) that reduce artifacts resulting from a fixed (e.g., static) transfer function. By pre-processing and/or post-processing the data that TF unit 112 receives or outputs, and by inverse post-processing and/or inverse pre-processing the data that inverse TF unit 126 receives or outputs, the example techniques may still use a static transfer function and yet achieve content-adaptive DRA.

This disclosure describes a content-adaptive HDR video system that employs a static, fixed transfer function (TF). The following describes several examples that may be used together or kept separate. For simplicity, they are described separately, with the understanding that various permutations and combinations are possible.

As described above, the techniques described in this disclosure use a static TF in the pipeline (e.g., for generating the video content that is encoded) but adapt the signal characteristics to the fixed processing flow. This may be achieved by adaptive processing of either the signal that is to be processed by TF unit 112 (e.g., with pre-processing unit 134) or the signal that results from TF unit 112 applying the TF (e.g., with post-processing unit 138).

In one example, pre-processing unit 134 may enable adaptivity by linear (scale and offset) pre-processing of the input linear color values before TF unit 112 applies the transfer function. In addition to or instead of pre-processing unit 134, post-processing unit 138 may enable adaptivity by linear (scale and offset) post-processing of the non-linear codewords that result from TF unit 112 applying the TF to the linear color values. Inverse post-processing unit 144, inverse TF unit 126, and inverse pre-processing unit 142 may apply the inverse of post-processing unit 138, TF unit 112, and pre-processing unit 134, respectively.

Pre-processing unit 134 may apply linear pre-processing to the input signal (e.g., linear RGB data 110) by applying an offset Offset1 and a scale Scale1, to achieve a preferred codeword distribution of the TF output (e.g., one in which each codeword represents an approximately equal range of input color values such as luminance). Post-processing unit 138 may apply linear post-processing to the output of TF unit 112. The linear post-processing is defined by the parameters Scale2 and Offset2 and enables efficient use of the available codeword space (dynamic range) after TF unit 112 applies the transfer function.

In the examples described above, pre-processing unit 134 applies the pre-processing to linear RGB data 110. However, the techniques described in this disclosure are not so limited. RGB is one color space of HDR in which pre-processing unit 134 may perform pre-processing. In general, pre-processing unit 134 may perform pre-processing in any color space of the HDR processing flow that precedes TF unit 112 applying the transfer function (e.g., the input linear RGB in the illustrated example, although other color spaces such as the YCbCr color space are also possible).

Inverse pre-processing unit 142 may likewise be configured to perform inverse pre-processing in any color space. In such examples, inverse pre-processing unit 142 may receive non-linear color values (e.g., non-linear RGB data, although other color spaces including YCbCr are possible) from inverse TF unit 126.

The above examples describe post-processing unit 138 performing post-processing in the RGB color space (e.g., non-linear RGB, owing to the transfer function applied by TF unit 112). However, the techniques described in this disclosure are not so limited. RGB is one color space of HDR in which post-processing unit 138 may perform post-processing. In general, post-processing unit 138 may perform post-processing in any color space of the HDR processing flow that follows TF unit 112 applying the transfer function (e.g., the non-linear RGB in the illustrated example, although other color spaces such as the YCbCr color space are also possible).

Inverse post-processing unit 144 may likewise be configured to perform inverse post-processing in any color space. In such examples, inverse post-processing unit 144 may receive color values (e.g., non-linear compacted RGB data, although other color spaces including YCbCr are possible) from video decoder 30, possibly after inverse quantization.

As described above, pre-processing unit 134, which is on the encoder side, may derive and apply parameters such as the Scale1 and Offset1 factors from HDR signal characteristics, such as the histogram distribution in the input or target color space (e.g., linear RGB data 110, or YCbCr if color conversion unit 114 receives linear RGB data 110 and converts the colors into YCbCr values), or in a supplementary color space for better parameter derivation. Post-processing unit 138 may likewise derive and apply parameters such as the Scale2 and Offset2 factors from the output of TF unit 112.

Pre-processing unit 134 and/or post-processing unit 138 may output the Scale1 and Offset1 factors and/or the Scale2 and Offset2 factors, which inverse pre-processing unit 142 and inverse post-processing unit 144, respectively, use to perform the inverse of the operations of pre-processing unit 134 and post-processing unit 138. In some examples, rather than outputting the Scale1, Offset1, Scale2, and/or Offset2 factors, pre-processing unit 134 and post-processing unit 138 may output the information used to determine them, so that inverse pre-processing unit 142 and inverse post-processing unit 144 can determine Scale1, Offset1, Scale2, and/or Offset2 for the inverse process using the same process. In some cases, pre-processing unit 134 and post-processing unit 138 may output one or more of Scale1, Offset1, Scale2, and/or Offset2 together with information from which the others can be derived.
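As a non-limiting illustration of deriving the factors on both sides from the same information, the following Python sketch assumes that the shared information is the minimum and maximum value observed in a frame, and that the factors are chosen so that this range maps onto [0, 1]. Both the derivation rule and the name `derive_scale_offset` are hypothetical; this disclosure does not prescribe a particular derivation.

```python
def derive_scale_offset(lo, hi):
    """Hypothetical rule: pick (scale, offset) so that
    scale * (x - offset) maps the observed range [lo, hi] onto [0, 1]."""
    if hi <= lo:
        raise ValueError("observed range must be non-empty")
    return 1.0 / (hi - lo), lo


# Encoder side: factors derived from the observed range of the frame.
encoder_params = derive_scale_offset(0.2, 0.8)

# Decoder side: only the range (0.2, 0.8) is received; the same rule is
# repeated, so the factors themselves need not be transmitted.
decoder_params = derive_scale_offset(0.2, 0.8)
```

Because both sides run the same deterministic rule on the same input, the inverse units obtain exactly the factors that the forward units applied.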

In some examples, video preprocessor 19 may selectively enable the use of pre-processing unit 134 and/or post-processing unit 138 (e.g., via a controller circuit). Accordingly, video preprocessor 19 may output information indicating whether pre-processing unit 134 and/or post-processing unit 138 is enabled. In response, video postprocessor 31 may selectively enable the use of inverse post-processing unit 144 and/or inverse pre-processing unit 142 (e.g., via a controller circuit).

There may be various ways in which video encoder 20 encodes and outputs the various parameters. For example, video encoder 20 may signal, and video decoder 30 may receive, the parameters in the bitstream by way of SEI (supplemental enhancement information) or VUI (video usability information); the parameters may be provided to the decoder side (e.g., video postprocessor 31) as side information; or the parameters may be derived by the decoder side (e.g., by video postprocessor 31) from other identifications such as the input and output color spaces and the transfer function used.

The following describes example techniques that may be applied in accordance with this disclosure. Each of these techniques may be applied separately or in any combination. Each of these techniques is also described in more detail further below.

In the above examples, video preprocessor 19 may signal the parameter information in various ways. As merely one example, video encoder 20 includes an encoding unit (e.g., an entropy encoding unit), and the entropy encoding unit may encode the parameter information that is signaled. Similarly, video decoder 30 includes a decoding unit (e.g., an entropy decoding unit), the entropy decoding unit may decode the signaled parameter information, and video postprocessor 31 may receive the parameter information from video decoder 30. There may be various ways in which the parameter information is signaled and received, and the techniques described in this disclosure are not limited to any specific approach. Also, as described above, video preprocessor 19 need not signal the parameter information in all cases. In some examples, video postprocessor 31 may derive the parameter information without necessarily receiving it from video preprocessor 19.

The following are example implementations used merely for illustration and should not be considered limiting. For example, this disclosure describes enabling adaptivity by linear (scale and offset) pre-processing of the input linear color values before the TF (e.g., static, not content-adaptive) is applied, and/or by linear (scale and offset) post-processing of the non-linear codewords that result from applying the TF to the linear color values. The following are several non-limiting examples of implementations of the techniques.

The proposed forward ATF processing flow is shown in FIG. 11. In this example, a linear signal component s in the input color space (e.g., RGB) is pre-processed by pre-processing unit 134 to produce output signal s1. TF unit 112 applies the transfer function to the values of s1, which yields output codewords S1. In the next step, the codewords of S1 are post-processed by post-processing unit 138 to produce output values S2.

FIG. 12 shows the inverse of the ATF processing flow shown in FIG. 11. For example, inverse post-processing unit 144 performs inverse post-processing on codewords S2'. Codewords S2' may be substantially similar to codewords S2. The output of the inverse post-processing by inverse post-processing unit 144 is codewords S1'. Codewords S1' may be similar to codewords S1. Inverse TF unit 126 receives codewords S1' and outputs s1', which represents color values and is similar to color values s1. Inverse pre-processing unit 142 receives s1' as input and outputs color values s'. Color values s' are substantially similar to color values s.
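As a non-limiting illustration, the forward flow of FIG. 11 and the inverse flow of FIG. 12 may be sketched in Python as follows. The sketch assumes that the static TF is the PQ curve of SMPTE ST 2084 operating on linear light normalized to [0, 1], that quantization and color conversion are omitted, and that the pre-processed value s1 is non-negative. The function names and the parameter values in the usage note are illustrative only.

```python
# PQ (SMPTE ST 2084) constants; the choice of PQ as the static TF is an
# assumption of this sketch.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_tf(y):
    """Static TF: normalized linear light y in [0, 1] -> codeword in [0, 1]."""
    p = y ** M1
    return ((C1 + C2 * p) / (1 + C3 * p)) ** M2


def pq_inverse_tf(n):
    """Static inverse TF: codeword n in [0, 1] -> normalized linear light."""
    p = n ** (1 / M2)
    return (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1 / M1)


def forward(s, scale1, offset1, scale2, offset2):
    s1 = scale1 * (s - offset1)     # pre-processing unit 134 (assumes s >= offset1)
    S1 = pq_tf(s1)                  # TF unit 112 (static)
    return scale2 * (S1 - offset2)  # post-processing unit 138 -> S2


def inverse(S2p, scale1, offset1, scale2, offset2):
    S1p = S2p / scale2 + offset2    # inverse post-processing unit 144
    s1p = pq_inverse_tf(S1p)        # inverse TF unit 126 (static)
    return s1p / scale1 + offset1   # inverse pre-processing unit 142 -> s'
```

For instance, with the illustrative parameters `(0.1, 0.0001, 1.25, 0.1)`, `inverse(forward(0.5, ...), ...)` returns a value that matches 0.5 up to floating-point error. In a real pipeline, quantization makes s' only substantially similar to s rather than identical.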

For ease of illustration, various processes are not shown in FIGS. 11 and 12. For example, color conversion and quantization, and their respective inverse processes, are not shown in FIGS. 11 and 12.

The following describes example operations and algorithms that pre-processing unit 134 and inverse pre-processing unit 142 may perform. Pre-processing unit 134 is defined by the parameters Scale1 and Offset1. Pre-processing unit 134 aims to adjust the input signal characteristics to certain properties of the transfer function applied by TF unit 112. An example of such processing is shown in FIGS. 13A and 13B. FIG. 13A shows a histogram of the red color component of an HDR signal overlaid with the PQ transfer function (PQ TF); note that the signals are shown at different scales for visualization. FIG. 13B shows the histogram of the non-linear signal that results from the PQ TF (e.g., if TF unit 112 applied the PQ TF without the use of pre-processing unit 134).

As illustrated, owing to the shape of the PQ TF curve (the same as in FIG. 7), the PQ TF allocates more codewords to low-brightness samples, e.g., allowing 0 to 1% of the input dynamic range to be represented with 50% of the output dynamic range. Such a distribution may not be optimal for some classes of signals and/or applications.

For the linear pre-processing, pre-processing unit 134 may apply linear pre-processing to the input signal s by applying an offset Offset1 and a scale Scale1, to achieve a preferred codeword distribution of the TF output from TF unit 112:
s1 = Scale1 * (s - Offset1) (8)

The signal value s1 produced by equation (8) is used in equation (1), and the HDR processing pipeline specified in equations (3) to (7) is applied (e.g., by color conversion unit 114 and quantization unit 116).

At the decoder side (e.g., video postprocessor 31), inverse pre-processing unit 142 applies the inverse of the pre-processing, which follows from solving equation (8) for s:
s' = s1' / Scale1 + Offset1 (9)
where the term s1' denotes the linear color value produced by the inverse TF as specified in equation (2). In other words, s1' is the output of the inverse transfer function applied by inverse TF unit 126.
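As a non-limiting illustration, the pre-processing of equation (8) and its inverse, obtained by solving equation (8) for s, may be sketched in Python as follows. The function names are illustrative only.

```python
def preprocess(s, scale1, offset1):
    """Equation (8): s1 = Scale1 * (s - Offset1)."""
    return scale1 * (s - offset1)


def inverse_preprocess(s1, scale1, offset1):
    """Inverse of equation (8): divide by Scale1, then add Offset1."""
    if scale1 == 0:
        raise ValueError("Scale1 must be non-zero for the mapping to be invertible")
    return s1 / scale1 + offset1
```

With Scale1 = 0.1 and Offset1 = 0.0001, an input of s = 0.5 maps to approximately s1 = 0.04999, and `inverse_preprocess` recovers 0.5 up to floating-point error.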

The effect of such pre-processing with the parameters Scale1 = 0.1 and Offset1 = 0.0001 is shown in FIG. 14. FIG. 14 is generated from the same input illustrated in FIG. 13A. Compared with the example shown in FIG. 13B, the codewords of the S1 signal occupy the available dynamic range more efficiently, and the high-brightness values are represented more accurately because they occupy a much larger dynamic range (see the stretched peak of the S1 histogram in FIG. 14 compared with the histogram peak in FIG. 13B).

In some examples, the input to TF unit 112 may be a bipolar signal. A bipolar signal means that some of the values that pre-processing unit 134 outputs are negative and others are positive. To enable TF unit 112 to apply the static TF (and inverse TF unit 126 to apply its inverse) even to a bipolar signal, TF unit 112 may adjust the output of pre-processing unit 134 and/or adjust the output after applying the transfer function. For example, if the output of pre-processing unit 134 is a negative value, TF unit 112 may apply the transfer function to the absolute value of the output of pre-processing unit 134 and multiply the result by the output of a sign(s) function, where s is the output of pre-processing unit 134 and sign(s) = -1 for s < 0, 0 for s = 0, and 1 for s > 0. Accordingly, to enable application of the static TF, TF unit 112 may use a function that is defined for positive values on a bipolar signal, which implies some special treatment of the sign of the bipolar signal.
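As a non-limiting illustration, the sign handling described above may be sketched in Python as follows. A square root is used purely as a stand-in for a static TF that is defined only for non-negative inputs; the names `bipolar_tf` and `stand_in_tf` are illustrative.

```python
import math


def sign(s):
    """sign(s) = -1 for s < 0, 0 for s = 0, 1 for s > 0."""
    return (s > 0) - (s < 0)


def bipolar_tf(s, tf):
    """Apply a static TF defined for non-negative inputs to a bipolar value.

    The TF is applied to |s| and the sign of s is restored afterwards.
    For s > 0 this reduces to tf(s); for s = 0 the result is 0.
    """
    return sign(s) * tf(abs(s))


# Stand-in static TF (an assumption of this sketch, not the TF of the disclosure).
stand_in_tf = math.sqrt
```

For example, `bipolar_tf(-0.25, stand_in_tf)` yields -0.5, the mirror image of `bipolar_tf(0.25, stand_in_tf)`.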

The following describes a transfer function with bipolar signal handling. In some examples, the techniques may modify equation (1) so that it can handle bipolar input signals.

This technique may allow TF unit 112 of FIG. 9 to allocate the desired steepness of the TF to the region of the dynamic range where it is needed, e.g., when a more accurate representation is needed for the mid-bright levels of the signal. A visualization of this concept with the parameters scale = 0.1 and offset = 0.1 is shown in FIGS. 15A and 15B.

The inverse TF for bipolar signals is defined accordingly, as a modified equation (2).

For example, inverse TF unit 126 of FIG. 10 may apply the inverse transfer function directly when the input codeword is positive. For an input codeword that is negative, inverse TF unit 126 may determine the absolute value of the input codeword and apply the inverse transfer function to that absolute value. Inverse TF unit 126 may then multiply the result by the output of the sign() function, where the input to the sign() function is the codeword that inverse TF unit 126 received.
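As a non-limiting illustration, the inverse handling described above may be sketched in Python as follows. Squaring is used purely as a stand-in for a static inverse TF, and treating a codeword of exactly zero like a positive codeword is an assumption of this sketch.

```python
def sign(x):
    """sign(x) = -1 for x < 0, 0 for x = 0, 1 for x > 0."""
    return (x > 0) - (x < 0)


def bipolar_inverse_tf(S, inverse_tf):
    """Apply a static inverse TF to a bipolar codeword S."""
    if S < 0:
        # Negative codeword: invert |S|, then restore the sign of S.
        return sign(S) * inverse_tf(abs(S))
    # Positive (and, by assumption, zero) codeword: invert directly.
    return inverse_tf(S)


def stand_in_inverse_tf(S):
    """Stand-in static inverse TF (illustrative only)."""
    return S * S
```

For example, `bipolar_inverse_tf(-0.5, stand_in_inverse_tf)` yields -0.25, which undoes a forward mapping of the form sign(s) * sqrt(|s|) applied to s = -0.25.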

The following describes example operations and algorithms that post-processing unit 138 and inverse post-processing unit 144 may perform. Post-processing unit 138 is defined by the parameters Scale2 and Offset2 and enables efficient use of the available codeword space (dynamic range) after TF unit 112 applies the TF. An example of such processing is described below.

The following is an example of an HDR signal after TF unit 112 applies the TF. FIG. 16A shows a histogram of the red color component of an HDR signal after TF unit 112 applies the TF. As illustrated, the histogram of the signal does not occupy the full codeword space; rather, in this example only 60% of the codewords are actually used. Such a representation is not a problem as long as the signal is represented in floating-point precision. However, the unused codeword budget may be used more efficiently to reduce the quantization error introduced by equation (7).
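As a non-limiting illustration of why the unused codeword budget matters, the following Python sketch compares the reconstruction error of a signal that occupies only 60% of the codeword range with and without a linear stretch before quantization. The sketch assumes a plain uniform 10-bit quantizer over [0, 1] as a simplification of the quantization step, and assumes that the occupied range is [0.2, 0.8]; neither detail is taken from this disclosure.

```python
def quantize(x, bits=10):
    """Uniform quantization of x in [0, 1] to 2**bits levels (simplified)."""
    levels = (1 << bits) - 1
    return round(x * levels) / levels


def mean_abs_error(samples, scale2, offset2, bits=10):
    """Stretch, quantize, undo the stretch, and report the mean error."""
    total = 0.0
    for S in samples:
        S2 = scale2 * (S - offset2)               # linear post-processing
        S2q = quantize(min(1.0, max(0.0, S2)), bits)
        S_rec = S2q / scale2 + offset2            # inverse post-processing
        total += abs(S_rec - S)
    return total / len(samples)


# Codewords spread evenly over the assumed occupied range [0.2, 0.8].
samples = [0.2 + 0.6 * i / 9999 for i in range(10000)]

err_plain = mean_abs_error(samples, 1.0, 0.0)            # no post-processing
err_stretched = mean_abs_error(samples, 1 / 0.6, 0.2)    # range mapped to [0, 1]
```

Because the stretch spreads the same signal over more quantization levels, `err_stretched` comes out smaller than `err_plain`, which corresponds to a higher signal-to-quantization-noise ratio.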

Post-processing unit 138 may apply a linear offset-and-scale operation to the codeword S produced after TF unit 112 applies the TF:
S2 = scale2 * (S - offset2)    (12)

In some frameworks, post-processing unit 138 (FIG. 9) may clip the resulting S2 value to a specified dynamic range as follows:
S2 = min(maxValue, max(S2, minValue))    (13)

Following this post-processing, the signal value S2 is used in Equation (3), and the HDR processing flow specified in Equations (3) to (7) follows.
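
As a minimal sketch of the encoder-side step, Equation 12 followed by the optional clipping of Equation 13 might look like this. The function name and the parameter values in the example call are illustrative assumptions, not values from the disclosure.

```python
def post_process(s, scale2, offset2, min_value=0.0, max_value=1.0):
    """Offset and scale a codeword (Equation 12), then clip (Equation 13)."""
    s2 = scale2 * (s - offset2)
    return min(max_value, max(s2, min_value))


# Hypothetical parameters, chosen only to show in-range and clipped cases.
for s in (0.05, 0.5, 0.9):
    print(s, post_process(s, scale2=1.5, offset2=0.1))
```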

On the decoder side (e.g., video postprocessor 31), inverse post-processing unit 144 may apply the inverse of the post-processing, implemented as follows:
S' = S2' / scale2 + offset2    (14)

In some frameworks, inverse post-processing unit 144 may clip the resulting S' value to the specified dynamic range as follows:
S' = min(maxValue, max(S', minValue))    (15)
where the term S2' denotes the decoded nonlinear color value.
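
The decoder-side step can be sketched as the algebraic inverse of Equation 12 followed by the clip of Equation 15. The function names and parameter values are assumptions for illustration; the forward function omits clipping so that the round trip is exact up to floating-point error.

```python
def post_process(s, scale2, offset2):
    """Encoder side, Equation 12 (no clipping in this sketch)."""
    return scale2 * (s - offset2)


def inverse_post_process(s2_dec, scale2, offset2,
                         min_value=0.0, max_value=1.0):
    """Undo the scale and offset, then clip as in Equation 15."""
    s = s2_dec / scale2 + offset2
    return min(max_value, max(s, min_value))


s = 0.5
s2 = post_process(s, scale2=1.5, offset2=0.1)
print(inverse_post_process(s2, scale2=1.5, offset2=0.1))
```

In a real pipeline the decoded value S2' differs from S2 by quantization and coding error, so the recovered S' is only an approximation of S.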

The effect of such post-processing with the parameters scale = 0.8 and offset = 0 is shown in FIG. 16B. It can be seen that the codewords of the S2 signal occupy the available codeword space (dynamic range) more efficiently and that high luminance values are represented more accurately, because they occupy a much larger dynamic range (compare the stretched peak of the S2 histogram in FIG. 16B with the histogram peak in FIG. 16A).

In the above example, rather than applying the operation of Equation (12), post-processing unit 138 may be regarded as applying a second transfer function that is linear, unlike the nonlinear transfer function that TF unit 112 applies. Applying this second transfer function may be equivalent to multiplying the input codeword by Scale2 and adding Offset2, and inverse post-processing unit 144 applies the inverse of this process. Thus, in some examples, the techniques of post-processing unit 138 and inverse post-processing unit 144 (FIG. 10) may be deployed through an additional linear transfer function TF2. The parameters of this additional linear transfer function TF2 may be specified through Scale2 and Offset2, for example as shown in FIGS. 17A-17C.

For example, FIG. 17A shows the same transfer function behavior as FIG. 7 (e.g., the static transfer function that TF unit 112 applies). FIG. 17B shows the transfer function, with Scale2 = 1 and Offset2 = 0, that post-processing unit 138 applies to the output of TF unit 112. FIG. 17C shows the transfer function, with Scale2 = 1.5 and Offset2 = -0.25, that post-processing unit 138 applies to the output of TF unit 112.

The above examples describe the case in which there is a pre-processing unit 134 (DRA1) and/or a post-processing unit 138 (DRA2), together with the corresponding inverse pre-processing unit 142 and inverse post-processing unit 144. In some examples, the post-processing may be performed after the color conversion (e.g., in the target color space) rather than before it. For example, as shown in FIG. 18, TF unit 112 outputs to color conversion unit 114, and post-processing unit 150 applies post-processing after the color conversion. Similarly, as shown in FIG. 19, inverse post-processing unit 152 applies inverse post-processing to the output of inverse quantization unit 122 before inverse color conversion unit 124 applies the color conversion. In FIGS. 18 and 19, pre-processing unit 134 and inverse pre-processing unit 142 are omitted for simplicity and may be included in other examples.

Post-processing unit 150 and inverse post-processing unit 152 may function in the same way as post-processing unit 138 and inverse post-processing unit 144, but in a different color space (e.g., the YCbCr color space rather than the RGB color space). For example, post-processing unit 150 and inverse post-processing unit 152 may use the Scale2 and Offset2 parameters for scaling and offsetting. The following describes post-processing in the target color space.

For some HDR systems that use certain target color spaces (e.g., YCbCr of BT.709 or BT.2020), and with post-processing that uses identical parameters for the three components R, G, and B, the post-processing may be applied in the output color space rather than in the input RGB color space.

For example, consider the YCbCr color conversion defined in Equations 3 and 5:
Y' = a * R' + b * G' + c * B'
Cb = (B' - Y') / d1    (16)
Cr = (R' - Y') / d2
where the variables a, b, c, d1, and d2 are parameters of the color space, and R', G', B' are the nonlinear color values in the input color space after the EOTF is applied.

The post-processing defined in Equations 12 and 13 is applied in the input color space, that is, to R', G', B', resulting in R2', G2', B2':
R2' = scale2 * (R' - offset2)
G2' = scale2 * (G' - offset2)    (17)
B2' = scale2 * (B' - offset2)

In Equation 17, the post-processing parameters in the R, G, and B domains are assumed to be identical.

However, because the processes defined in Equations 16 and 17 are linear, the post-processing may instead be implemented in the target color space:
Y2' = scaleY * Y' - offsetY
Cb2 = scaleB * Cb - offsetB    (18)
Cr2 = scaleR * Cr - offsetR
where post-processing unit 150 calculates the post-processing parameters as follows.
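
One way to see why target-space parameters of this kind exist is to substitute Equation 17 into the color conversion. The sketch below assumes the conversion Y' = a*R' + b*G' + c*B', Cb = (B' - Y')/d1, Cr = (R' - Y')/d2 with the BT.2020 coefficients, for which a + b + c = 1. Under that assumption the substitution gives scaleY = scaleB = scaleR = scale2, offsetY = scale2 * offset2 * (a + b + c), and chroma offsets of zero. These values are derived here for illustration and are not quoted from the disclosure; the function names and sample inputs are hypothetical.

```python
# BT.2020 non-constant-luminance coefficients; a + b + c = 1 (assumption).
A, B, C = 0.2627, 0.6780, 0.0593
D1, D2 = 1.8814, 1.4746


def rgb_to_ycbcr(r, g, b):
    """Assumed form of the color conversion (Y', Cb, Cr)."""
    y = A * r + B * g + C * b
    return y, (b - y) / D1, (r - y) / D2


def post_process_rgb(rgb, scale2, offset2):
    """Equation 17: identical scale and offset for R', G', B'."""
    return tuple(scale2 * (v - offset2) for v in rgb)


def post_process_ycbcr(ycbcr, scale2, offset2):
    """Equation 18 with parameters derived by substitution (illustrative)."""
    y, cb, cr = ycbcr
    scale_y = scale_b = scale_r = scale2
    offset_y = scale2 * offset2 * (A + B + C)
    offset_b = offset_r = 0.0  # vanishes because a + b + c = 1
    return (scale_y * y - offset_y,
            scale_b * cb - offset_b,
            scale_r * cr - offset_r)


rgb = (0.30, 0.55, 0.80)
via_rgb = rgb_to_ycbcr(*post_process_rgb(rgb, 1.5, 0.1))
via_ycc = post_process_ycbcr(rgb_to_ycbcr(*rgb), 1.5, 0.1)
print(via_rgb)
print(via_ycc)
```

Both paths produce the same Y2', Cb2, Cr2 up to floating-point error, which is the equivalence that the linearity argument relies on.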

On the decoder side (e.g., video postprocessor 31), the inverse of the post-processing may be performed by inverse post-processing unit 152 as follows.

where the variables Y2', Cb, and Cr are the decoded and dequantized components, and the parameters of the inverse post-processing are calculated as shown in Equation 19.

For the YCbCr color conversion defined in BT.2020, the parameters of the post-processing implemented in the target color space are derived from the parameters of the post-processing in the input color space and the parameters of the color conversion as follows.

In some examples, Equation 20, implemented in the target color space with parameters derived from the post-processing parameters in the input color space, may be implemented as the following Equation 20A.

In some examples, Equation 20A above may be implemented through multiplication rather than division.

With respect to non-linear processing, in addition to the pre-processing and post-processing techniques described above, post-processing unit 138 or 150 and inverse post-processing unit 144 or 152 may use certain non-linear techniques to further improve codeword utilization. As one example of histogram tail handling, post-processing unit 138 or 150 and inverse post-processing unit 144 or 152 may augment the linear codeword post-processing defined above with a parametric, piecewise-specified transfer function that is applied to samples outside a specified range.

The HDR signal processing flow specified above typically operates within a predefined range:
Range = {minValue, maxValue}

For a normalized signal and the equations above, the values are:
Range = {0.0 to 1.0}
minValue = 0.0    (22)
maxValue = 1.0

The post-processing proposed in Equation 12 (e.g., as performed by post-processing unit 138), by applying Offset2 and Scale2 to the codewords S1, can lead to situations in which the resulting S2 values overflow the boundaries of the specified range. To ensure that the data stays within the specified range, a clipping operation to the range {minValue to maxValue} may be applied as in Equation 13, for example as shown in FIGS. 20A-20C.

FIG. 20A is a diagram illustrating post-processing with clipping applied to the range, for a histogram of S2. FIG. 20B is another such diagram, for the histogram of S2 after the post-processing of Equation 10. FIG. 20C is another such diagram, for the histogram of S2 after the clipping of Equation 11.

As shown in FIG. 20C, applying clipping after the post-processing leads to an unrecoverable loss of the information outside the specified range. To represent the HDR signal efficiently in a limited codeword space by applying Scale2/Offset2 while preserving the TF assumptions about HVS perception, post-processing unit 138 or 150 may augment the codeword post-processing defined above with special handling of the color samples that yield codewords outside the specified Range, thereby allowing content-adaptive histogram tail handling.

Post-processing unit 138 or 150 may apply a parametric, piecewise-specified transfer function to all S2 values resulting from Equation 12 that fall outside the range. The parameters representing those samples are encoded and signaled by video encoder 20 separately from the S2 codewords that belong to the specified range.

As an illustration, the codewords S2 resulting from Equation 12 may be classified by their relationship to the supported data range specified in Equation 22:
S2_in = ∀ S2 ∈ Range    (23)
S2_low = ∀ S2 ≤ minValue    (24)
S2_high = ∀ S2 ≥ maxValue    (25)
where the S2_in samples belong to Range = [minValue to maxValue], S2_low is the set of samples outside the range that are less than or equal to minValue, and S2_high is the set of samples outside the range that are greater than or equal to maxValue.
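
The classification can be sketched as three filters over the post-processed codewords. One assumption is made loudly here: the sketch uses strict inequalities for the two tails so that the three sets do not overlap at the range boundaries, whereas Equations 24 and 25 are written with non-strict inequalities. The function name and sample values are hypothetical.

```python
def classify(s2_values, min_value=0.0, max_value=1.0):
    """Split codewords into (S2_in, S2_low, S2_high)."""
    s2_in = [v for v in s2_values if min_value <= v <= max_value]
    # Strict comparisons are an assumption made to keep the sets disjoint.
    s2_low = [v for v in s2_values if v < min_value]
    s2_high = [v for v in s2_values if v > max_value]
    return s2_in, s2_low, s2_high


print(classify([-0.2, 0.0, 0.4, 1.0, 1.3]))
```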

For the color values in S2_low and S2_high, which lie outside the range, the techniques may replace the clipping operation of Equation 13 by individually specifying, determining, or signaling a codeword value for each of S2_low and S2_high, or by specifying, determining, or signaling a group of codewords that optimally represents the information outside the Range.

The following are non-limiting examples of decision processes that may replace clipping for the samples S2_low and S2_high:
a. Determine the mean of each of S2_low and S2_high:
Smin = mean(S2_low)    (26)
Smax = mean(S2_high)
b. Determine the median of each of S2_low and S2_high:
Smin = median(S2_low)    (27)
Smax = median(S2_high)
c. Determine, for each of S2_low and S2_high, the value that best represents that sample set under some criterion, for example minimum sum of absolute differences (SAD), mean squared error (MSE), or rate-distortion optimization (RDO) cost:
Smin = fun(S2_low)    (28)
Smax = fun(S2_high)

In such examples, each sample value in S2_low or S2_high is replaced with the Smin or Smax value, respectively, derived in Equations (26, 27), rather than with the clipped value shown in Equation (13).
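
A minimal sketch of this replacement, assuming the mean or median is used as the representative value, is shown below. The function name, the strict out-of-range test, and the fallback used when a tail is empty are assumptions made for illustration.

```python
from statistics import mean, median


def replace_tails(s2_values, min_value=0.0, max_value=1.0,
                  representative=mean):
    """Replace out-of-range samples with one representative per tail."""
    low = [v for v in s2_values if v < min_value]
    high = [v for v in s2_values if v > max_value]
    # Fall back to the range boundary when a tail is empty (assumption).
    s_min = representative(low) if low else min_value
    s_max = representative(high) if high else max_value
    out = []
    for v in s2_values:
        if v < min_value:
            out.append(s_min)
        elif v > max_value:
            out.append(s_max)
        else:
            out.append(v)
    return out, s_min, s_max


values = [-0.3, -0.1, 0.5, 1.2, 1.4, 1.6]
print(replace_tails(values, representative=mean))
print(replace_tails(values, representative=median))
```

Unlike clipping, every tail sample is mapped to a value that reflects where the tail actually lies, at the cost of signaling Smin and Smax to the decoder.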

The parameters of the proposed non-linear processing are the codewords that represent each of S2_low and S2_high. These codewords are signaled to the decoder side (e.g., video postprocessor 31) and used by inverse post-processing unit 144 or 152 in the reconstruction process.

For example, in the above example of parametric piecewise processing, post-processing unit 138 or 150 may scale and offset (e.g., apply Scale2 and Offset2 to) the codewords that result from the compaction of the color values by TF unit 112 (e.g., the compaction of the dynamic range that results from applying the transfer function). Post-processing unit 138 or 150 may determine that a set of the scaled and offset codewords has values less than a minimum threshold or greater than a maximum threshold. Post-processing unit 138 or 150 may assign a first codeword (e.g., Smin) to the set of scaled and offset codewords having values less than the minimum threshold, and may assign a second codeword (e.g., Smax) to the set of scaled and offset codewords having values greater than the maximum threshold. For scaled and offset codewords between the minimum and the maximum, post-processing unit 138 or 150 may apply the scaling and offsetting as described in the examples above.

Inverse post-processing unit 144 or 152 (FIG. 10 or FIG. 19) may perform the inverse of the parametric piecewise processing. For example, inverse post-processing unit 144 or 152 may determine that a first set of codewords, from a plurality of codewords received from video decoder 30, represents compacted color values (e.g., where the dynamic range has been compacted by the application of the transfer function by TF unit 112) having values less than a minimum threshold or greater than a maximum threshold. Inverse post-processing unit 144 or 152 may determine that a second set of codewords from the plurality of codewords represents compacted color values having values greater than or equal to the minimum threshold and less than or equal to the maximum threshold. In such examples, inverse post-processing unit 144 or 152 may assign a first codeword (e.g., Smin) to the codewords of the first set that are less than the minimum threshold, and may assign a second codeword (e.g., Smax) to the codewords of the first set that are greater than the maximum threshold. The values of Smin and Smax may be signaled from video preprocessor 19 via video encoder 20. Inverse post-processing unit 144 or 152 may inverse scale and inverse offset the second set of codewords (e.g., those greater than the minimum and less than the maximum) using example techniques such as those described above.

The above examples describe applying the scaling and offset parameters piecewise. In some examples, such piecewise application of scaling and offset parameters may be extended to both pre-processing unit 134 and post-processing unit 138 or 150, and to inverse post-processing unit 144 or 152 and inverse pre-processing unit 142.

For example, the codewords that inverse post-processing unit 144 or 152 receives may include a first set of codewords and a second set of codewords. The first set of codewords represents compacted color values having values that belong to a first partition of the dynamic range, and the second set of codewords represents compacted color values having values that belong to a second partition of the dynamic range. In such examples, inverse post-processing unit 144 or 152 may scale and offset the first set of codewords with a first set of scaling and offset parameters (e.g., a first Scale2 and a first Offset2), and may scale and offset the second set of codewords with a second set of scaling and offset parameters (e.g., a second Scale2 and a second Offset2).

As another example, the color values that inverse pre-processing unit 142 receives may include a first set of un-compacted color values and a second set of un-compacted color values. The first set represents un-compacted color values having values that belong to a first partition of the dynamic range, and the second set represents un-compacted color values having values that belong to a second partition of the dynamic range. In such examples, inverse pre-processing unit 142 may scale and offset the first set of color values with a first set of scaling and offset parameters (e.g., a first Scale1 and a first Offset1), and may scale and offset the second set of color values with a second set of scaling and offset parameters (e.g., a second Scale1 and a second Offset1).

Similarly, pre-processing unit 134 may determine that a first set of color values has values that belong to a first partition of the dynamic range and that a second set of color values has values that belong to a second partition of the dynamic range. Pre-processing unit 134 may scale and offset the first set of color values with a first set of scaling and offset parameters (e.g., a first Scale1 and a first Offset1), and may scale and offset the second set of color values with a second set of scaling and offset parameters (e.g., a second Scale1 and a second Offset1).

Post-processing unit 138 or 150 may determine that a first set of codewords from a plurality of codewords has values that belong to a first partition of the dynamic range and that a second set of codewords from the plurality of codewords has values that belong to a second partition of the dynamic range. Post-processing unit 138 or 150 may scale and offset the first set of codewords with a first set of scaling and offset parameters (e.g., a first Scale2 and a first Offset2), and may scale and offset the second set of codewords with a second set of scaling and offset parameters (e.g., a second Scale2 and a second Offset2).

FIGS. 21A and 21B are diagrams illustrating histograms of the codewords after post-processing with tail handling. FIG. 22 is a conceptual diagram illustrating another example of the encoder side of a content-adaptive HDR pipeline with a static TF. FIG. 23 is a conceptual diagram illustrating another example of the decoder side of a content-adaptive HDR pipeline with a static TF. FIGS. 22 and 23 are substantially the same as FIGS. 11 and 12, except that FIGS. 22 and 23 show that, in some examples, the Smin and Smax codewords are output when the scaled and offset values fall outside the range and that, in some examples, Smin' and Smax' (both of which are similar to Smin and Smax) are the received codewords when the values fall outside the range.

In the above examples, a single codeword is reserved for the codewords below the minimum, and a single codeword is reserved for the codewords above the maximum. In some examples, the codewords that belong to a histogram tail outside the specified Range may be represented with two or more reserved codewords.

Post-processing unit 138 may determine and signal a number of reserved codewords for a specified region of the dynamic range of the codewords S. On the decoder side (e.g., video postprocessor 31), inverse post-processing unit 144 may determine the reserved codewords for the specified region. The techniques may determine, signal, and apply at the decoder the signal value S' associated with each of the reserved codewords.

FIG. 24 is a diagram illustrating a histogram with two reserved codewords for handling color values; it visualizes this example for the color values in S2_low. In this example, the S2_low codewords are divided into two sub-ranges, and each sub-range is coded with the codeword Smin1 or Smin2, respectively. The value associated with each of these codewords is determined on the encoder side (e.g., by video preprocessor 19) in a process similar to Equations 26 to 28, signaled to decoder 30, and applied on the decoder side (e.g., by video postprocessor 31) through Equation 29.
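
The two-codeword variant for the lower tail can be sketched as follows. The split point between the two sub-ranges, the use of the mean as the representative value, and the convention that Smin1 covers the sub-range farther from the Range are all assumptions made for illustration.

```python
from statistics import mean


def two_level_low_tail(s2_low, split):
    """Split S2_low at `split`; return (Smin1, Smin2) as sub-range means."""
    lower = [v for v in s2_low if v < split]
    upper = [v for v in s2_low if v >= split]
    # None marks an empty sub-range (hypothetical convention).
    s_min1 = mean(lower) if lower else None
    s_min2 = mean(upper) if upper else None
    return s_min1, s_min2


# Hypothetical lower-tail samples and split point.
print(two_level_low_tail([-0.4, -0.3, -0.1, -0.05], split=-0.2))
```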

In some examples, the techniques may be implemented in the form of a transfer function. The non-linear part of the transfer function, that is, the histogram tail handling in the given example, may be modeled with an adaptive transfer function having two reserved codewords that are associated with values determined at the decoder side, e.g., signaled and applied at the decoder side.

FIGS. 25A and 25B are diagrams illustrating parametric adaptive functions. FIG. 25A shows a linear transfer function (TF2) modeled with Scale2 = 1.5 and Offset2 = -0.25, and FIG. 25B shows the linear transfer function (TF2) modeled with Scale2 = 1.5 and Offset2 = -0.25 together with a parameterized non-linear segment. The non-linear segment is parameterized with two codewords, Smin1 and Smin2, that are associated with two determined color values and signaled to the decoder, while the S2_high codewords are clipped to the maxValue of the Range.

The following describes post-processing with a piecewise linear transfer function (e.g., using post-processing unit 138 or 150). Post-processing unit 138 or 150 may divide the codeword space of S into a finite number of segments r such that
Range = {r_i}, where i = 0 to N    (30)

For each of these segments r_i, post-processing unit 138 or 150 determines independent post-processing parameters Scale2_i and Offset2_i. The Scale2_i and Offset2_i parameters are signaled and applied to the codewords S at the encoder and decoder sides to produce the output S2.
Scales = {Scale2_i}, Offsets = {Offset2_i}, i = 0 to N
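
A per-segment application of scale and offset might be sketched as below. The sketch assumes that each segment uses the same form as Equation 12, S2 = Scale2_i * (S - Offset2_i), and it uses a hypothetical two-segment configuration chosen so that the mapping is continuous at the segment boundary; neither the form nor the parameter values are taken from the disclosure.

```python
from bisect import bisect_right


def piecewise_post_process(s, edges, scales, offsets):
    """Apply scale_i * (s - offset_i) for the segment r_i that contains s."""
    # edges has len(scales) + 1 entries; the last segment includes its
    # upper edge, and out-of-range inputs use the nearest segment.
    i = bisect_right(edges, s) - 1
    i = max(0, min(i, len(scales) - 1))
    return scales[i] * (s - offsets[i])


# Hypothetical configuration: identity below 0.5, slope 2 above it.
EDGES = [0.0, 0.5, 1.0]
SCALES = [1.0, 2.0]
OFFSETS = [0.0, 0.25]

for s in (0.25, 0.5, 0.75, 1.0):
    print(s, piecewise_post_process(s, EDGES, SCALES, OFFSETS))
```

Giving the densely populated segment a larger scale stretches it over more output codewords, and any values that then exceed the range would need clipping or tail handling.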

In terms of transfer functions, an example algorithm may be modeled as follows. FIGS. 26A-26C are diagrams illustrating post-processing with a piecewise linear transfer function. FIG. 26A shows a histogram of the S signal produced by the PQ TF, FIG. 26B shows the parametric piecewise linear TF of the post-processing, and FIG. 26C shows a histogram of the S2 signal produced by the post-processing TF.

In the histogram of the S signal in FIG. 26A, the segments introduced into the dynamic range are shown with a vertical grid. As can be seen, the signal occupies about 80% of the available codewords, and the remaining 20% can be used to reduce the quantization error, as proposed in Equations 12 and 13. At the same time, the histogram shows that a significant portion of the codewords lies in the fourth segment, which spans the codeword range from 0.6 to 0.8. The available codeword budget may be used to reduce the quantization error for this particular fourth segment while leaving the representation accuracy of the other segments unchanged.

The post-processing parameters for these segments r (N = 5) may be written as:
Scales2 = {1, 1, 1, 2, 1}, Offsets2 = {-0.1, -0.1, -0.1, -0.1, 0.1}
This results in the piecewise linear transfer function shown in FIG. 26B. The histogram of the S2 signal obtained by applying such a TF is shown in FIG. 26C. As can be seen, the histogram is shifted to the left to occupy more codewords, and the peak that was in the fourth segment is stretched over a larger dynamic range, which gives a more accurate representation and reduces the quantization error.

In general, the parameters of this scheme may include:
a. The number of partitions of the dynamic range
b. The range of each of the segments
c. The scale and offset for each of these segments

Application
In some examples, techniques may include pre-processing techniques, post-processing techniques, as well as non-linear processing techniques. In some examples, the technique may augment the linear codeword post-processing defined above with a parameter piecewise specified transfer function that is applied to samples within a specified range. In one example, the technique may reserve a finite number of codewords within a specified range for identification of samples that invoke this parameter transfer function.
b. In some examples, pre-processing and post-processing techniques are defined as the scale and offset applied to the input signal, ie, Scale1 and Offset1 and / or Scale2 and Offset2. In some examples, the technique is implemented in the form of a transfer function.
c. In some examples, the non-linear processing is a single parameter for defining color values that fall outside the specified Range: one parameter for samples below minValue and another for samples above maxValue. Parameters. In some examples, a separately defined transfer function may be applied.
d. In some examples, dynamic range adjustment (DRA) is derived independently for each color component (independent parameter set), eg, independently for R, G, B, or Y. In some examples, a single inter-component parameter set is applied for all color components.
e. In some examples, the parameters may be determined for the entire video sequence. In some examples, the parameters can be adapted in time or are spatio-temporal adaptive.
f. In some examples, the proposed process may be utilized in an HDR system with SDR compatibility. In such a system, with some transfer functions in use, the proposed technique would be applied to codewords representing HDR signals, for example, greater than 100 nits and / or less than 0.01 nits. Leave the remainder as unprocessed codewords, thus providing SDR compatibility.
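Item f can be illustrated with a short Python sketch. It assumes, purely for illustration, that the static transfer function is the SMPTE ST 2084 (PQ) curve with a 10,000-nit peak; this passage does not name the TF. The function names and the simple global `scale2`/`offset2` are hypothetical, and a practical scheme would presumably choose parameters that keep the mapping continuous at the range boundaries.

```python
# SMPTE ST 2084 (PQ) constants; PQ is used here only as an example static TF.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0


def pq_codeword(nits):
    """Normalized PQ codeword in [0, 1] for an absolute luminance in nits."""
    y = (nits / PEAK_NITS) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2


def needs_hdr_processing(codeword, low_nits=0.01, high_nits=100.0):
    """True if the codeword lies outside the assumed SDR luminance range."""
    return codeword < pq_codeword(low_nits) or codeword > pq_codeword(high_nits)


def apply_sdr_compatible(codeword, scale2, offset2):
    """Scale and offset HDR-only codewords; pass SDR-range codewords through."""
    if needs_hdr_processing(codeword):
        return scale2 * codeword + offset2
    return codeword
```

With PQ, 100 nits corresponds to a normalized codeword of about 0.508, so in this sketch roughly the upper half of the codeword range would be eligible for processing while the SDR portion is left untouched.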

Parameter derivation
a. In some examples, the pre-processing and post-processing parameters are derived independently. In some examples, they are derived jointly.
b. In some examples, the parameters are derived from a process that minimizes quantization error, or by minimizing a cost function formed as a weighted sum of the bit rate resulting from conversion and coding and the distortion introduced by these two lossy processes.
c. In some examples, the parameters may be determined separately for each component using information about that component, and/or may be derived using cross-component information.
d. In some examples, parameters such as the range of application may be determined from the type and characteristics of the transfer function (TF). For example, the processing may be applied only to codewords representing the HDR signal, e.g., above 100 nits, while codewords representing the SDR signal (<= 100 nits) remain unchanged.
e. In some examples, the range parameters may be provided to the decoder as side information and applied depending on the TF used in the processing.
f. In some examples, parameters such as the range of application may be determined from the type and characteristics of the TF. For example, for a TF characterized by a transition, the position of the transition (knee) may be used to determine the dynamic range to which the processing is applied.
g. The parameters may include the identification of a number of reserved codewords and their actual values.
h. The parameters may include the identification of the process associated with a reserved codeword.
i. The parameters may include a number of subranges within the codeword space to which the determined processing is applied.
j. The parameters may include codeword value subrange identifications for applying the determined processing.
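To make the quantization-error criterion in item b concrete, the following Python sketch compares the reconstruction error of a signal that occupies only part of the codeword range with and without a stretch before quantization. It is an illustration under assumptions: a uniform 10-bit quantizer, a linear mapping of the form S2 = Scale2 * S1 + Offset2, and evenly spaced test samples. The function names are hypothetical.

```python
def quantize(x, bits=10):
    """Uniformly quantize x in [0, 1] to the given bit depth and reconstruct."""
    levels = (1 << bits) - 1
    return round(x * levels) / levels


def mse_with_stretch(samples, scale2, offset2, bits=10):
    """Mean squared error after scale/offset, quantization, and inversion."""
    total = 0.0
    for s1 in samples:
        s2 = quantize(scale2 * s1 + offset2, bits)
        reconstructed = (s2 - offset2) / scale2
        total += (reconstructed - s1) ** 2
    return total / len(samples)


# Codewords occupying only [0.1, 0.9] of the available range (test data).
samples = [0.1 + 0.8 * i / 9999 for i in range(10000)]

mse_direct = mse_with_stretch(samples, 1.0, 0.0)
# Stretch [0.1, 0.9] onto [0, 1]: Scale2 = 1 / 0.8, Offset2 = -0.1 / 0.8.
mse_stretched = mse_with_stretch(samples, 1.25, -0.125)
```

For a uniform quantizer, the error variance is expected to fall by about a factor of Scale2 squared, so the stretched case should show the lower error. A derivation process could search over candidate parameters for the pair that minimizes such a measure.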

Parameter signaling
a. In some examples, the parameters are estimated at the encoder side (e.g., by video preprocessor 19) and signaled to the decoder (e.g., video postprocessor 31) through the bitstream (metadata, SEI message, VUI, etc.). The decoder receives the parameters from the bitstream.
b. In some examples, the parameters are derived at both the encoder side and the decoder side through a specified process, from the input signal or from other available parameters associated with the input signal and the processing flow.
c. In some examples, the parameters are explicitly signaled and are sufficient for performing DRA at the decoder side. In some examples, the parameters are derived from other input signal parameters, e.g., the parameters of the input color gamut and the target color container (color primaries).
d. In some examples, the parameters of the proposed system may be signaled as parameters of the transfer function (TF) used in the system, or provided to the decoder as side information for a specific transfer function.
e. The parameters of the proposed scheme may be signaled through the bitstream by means of SEI/VUI, provided to the decoder as side information, or derived by the decoder from other identifications, such as the input and output color spaces and the transfer function used.

FIG. 29 is a flowchart illustrating an example method of video processing in a content-adaptive high dynamic range (HDR) system. The example of FIG. 29 is described with respect to video postprocessor 31, which converts received video data into video data for display. In some examples, video postprocessor 31 may perform the example techniques once per color (once for the red component, once for the green component, and once for the blue component) so that the processing is independent.

Video postprocessor 31 receives a first plurality of codewords that represent compacted color values of video data, where the compacted color values represent colors in a first dynamic range (200). For example, to reduce bit rate, video preprocessor 19 may have compacted the dynamic range of the color values. The result of the compaction is the first plurality of codewords. Video postprocessor 31 may receive the first plurality of codewords directly from video decoder 30 or via video data memory 140. The compacted color values may be in a particular color space, such as RGB or YCrCb. The compacted color values may be inverse quantized by inverse quantization unit 122.

Inverse post-processing unit 144 or 152 may apply an inverse post-processing function that reduces the range of the codewords to generate a second plurality of codewords (202). In some examples, inverse post-processing unit 144 may apply the inverse post-processing operation to codewords that have been color converted (e.g., by inverse color conversion unit 124). In some examples, inverse post-processing unit 152 may apply inverse post-processing before color conversion, and inverse color conversion unit 124 may convert the output of inverse post-processing unit 152.

Inverse post-processing unit 144 or 152 may perform inverse scaling and inverse offsetting on the first plurality of codewords. Inverse post-processing unit 144 or 152 may receive the scaling and offset parameters (e.g., Scale2 and Offset2). In some examples, inverse post-processing unit 144 or 152 may receive information from which it determines the scaling and offset parameters. In general, inverse post-processing unit 144 or 152 may derive the scaling and offset parameters for inverse scaling and inverse offsetting of the codewords from the received bitstream or from separately signaled side information. When the scaling and offset parameters are received in the bitstream, inverse post-processing unit 144 or 152 receives them together with the picture data to be processed. When the scaling and offset parameters are received as separately signaled side information, inverse post-processing unit 144 or 152 receives them without any picture data to be processed (e.g., before any picture data).

In some examples, instead of applying the scaling and offset to all values of the first plurality of codewords, inverse post-processing unit 144 or 152 may apply piecewise scaling and offsetting. For example, inverse post-processing unit 144 or 152 may determine that a first set of codewords from the first plurality of codewords represents compacted color values that are less than a minimum threshold or greater than a maximum threshold, and that a second set of codewords from the first plurality of codewords represents compacted color values that are less than the maximum threshold and greater than the minimum threshold.

Inverse post-processing unit 144 or 152 may assign a first codeword (e.g., Smin') to codewords of the first set that are less than the minimum threshold, and a second codeword (e.g., Smax') to codewords of the first set that are greater than the maximum threshold. Inverse post-processing unit 144 or 152 may inverse scale and inverse offset the second set of codewords (e.g., based on Scale2 and Offset2). Although one minimum threshold and one maximum threshold are described, in some examples there may be multiple such thresholds, with a reserved codeword for each threshold. Inverse post-processing unit 144 or 152 may receive Smin', Smax', or the reserved codewords, may receive information on how to determine these codewords, or these codewords may be pre-stored in video data memory 140.
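The decoder-side behavior described in the two preceding paragraphs can be sketched in Python as follows. The sketch assumes the encoder applied S2 = Scale2 * S1 + Offset2, so that the inverse is S1 = (S2 - Offset2) / Scale2; this passage does not restate that formula. The function and parameter names are hypothetical, with `s_min` and `s_max` standing in for Smin' and Smax'.

```python
def inverse_post_process(codewords, scale2, offset2,
                         min_threshold, max_threshold, s_min, s_max):
    """Invert encoder-side post-processing on normalized codewords.

    Assumes the forward mapping S2 = scale2 * S1 + offset2.
    """
    restored = []
    for c in codewords:
        if c < min_threshold:
            restored.append(s_min)    # first set, below the minimum threshold
        elif c > max_threshold:
            restored.append(s_max)    # first set, above the maximum threshold
        else:
            restored.append((c - offset2) / scale2)   # second set
    return restored
```

Supporting several thresholds, each with its own reserved codeword, would amount to replacing the two comparisons with a lookup over a list of (threshold, reserved codeword) pairs.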

Inverse TF unit 126 receives the output of inverse post-processing unit 144, or of inverse color conversion unit 124 in examples that use inverse post-processing unit 152. However, in some examples, inverse post-processing unit 144 or 152 may not be enabled or may not be available. In such examples, inverse TF unit 126 receives the first plurality of codewords.

Inverse TF unit 126 may uncompact a second plurality of codewords, which is based on the first plurality of codewords, using an inverse static transfer function that is non-adaptive to the video data, to generate uncompacted color values (204). The uncompacted color values represent colors in a second dynamic range. The second plurality of codewords is one of: the first plurality of codewords after inverse post-processing (e.g., the output of inverse post-processing unit 144, or the output of inverse post-processing unit 152 after inverse color conversion), or the first plurality of codewords itself (e.g., where inverse post-processing unit 144 or 152 is disabled or not available).

In some examples, inverse TF unit 126 may output the uncompacted color values for display or further processing (208). However, in some examples, such as where video preprocessor 19 applied pre-processing before compacting, inverse TF unit 126 may output the uncompacted color values to inverse pre-processing unit 142. In this case, inverse pre-processing unit 142 may output the inverse pre-processed uncompacted color values for display or further processing, as described below with respect to optional operation 206. The dynamic range of the uncompacted color values may be greater than the dynamic range of the codewords that inverse TF unit 126 receives. In other words, inverse TF unit 126 increases the dynamic range of the codewords to generate uncompacted color values having a greater dynamic range.

In some examples, inverse pre-processing unit 142 may apply inverse pre-processing to the uncompacted color values (206). Inverse pre-processing unit 142 may receive the scaling and offset parameters (e.g., Scale1 and Offset1). In some examples, inverse pre-processing unit 142 may receive information (e.g., a histogram of the colors in a picture) from which it determines the scaling and offset parameters. Inverse pre-processing unit 142 outputs the resulting inverse pre-processed uncompacted color values for display or further processing (208).

In general, inverse pre-processing unit 142 may derive the scaling and offset parameters for inverse scaling and inverse offsetting from the received bitstream or from separately signaled side information. When the scaling and offset parameters are received in the bitstream, inverse pre-processing unit 142 receives them together with the picture data to be processed. When the scaling and offset parameters are received as separately signaled side information, inverse pre-processing unit 142 receives them without any picture data to be processed (e.g., before any picture data).

As described above, video postprocessor 31 may perform at least one of inverse post-processing the first plurality of codewords to generate the second plurality of codewords, or inverse pre-processing the uncompacted color values. However, in some examples, video postprocessor 31 may perform both: inverse post-processing the first plurality of codewords (e.g., via inverse post-processing unit 144 or 152) to generate the second plurality of codewords, and inverse pre-processing the uncompacted color values (e.g., via inverse pre-processing unit 142) to generate inverse pre-processed uncompacted color values.
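The ordering of operations 202 through 208 in FIG. 29, including the optional stages, can be summarized in a short Python sketch. Everything concrete here is an assumption for illustration: the power-law inverse TF is a toy stand-in for the inverse static transfer function, the linear stages assume forward mappings of the form scale * x + offset, and the name `post_process_video` is hypothetical.

```python
def post_process_video(codewords, inverse_tf, inverse_post=None, inverse_pre=None):
    """Decoder-side chain: optional inverse post-processing, inverse static TF,
    optional inverse pre-processing. Each stage is a callable on one value."""
    output = []
    for c in codewords:
        if inverse_post is not None:      # operation 202 (may be disabled)
            c = inverse_post(c)
        value = inverse_tf(c)             # operation 204 (always applied)
        if inverse_pre is not None:       # operation 206 (optional)
            value = inverse_pre(value)
        output.append(value)              # operation 208
    return output


# Toy stand-ins, all assumptions: a power-law inverse TF and linear stages.
GAMMA = 2.4
scale1, offset1 = 2.0, 0.0
scale2, offset2 = 1.25, -0.125

result = post_process_video(
    [0.0, 0.5, 1.0],
    inverse_tf=lambda c: c ** GAMMA,
    inverse_post=lambda c: (c - offset2) / scale2,
    inverse_pre=lambda v: (v - offset1) / scale1,
)
```

Passing `None` for either optional stage models the case in which the corresponding unit is disabled or not available, so the inverse TF then operates directly on the received codewords.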

FIG. 30 is a flowchart illustrating another example method of video processing in a content-adaptive high dynamic range (HDR) system. The example of FIG. 30 is described with respect to video preprocessor 19, which converts received video data into video data for transmission. In some examples, video preprocessor 19 may perform the example techniques once per color (once for the red component, once for the green component, and once for the blue component) so that the processing is independent.

Video preprocessor 19 receives a plurality of color values of video data representing colors in a first dynamic range (300). For example, video preprocessor 19 may receive video data stored in video data memory 132, where such data is received from video source 18. The color values may be in the RGB color space, but color conversion unit 114 may convert the colors to the YCrCb color space before processing.

Pre-processing unit 134 may pre-process the color values such that, when the pre-processed color values are processed by TF unit 112, the resulting codewords each represent approximately the same range of color values, or at least are not heavily weighted such that low-illumination colors are represented by a much larger range of codewords and high-illumination colors by a relatively small range of codewords (302). For example, pre-processing unit 134 may scale and offset the color values (e.g., via Scale1 and Offset1). Pre-processing unit 134 may determine the scaling and offset parameters based on a histogram of the color values in a picture (e.g., it may adaptively determine a scaling factor and an offset factor based on the input linear color values). Pre-processing unit 134 may output the values of Scale1 and Offset1, or may output information that can be used to determine the values of Scale1 and Offset1. For example, video preprocessor 19 may cause video encoder 20 to signal the scaling and offset parameters for scaling and offsetting the input linear color values, in the bitstream or as side information.
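One simple way to derive Scale1 and Offset1 from a histogram is sketched below in Python. The rule used here, stretching the occupied part of the histogram over the full [0, 1] range, is an assumption chosen for illustration; this passage does not specify the derivation rule. The function name, the bin count, and the `min_fraction` cutoff are hypothetical, and the input is assumed to be non-empty and normalized to [0, 1].

```python
def derive_scale1_offset1(values, num_bins=64, min_fraction=0.001):
    """Pick Scale1/Offset1 so the occupied part of the histogram spans [0, 1].

    values: linear color values normalized to [0, 1].
    Bins holding no more than min_fraction of the samples count as empty.
    """
    counts = [0] * num_bins
    for v in values:
        counts[min(int(v * num_bins), num_bins - 1)] += 1

    threshold = min_fraction * len(values)
    occupied = [i for i, n in enumerate(counts) if n > threshold]
    lo = occupied[0] / num_bins            # lower edge of first occupied bin
    hi = (occupied[-1] + 1) / num_bins     # upper edge of last occupied bin

    scale1 = 1.0 / (hi - lo)
    offset1 = -lo * scale1
    return scale1, offset1                 # s1 = scale1 * s + offset1
```

For input values confined to about [0.25, 0.75], this sketch yields Scale1 = 2 and Offset1 = -0.5, so the pre-processed values cover the full input range of the static transfer function.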

TF unit 112 may receive the output of pre-processing unit 134 in order to compact the color values. However, as illustrated, in some examples pre-processing unit 134 may not be enabled or may not be available. In such examples, TF unit 112 may receive the plurality of color values without pre-processing.

TF unit 112 may compact the color values using a static transfer function that is non-adaptive to the video data being compacted, to generate a plurality of codewords that represent compacted color values, where the compacted color values represent colors in a second dynamic range (304). TF unit 112 may reduce the dynamic range of the color values, which facilitates transmission of such color values.

In some examples, TF unit 112 may output color values based on the compacted color values represented by the codewords (308). However, in some examples, post-processing unit 138 or 150 may post-process the codewords resulting from the compaction of the color values to generate codewords that make better use of the codeword space (e.g., increase the use of the available codeword space) (306). Post-processing unit 150 may receive the output of color conversion unit 114 for post-processing, where color conversion unit 114 receives the output of TF unit 112. Post-processing unit 138 or 150 may scale with Scale2 and offset with Offset2. For example, video preprocessor 19 may cause video encoder 20 to signal the scaling and offset parameters for scaling and offsetting the codewords, in the bitstream or as side information.

In some examples, instead of applying the scaling and offset to all values of the plurality of codewords that TF unit 112 outputs, post-processing unit 138 or 150 may apply piecewise scaling and offsetting. For example, post-processing unit 138 or 150 may determine that a set of the scaled and offset codewords has values less than a minimum threshold or greater than a maximum threshold. Post-processing unit 138 or 150 may assign a first codeword (Smin) to the scaled and offset codewords having values less than the minimum threshold, and a second codeword (Smax) to the scaled and offset codewords having values greater than the maximum threshold. For the other codewords, post-processing unit 138 or 150 may scale and offset them (e.g., based on Scale2 and Offset2). Although one minimum threshold and one maximum threshold are described, in some examples there may be multiple such thresholds, with a reserved codeword for each threshold. Post-processing unit 138 or 150 may output information about Smin, Smax, or the reserved codewords, or may output information on how to determine these codewords.
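The encoder-side thresholding just described can be sketched in Python. The sketch assumes the linear form S2 = Scale2 * S1 + Offset2 and compares the scaled and offset result against the thresholds, which is one reading of the paragraph above. The function and parameter names are hypothetical, with `s_min` and `s_max` standing in for Smin and Smax.

```python
def post_process(codewords, scale2, offset2,
                 min_threshold, max_threshold, s_min, s_max):
    """Scale and offset TF output codewords, replacing out-of-range results
    with the reserved codewords s_min (Smin) and s_max (Smax)."""
    processed = []
    for s1 in codewords:
        s2 = scale2 * s1 + offset2
        if s2 < min_threshold:
            processed.append(s_min)
        elif s2 > max_threshold:
            processed.append(s_max)
        else:
            processed.append(s2)
    return processed
```

For example, with Scale2 = 1.25, Offset2 = -0.125, and thresholds of 0 and 1, the input codewords 0.05, 0.5, and 0.95 become Smin, 0.5, and Smax respectively.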

Video preprocessor 19 may output color values based on one of the compacted color values or the post-processed compacted color values (308). For example, if post-processing unit 138 or 150 is enabled, video preprocessor 19 may output color values based on the post-processed compacted color values. If post-processing unit 138 or 150 is not enabled or not available, video preprocessor 19 may output color values based on the compacted color values as compacted by TF unit 112, without post-processing. However, color conversion unit 114 and quantization unit 116 may perform their respective processing before the output.

It is to be recognized that, depending on the example, certain acts or events of any of the techniques described herein may be performed in a different sequence, and may be added, merged, or left out altogether (e.g., not all described acts or events are necessary for practicing the techniques). Moreover, in some examples, acts or events may be performed concurrently rather than sequentially, e.g., through multi-threaded processing, interrupt processing, or multiple processors.

In one or more examples, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code, and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, including any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol. In this manner, computer-readable media generally may correspond to (1) tangible computer-readable storage media that are non-transitory, or (2) a communication medium such as a signal or carrier wave. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementation of the techniques described in this disclosure. A computer program product may include a computer-readable medium.

By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. Also, any connection is properly termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory, tangible storage media. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.

Instructions may be executed by one or more processors, such as one or more digital signal processors (DSPs), general-purpose microprocessors, application-specific integrated circuits (ASICs), field-programmable logic arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor," as used herein, may refer to any of the foregoing structures or to any other structure suitable for implementation of the techniques described herein. In addition, in some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.

The techniques of this disclosure may be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset). Various components, modules, or units are described in this disclosure to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units. Rather, as described above, various units may be combined in a codec hardware unit, or provided by a collection of interoperable hardware units, including one or more processors as described above, in conjunction with suitable software and/or firmware.

Various examples have been described. These and other examples are within the scope of the following claims.

10 Video encoding and decoding system
12 Source device
14 Destination device
16 Computer-readable medium
18 Video source
19 Video preprocessor
20 Video encoder
21 Video encoding unit
22 Output interface
28 Input interface
29 Video decoding unit
30 Video decoder
31 Video post processor
32 Display device
110 Linear RGB data
112 Transfer function unit
112' Adaptive shape TF unit
114 Color conversion unit
116 Quantization unit
118 HDR data
120 HDR data
122 Inverse quantization unit
124 Inverse color conversion unit
126 Inverse transfer function unit
128 Linear RGB data
132 Video data memory
134 Pre-processing unit
138 Post-processing unit
140 Video data memory
142 Inverse pre-processing unit
144 Inverse post-processing unit
150 Post-processing unit
152 Inverse post-processing unit

Claims

A method of video processing,
Receiving a first plurality of codewords representing shortened color values of video data, wherein the shortened color values represent colors in a first dynamic range;
Non-shortening a second plurality of codewords based on the first plurality of codewords using an inverse static transfer function that is not adaptive to the video data to generate non-shortened color values, wherein the non-shortened color values represent colors in a second, different dynamic range, and the second plurality of codewords is one of the first plurality of codewords after reverse post-processing or the first plurality of codewords;
At least one of reverse post-processing the first plurality of codewords to generate the second plurality of codewords, or reverse pre-processing the non-shortened color values; and
Outputting the non-shortened color values or the reverse pre-processed non-shortened color values.

Wherein the at least one of reverse post-processing the first plurality of codewords to generate the second plurality of codewords, or reverse pre-processing the non-shortened color values, comprises:
Reverse post-processing the first plurality of codewords to generate the second plurality of codewords; and
Reverse pre-processing the non-shortened color values to generate the reverse pre-processed non-shortened color values,
The method of claim 1.

The method of claim 1, wherein the reverse post-processing comprises inverse scaling and inverse offsetting the first plurality of codewords, and the reverse pre-processing comprises inverse scaling and inverse offsetting the non-shortened second plurality of codewords obtained from the non-shortening.

The method of claim 3, further comprising: deriving, from a received bitstream or from separately signaled side information, scaling and offset parameters for one or both of the inverse scaling and inverse offsetting of the first plurality of codewords and the inverse scaling and inverse offsetting of the non-shortened second plurality of codewords obtained from the non-shortening.
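The inverse scaling and inverse offsetting recited above can be illustrated with a minimal sketch. It assumes the encoder-side operation is a simple linear mapping (scale, then offset) and that the decoder receives the same two parameters; the function names and the parameter dictionary are hypothetical and do not come from the claims.

```python
def post_process(codeword, scale, offset):
    """Encoder side (assumed form): scale the codeword, then add the offset."""
    return scale * codeword + offset


def inverse_post_process(codeword, scale, offset):
    """Decoder side: remove the offset, then undo the scaling."""
    return (codeword - offset) / scale


# Hypothetical parameters, standing in for values parsed from the
# bitstream or from separately signaled side information.
params = {"scale": 1.25, "offset": -0.1}

original = 0.5
sent = post_process(original, **params)
recovered = inverse_post_process(sent, **params)
```

Because the same parameters are used on both sides, `recovered` equals `original` up to floating-point rounding.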

Determining that a first set of codewords from the first plurality of codewords represents shortened color values having values less than a minimum threshold or greater than a maximum threshold; and
Determining that a second set of codewords from the first plurality of codewords represents shortened color values having values greater than or equal to the minimum threshold and less than or equal to the maximum threshold,
Wherein reverse post-processing the first plurality of codewords comprises:
Assigning a first codeword to codewords of the first set of codewords that are less than the minimum threshold;
Assigning a second codeword to codewords of the first set of codewords that are greater than the maximum threshold; and
Inverse scaling and inverse offsetting the second set of codewords,
The method of claim 1.
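As an illustration of the threshold handling in the preceding claim, the following sketch assigns fixed codewords outside the thresholds and inverse scales and offsets everything in between. The linear form of the inverse mapping, the function name, and all numeric values are assumptions chosen for the example.

```python
def inverse_post_process_clipped(codewords, scale, offset,
                                 min_threshold, max_threshold,
                                 first_codeword, second_codeword):
    """Reverse post-process codewords with out-of-range handling."""
    out = []
    for cw in codewords:
        if cw < min_threshold:
            # First set, low side: assign the first codeword.
            out.append(first_codeword)
        elif cw > max_threshold:
            # First set, high side: assign the second codeword.
            out.append(second_codeword)
        else:
            # Second set: inverse offset, then inverse scale.
            out.append((cw - offset) / scale)
    return out


# Hypothetical values for illustration only.
restored = inverse_post_process_clipped(
    [0.0, 0.25, 0.5, 1.0],
    scale=2.0, offset=-0.5,
    min_threshold=0.1, max_threshold=0.9,
    first_codeword=0.0, second_codeword=1.0,
)
```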

The method of claim 5, further comprising: determining one or more of the first codeword, the second codeword, the maximum threshold, or the minimum threshold based on information, received from a bitstream or from separately signaled side information, indicating the first codeword, the second codeword, the maximum threshold, or the minimum threshold; or applying a process for determining the first codeword, the second codeword, the maximum threshold, or the minimum threshold.

Determining that a first set of codewords from the first plurality of codewords represents shortened color values having values belonging to a first segment of the first dynamic range; and
Determining that a second set of codewords from the first plurality of codewords represents shortened color values having values belonging to a second segment of the first dynamic range,
Wherein reverse post-processing the first plurality of codewords comprises:
Scaling and offsetting the first set of codewords using a first set of scaling and offset parameters; and
Scaling and offsetting the second set of codewords using a second set of scaling and offset parameters,
The method of claim 1.
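The per-segment scaling and offsetting in the preceding claim can be sketched as a piecewise-linear mapping. This is a minimal example under assumed conventions: segments are defined by a sorted list of boundary values, each segment has its own (scale, offset) pair, and the names `segment_index` and `piecewise_scale_offset` are hypothetical.

```python
def segment_index(value, boundaries):
    """Return the index of the segment of the dynamic range that contains value."""
    index = 0
    for boundary in boundaries:
        if value >= boundary:
            index += 1
    return index


def piecewise_scale_offset(values, boundaries, params):
    """Scale and offset each value with the parameter set of its segment."""
    out = []
    for v in values:
        scale, offset = params[segment_index(v, boundaries)]
        out.append(scale * v + offset)
    return out


# Two segments split at 0.5; the parameter sets are chosen so that the
# mapping is continuous at the boundary (2.0 * 0.5 == 0.5 * 0.5 + 0.75).
boundaries = [0.5]
params = [(2.0, 0.0), (0.5, 0.75)]
result = piecewise_scale_offset([0.25, 0.5, 1.0], boundaries, params)
```

Choosing the parameter sets so that adjacent segments meet at the boundary keeps the overall mapping monotonic, which is what makes it invertible on the other side.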

Determining that the first set of non-shortened color values represents a non-shortened color value having a value belonging to a first segment of the second dynamic range;
Determining that the second set of non-shortened color values represents a non-shortened color value having a value belonging to a second segment of the second dynamic range; and
Reverse pre-processing the non-shortened color values comprises:
Scaling and offsetting the first set of non-shortened color values using a first set of scaling and offset parameters;
Scaling and offsetting the second set of non-shortened color values using a second set of scaling and offset parameters;
The method of claim 1.

A method of video processing,
Receiving a plurality of color values of video data representing color in a first dynamic range;
Shortening the color values using a static transfer function that is not adaptive to the video data being shortened to generate a plurality of codewords representing shortened color values, wherein the shortened color values represent colors in a second, different dynamic range;
At least one of pre-processing the color values before the shortening to generate the color values to be shortened, or post-processing the codewords obtained from shortening the color values; and
Outputting a color value based on one of the shortened color value or the post-processed shortened color value.
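The encoder-side order of operations in the preceding claim (pre-process, apply a static transfer function, post-process) can be sketched as follows. The claim does not name a particular transfer function; the SMPTE ST 2084 (PQ) curve is used here only as an assumed example of a static one. The linear form of the pre- and post-processing, the clamping step, and the function names are likewise assumptions.

```python
# SMPTE ST 2084 (PQ) constants. PQ is an assumed example of a static
# transfer function; the claim itself does not specify one.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_inverse_eotf(linear):
    """Map normalized linear light in [0, 1] to a PQ value in [0, 1]."""
    y = max(linear, 0.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def encode_sample(linear, pre=(1.0, 0.0), post=(1.0, 0.0)):
    """Pre-process, shorten with the static transfer function, post-process."""
    pre_scale, pre_offset = pre
    post_scale, post_offset = post
    x = pre_scale * linear + pre_offset   # pre-processing (assumed linear)
    x = min(max(x, 0.0), 1.0)             # keep the input in the function's domain
    codeword = pq_inverse_eotf(x)         # static transfer function, unchanged
    return post_scale * codeword + post_offset  # post-processing (assumed linear)


# Content whose peak is one tenth of the container peak: a pre-scale of 10
# moves that peak to the top of the transfer function's input range.
plain = encode_sample(0.1)
stretched = encode_sample(0.1, pre=(10.0, 0.0))
```

The transfer function itself never changes; only the values fed into it and the codewords coming out of it are adjusted, so `stretched` lands higher in the codeword range than `plain`.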

Wherein the at least one of pre-processing the color values before the shortening to generate the color values to be shortened, or post-processing the codewords obtained from shortening the color values, comprises:
Pre-processing the color values before the shortening to generate the color values to be shortened; and
Post-processing the codewords obtained from shortening the color values,
The method of claim 9.

The step of preprocessing comprises scaling and offsetting input linear color values, and the step of postprocessing comprises scaling and offsetting the codeword obtained from the step of shortening. The method described in 1.

The method of claim 11, further comprising: signaling, in a bitstream or as side information, scaling and offset parameters for one or both of the scaling and offsetting of the input linear color values and the scaling and offsetting of the codewords obtained from the shortening.

The method of claim 11, wherein scaling and offsetting the input linear color values comprises adaptively determining a scaling factor and an offset factor based on the input linear color values, and scaling and offsetting the codewords comprises adaptively determining a scaling factor and an offset factor based on the codewords to enhance usage of the available codeword space.
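One possible way to determine the scaling and offset factors adaptively is to stretch the observed codeword range onto the full available range. The claim does not prescribe this rule; the min/max stretch below, the function name, and the target range are assumptions for illustration.

```python
def derive_scale_offset(codewords, target_min=0.0, target_max=1.0):
    """Derive factors that map [min, max] of the codewords onto the target range."""
    lo, hi = min(codewords), max(codewords)
    if hi == lo:
        # Flat content: no stretch is possible, so only shift to target_min.
        return 1.0, target_min - lo
    scale = (target_max - target_min) / (hi - lo)
    offset = target_min - scale * lo
    return scale, offset


# Codewords that occupy only the middle half of the available space.
codewords = [0.25, 0.5, 0.75]
scale, offset = derive_scale_offset(codewords)
stretched = [scale * cw + offset for cw in codewords]
```

Spreading the codewords over the whole range before quantization is what improves the use of the codeword space; the derived `scale` and `offset` would then need to reach the decoder so the mapping can be inverted.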

Post-processing the codeword comprises:
Scaling and offsetting the codeword obtained from the step of shortening the color value;
Determining that the scaled and offset set of codewords has a value less than a minimum threshold or greater than a maximum threshold;
Assigning a first codeword to the set of scaled and offset codewords having a value less than the minimum threshold;
Assigning a second codeword to the set of scaled and offset codewords having a value greater than the maximum threshold.
The method of claim 9.

The method of claim 14, further comprising: signaling, in a bitstream or as side information, information indicating one or more of the first codeword, the second codeword, the maximum threshold, or the minimum threshold.

Determining that the first set of color values has a value belonging to a first segment of the first dynamic range;
Determining that the second set of color values has a value belonging to a second segment of the first dynamic range; and
Pre-processing the color value comprises:
Scaling and offsetting the first set of color values using a first set of scaling and offset parameters;
Scaling and offsetting the second set of color values using a second set of scaling and offset parameters;
The method of claim 9.

Determining that a first set of codewords from the plurality of codewords has values belonging to a first segment of the second dynamic range; and
Determining that a second set of codewords from the plurality of codewords has values belonging to a second segment of the second dynamic range,
Wherein post-processing the plurality of codewords comprises:
Scaling and offsetting said first set of codewords using a first set of scaling and offset parameters;
Scaling and offsetting the second set of codewords using a second set of scaling and offset parameters;
The method of claim 9.

A device for video processing,
A video data memory configured to store video data;
A video post processor comprising at least one of fixed-function or programmable circuitry, the video post processor being configured to perform:
Receiving, from the video data memory, a first plurality of codewords representing shortened color values of video data, wherein the shortened color values represent colors in a first dynamic range;
Non-shortening a second plurality of codewords based on the first plurality of codewords using an inverse static transfer function that is not adaptive to the video data to generate non-shortened color values, wherein the non-shortened color values represent colors in a second, different dynamic range, and the second plurality of codewords is one of the first plurality of codewords after reverse post-processing or the first plurality of codewords;
At least one of reverse post-processing the first plurality of codewords to generate the second plurality of codewords, or reverse pre-processing the non-shortened color values; and
Outputting the non-shortened color values or the reverse pre-processed non-shortened color values,
device.

The video post processor is
Reverse post-processing the first plurality of codewords to generate the second plurality of codewords;
Configured to reverse pre-process the non-shortened color values to produce the reverse-preprocessed non-shortened color values;
The device of claim 18.

The device of claim 18, wherein, for reverse post-processing, the video post processor is configured to inverse scale and inverse offset the first plurality of codewords, and, for reverse pre-processing, the video post processor is configured to inverse scale and inverse offset the non-shortened second plurality of codewords obtained from the non-shortening.

The device of claim 20, wherein the video post processor is configured to derive, from a received bitstream or from separately signaled side information, scaling and offset parameters for one or both of the inverse scaling and inverse offsetting of the first plurality of codewords and the inverse scaling and inverse offsetting of the non-shortened second plurality of codewords obtained from the non-shortening.

The video post processor is
Determining that a first set of codewords from the first plurality of codewords represents a shortened color value having a value less than a minimum threshold or greater than a maximum threshold;
Configured to determine that a second set of codewords from the first plurality of codewords represents a shortened color value having a value greater than or equal to the minimum threshold and less than or equal to the maximum threshold;
In order to reverse post-process the first plurality of codewords, the video post processor includes:
Assigning a first codeword to a codeword smaller than the minimum threshold of the first set of codewords;
Assigning a second codeword to a codeword greater than the maximum threshold of the first set of codewords;
Configured to inversely scale and inverse offset the second set of codewords;
The device of claim 18.

The device of claim 22, wherein the video post processor is configured to: determine one or more of the first codeword, the second codeword, the maximum threshold, or the minimum threshold based on information, received from a bitstream or from separately signaled side information, indicating the first codeword, the second codeword, the maximum threshold, or the minimum threshold; or apply a process for determining the first codeword, the second codeword, the maximum threshold, or the minimum threshold.

The video post processor is
Determining that a first set of codewords from the first plurality of codewords represents a shortened color value having a value belonging to a first segment of the first dynamic range;
Configured to determine that a second set of codewords from the first plurality of codewords represents a shortened color value having a value belonging to a second segment of the first dynamic range;
In order to reverse post-process the first plurality of codewords, the video post processor includes:
Using the first set of scaling and offset parameters to scale and offset the first set of codewords;
Configured to scale and offset the second set of codewords using a second set of scaling and offset parameters;
The device of claim 18.

The video post processor is
Determining that the first set of non-shortened color values represents a non-shortened color value having a value belonging to a first segment of the second dynamic range;
Configured to determine that the second set of non-shortened color values represents a non-shortened color value having a value belonging to a second segment of the second dynamic range;
To reverse preprocess the non-shortened color values, the video post processor
Scaling and offset the first set of non-shortened color values using a first set of scaling and offset parameters;
Configured to scale and offset the second set of non-shortened color values using a second set of scaling and offset parameters;
The device of claim 18.

A device for video processing,
A video data memory configured to store video data;
A video preprocessor comprising at least one of fixed-function or programmable circuitry, the video preprocessor being configured to perform:
Receiving, from the video data memory, a plurality of color values of video data representing colors in a first dynamic range;
Shortening the color values using a static transfer function that is not adaptive to the video data being shortened to generate a plurality of codewords representing shortened color values, wherein the shortened color values represent colors in a second, different dynamic range;
At least one of pre-processing the color values before the shortening to generate the color values to be shortened, or post-processing the codewords resulting from the shortening of the color values; and
Outputting color values based on one of the shortened color values or the post-processed shortened color values,
device.

The video preprocessor is
Pre-processing the color values before shortening to produce the color values to be shortened;
Configured to post-process codewords resulting from the shortening of the color values;
The device of claim 26.

The device of claim 26, wherein, for pre-processing, the video preprocessor is configured to scale and offset input linear color values, and, for post-processing, the video preprocessor is configured to scale and offset the codewords resulting from the shortening.

The device of claim 28, further comprising a video encoder, wherein the video preprocessor is configured to cause the video encoder to signal, in a bitstream or as side information, scaling and offset parameters for one or both of the scaling and offsetting of the input linear color values and the scaling and offsetting of the codewords resulting from the shortening.

The device of claim 28, wherein, to scale and offset the input linear color values, the video preprocessor is configured to adaptively determine a scaling factor and an offset factor based on the input linear color values, and, to scale and offset the codewords, the video preprocessor is configured to adaptively determine a scaling factor and an offset factor based on the codewords to enhance use of the available codeword space.

To post-process the codeword, the video preprocessor
Scaling and offsetting the codeword resulting from the shortening of the color value;
Determining that the set of scaled and offset codewords has a value less than a minimum threshold or greater than a maximum threshold;
Assigning a first codeword to the set of scaled and offset codewords having a value less than the minimum threshold;
Configured to assign a second codeword to the set of scaled and offset codewords having a value greater than the maximum threshold;
The device of claim 26.

The device of claim 31, wherein the video preprocessor is further configured to cause the video encoder to signal, in a bitstream or as side information, information indicating one or more of the first codeword, the second codeword, the maximum threshold, or the minimum threshold.

The video preprocessor is
Determining that the first set of color values has a value belonging to a first segment of the first dynamic range;
Configured to determine that the second set of color values has a value belonging to a second segment of the first dynamic range;
In order to preprocess the color values, the video preprocessor
Scaling and offsetting the first set of color values using a first set of scaling and offset parameters;
Configured to scale and offset the second set of color values using a second set of scaling and offset parameters;
The device of claim 26.

The video preprocessor is
Determining that a first set of codewords from the plurality of codewords has a value belonging to a first segment of the second dynamic range;
Configured to determine that a second set of codewords from the plurality of codewords has a value belonging to a second segment of the second dynamic range;
In order to post-process the plurality of codewords, the video preprocessor
Using the first set of scaling and offset parameters to scale and offset the first set of codewords;
Configured to scale and offset the second set of codewords using a second set of scaling and offset parameters;
The device of claim 26.

A computer-readable storage medium storing instructions that, when executed, cause one or more processors of a device for video processing to perform:
Receiving a first plurality of codewords representing shortened color values of video data, wherein the shortened color values represent colors in a first dynamic range;
Non-shortening a second plurality of codewords based on the first plurality of codewords using an inverse static transfer function that is not adaptive to the video data to generate non-shortened color values, wherein the non-shortened color values represent colors in a second, different dynamic range, and the second plurality of codewords is one of the first plurality of codewords after reverse post-processing or the first plurality of codewords;
At least one of reverse post-processing the first plurality of codewords to generate the second plurality of codewords, or reverse pre-processing the non-shortened color values; and
Outputting the non-shortened color values or the reverse pre-processed non-shortened color values.

The computer-readable storage medium of claim 35, wherein the instructions that cause the one or more processors to perform at least one of reverse post-processing the first plurality of codewords to generate the second plurality of codewords, or reverse pre-processing the non-shortened color values, comprise instructions that cause the one or more processors to perform:
Reverse post-processing the first plurality of codewords to generate the second plurality of codewords; and
Reverse pre-processing the non-shortened color values to generate the reverse pre-processed non-shortened color values.

A device for video processing,
Means for receiving a first plurality of codewords representing shortened color values of video data, wherein the shortened color values represent colors in a first dynamic range;
Means for non-shortening a second plurality of codewords based on the first plurality of codewords using an inverse static transfer function that is not adaptive to the video data to generate non-shortened color values, wherein the non-shortened color values represent colors in a second, different dynamic range, and the second plurality of codewords is one of the first plurality of codewords after reverse post-processing or the first plurality of codewords;
At least one of means for reverse post-processing the first plurality of codewords to generate the second plurality of codewords, or means for reverse pre-processing the non-shortened color values; and
Means for outputting the non-shortened color values or the reverse pre-processed non-shortened color values.

The device of claim 37, further comprising: the means for reverse post-processing the first plurality of codewords to generate the second plurality of codewords; and the means for reverse pre-processing the non-shortened color values to generate the reverse pre-processed non-shortened color values.