JP6395750B2

JP6395750B2 - Signal reconstruction for high dynamic range signals

Info

Publication number: JP6395750B2
Application number: JP2016054345A
Authority: JP
Inventors: Atkins, Robin; Yin, Peng; Lu, Taoran; Pytlarz, Jaclyn Anne
Original assignee: Dolby Laboratories Licensing Corporation
Priority date: 2016-03-17
Filing date: 2016-03-17
Publication date: 2018-09-26
Anticipated expiration: 2036-03-17
Also published as: JP2017169134A

Description

The present invention relates generally to images. More specifically, an embodiment of the present invention relates to signal reconstruction for images with high dynamic range, in order to improve backward compatibility.

As used herein, the term “dynamic range” (DR) may relate to the capability of the human visual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, e.g., from the darkest blacks to the brightest whites. In this sense, DR relates to a scene-referred intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a display-referred intensity. Unless a particular sense is explicitly specified at some point in this description, the term may be used in either sense, e.g., interchangeably.

As used herein, the term high dynamic range (HDR) relates to a DR breadth that spans some 14 to 15 orders of magnitude of the human visual system (HVS). In practice, the DR over which a human may simultaneously perceive an extensive breadth of intensity may be somewhat truncated in relation to HDR. As used herein, the terms enhanced dynamic range (EDR) and visual dynamic range (VDR) may individually or interchangeably relate to the DR that is perceivable within a scene or image by a human visual system (HVS) that includes eye movements, allowing for some light adaptation changes across the scene or image. As used herein, EDR may relate to a DR that spans 5 to 6 orders of magnitude. Thus, while somewhat narrower than true scene-referred HDR, EDR nonetheless represents a wide DR breadth and may also be referred to as HDR.

In practice, images comprise one or more color components (e.g., luma Y and chroma Cb and Cr), wherein each color component is represented with a precision of n bits per pixel (e.g., n = 8). Using linear luminance coding, images where n ≤ 8 (e.g., color 24-bit JPEG images) are considered images of standard dynamic range, while images where n > 8 may be considered images of enhanced dynamic range. EDR and HDR images may also be stored and distributed using high-precision (e.g., 16-bit) floating-point formats, such as the OpenEXR file format developed by Industrial Light and Magic.

Given a video stream, information about its coding parameters is typically embedded in the bitstream as metadata. As used herein, the term “metadata” relates to any auxiliary information that is transmitted as part of the coded bitstream and assists a decoder in rendering a decoded image. Such metadata may include, but is not limited to, color space or gamut information, reference display parameters, and auxiliary signal parameters, as described herein.

Most consumer desktop displays currently support a luminance of 200 to 300 cd/m² (nits). Most consumer HDTVs range from 300 to 500 nits, with newer models reaching 1,000 nits. Such conventional displays thus typify a lower dynamic range (LDR), also referred to as standard dynamic range (SDR), in relation to HDR or EDR. As the availability of HDR content grows due to advances in both capture equipment (e.g., cameras) and HDR displays (e.g., the PRM-4200 professional reference monitor from Dolby Laboratories), HDR content may be color graded and displayed on HDR displays that support higher dynamic ranges (e.g., from 1,000 nits to 5,000 nits or more). In general, without limitation, the methods of the present disclosure relate to any dynamic range higher than SDR. As appreciated by the inventors, improved techniques for the coding of high dynamic range images are desired.

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, issues identified with respect to one or more approaches should not be assumed, on the basis of this section, to have been recognized in any prior art.

An embodiment of the present invention is illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which like reference numerals refer to similar elements.

FIG. 1 shows an example process for a video delivery pipeline.
FIG. 2 shows an example process for color conversion to the IPT-PQ color space.
FIG. 3 shows an example process for signal reconstruction and encoding.
FIG. 4 shows an example tone-mapping curve for luminance reconstruction between ST 2084 IPT and BT.1886 IPT according to an embodiment of the present invention.
FIG. 5 shows an example system for backward-compatible encoding and decoding using color space reconstruction according to an embodiment of the present invention.
FIG. 6 shows an example process flow for generating a color rotation and scaling matrix according to an embodiment of the present invention.
FIG. 7A and FIG. 7B show hue and saturation reconstruction functions according to an embodiment of the present invention.
FIG. 8 shows an example of hue and saturation reconstruction between the IPT-PQ and YCbCr-gamma color spaces according to an embodiment of the present invention.
FIG. 9 shows an example of an EETF function according to an embodiment of the present invention.

Signal reconstruction and encoding of high dynamic range (HDR) images is described herein. In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in exhaustive detail, in order to avoid unnecessarily occluding, obscuring, or obfuscating the present invention.

Overview
The example embodiments described herein relate to the reconstruction and encoding of high dynamic range images. In a method to improve backward-compatible decoding, in an encoder, a processor:
accesses an image database;
computes first hue values of the images in the database in a first color space;
computes second hue values of the images in the database in a second color space;
computes a hue rotation angle based on minimizing a hue cost function, wherein the hue cost function is based on a difference measure between the first hue values and rotated second hue values; and
generates a color rotation matrix based on the hue rotation angle.

In one embodiment, the first color space is a gamma-based YCbCr color space and the second color space is a PQ-based IPT color space.

In one embodiment, the color rotation matrix is used to generate a reconstructed color space based on a preferred color space. Images are encoded using the reconstructed color space, and information about the color rotation matrix is signaled from the encoder to a decoder.

In one embodiment, in a method for rebuilding, in a decoder, an input image encoded in a reconstructed color space, the decoder:
receives an encoded input image in the reconstructed color space, wherein the reconstructed color space is generated by rotating the chroma components of a preferred color space to approximate one or more parameters of a legacy color space;
accesses metadata transmitted from an encoder to the decoder, wherein the metadata is associated with the encoded input image and comprises:
a flag indicating whether a color rotation and scaling matrix is present; and
a plurality of coefficients for the color rotation and scaling matrix, when the flag indicates that the matrix is present;
decodes the encoded input image to generate a decoded image in the reconstructed color space; and
generates a decoded image in the preferred color space based on the decoded image in the reconstructed color space and the color rotation and scaling matrix.
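The decoder-side use of the flag and matrix coefficients can be illustrated with a minimal sketch. It assumes, purely for illustration, that the metadata carries the nine coefficients of the forward (encoder-side) 3 × 3 matrix in row-major order and that the decoder inverts it; the field names `matrix_present_flag` and `matrix_coeffs` are hypothetical and are not taken from the patent or from any bitstream syntax.

```python
import math

def invert_3x3(m):
    """Invert a 3x3 matrix (list of rows) using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]

def mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-element vector."""
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

def backward_reshape(pixels, metadata):
    """Map decoded pixels from the reconstructed space back to the preferred space.

    If the (hypothetical) flag is not set, the pixels are returned unchanged.
    """
    if not metadata.get("matrix_present_flag"):
        return [list(p) for p in pixels]
    coeffs = metadata["matrix_coeffs"]  # assumed: forward matrix, row-major
    forward = [coeffs[0:3], coeffs[3:6], coeffs[6:9]]
    inverse = invert_3x3(forward)
    return [mat_vec(inverse, p) for p in pixels]

# Illustrative values only: rotate the two chroma components by 0.3 rad
# and scale them by 1.1, leaving the first component untouched.
angle, gain = 0.3, 1.1
coeffs = [1.0, 0.0, 0.0,
          0.0, gain * math.cos(angle), gain * math.sin(angle),
          0.0, -gain * math.sin(angle), gain * math.cos(angle)]
forward = [coeffs[0:3], coeffs[3:6], coeffs[6:9]]
coded = mat_vec(forward, [0.5, 0.1, -0.2])
restored = backward_reshape(
    [coded], {"matrix_present_flag": True, "matrix_coeffs": coeffs})[0]
```

A real system could equally signal the inverse matrix directly, which would spare the decoder the inversion; the sketch only shows that a flag plus nine coefficients is enough to undo the rotation and scaling.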

In another embodiment, in an encoder, a processor:
receives an input image in a preferred color space;
accesses a hue rotation function, wherein, for the hue value of a pixel of the input image in the preferred color space, the hue rotation function generates a rotated hue output value that matches, according to a hue cost criterion, a hue value in a legacy color space;
generates a reconstructed image based on the input image and the hue rotation function; and
encodes the reconstructed image to generate an encoded reconstructed image.

In another embodiment, in a decoder, a processor:
accesses an input image encoded in a reconstructed color space;
accesses metadata associated with the input image, wherein the metadata comprises data associated with a hue rotation function used to convert the input image from a preferred color space to the reconstructed color space, and wherein, for the hue value of a pixel of an image in the preferred color space, the hue rotation function generates a rotated hue output value that matches, according to a hue cost criterion, a hue value in a legacy color space; and
generates an output image in the preferred color space based on the input image and the data associated with the hue rotation function.
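A hue rotation function of this kind can be sketched as a monotonic piecewise-linear mapping from input hue to output hue. The sketch below rests on several assumptions that are not stated in the text: hue is taken as the angle `atan2(T, P)` of the chroma pair, saturation is left unchanged, and the function is carried as two hypothetical lists of knots (`knots_in`, `knots_out`) that a decoder can swap to undo the mapping.

```python
import math

def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for k in range(1, len(xs)):
        if x <= xs[k]:
            frac = (x - xs[k - 1]) / (xs[k] - xs[k - 1])
            return ys[k - 1] + frac * (ys[k] - ys[k - 1])

def rotate_hue(p, t, hue_in, hue_out):
    """Remap the hue of one chroma pair (P, T) while keeping its saturation."""
    hue = math.atan2(t, p)          # assumed hue definition
    sat = math.hypot(p, t)          # assumed saturation definition
    new_hue = interp(hue, hue_in, hue_out)
    return sat * math.cos(new_hue), sat * math.sin(new_hue)

# Hypothetical knots: a different hue shift in each quadrant, with the
# end points pinned so that the mapping stays strictly increasing.
knots_in = [-math.pi, -math.pi / 2, 0.0, math.pi / 2, math.pi]
knots_out = [-math.pi, -math.pi / 2 + 0.1, 0.2, math.pi / 2 + 0.1, math.pi]

# Encoder side: preferred space -> reconstructed space.
p_coded, t_coded = rotate_hue(0.1, 0.05, knots_in, knots_out)
# Decoder side: swap the knot lists to invert the mapping.
p_back, t_back = rotate_hue(p_coded, t_coded, knots_out, knots_in)
```

Because the knots are strictly increasing in both lists, the same interpolation routine serves as forward and inverse mapping, which is one reason a monotonic representation is convenient for signaling in metadata.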

Example Video Delivery Processing Pipeline
FIG. 1 shows an example process of a conventional video delivery pipeline (100), showing the various stages from video capture to video content display. A sequence of video frames (102) is captured or generated using an image generation block (105). Video frames (102) may be captured digitally (e.g., by a digital camera) or generated by a computer (e.g., using computer animation) to provide video data (107). Alternatively, video frames (102) may be captured on film by a film camera, in which case the film is converted to a digital format to provide video data (107). In a production phase (110), the video data (107) is edited to provide a video production stream (112).

The video data of the production stream (112) is then provided to a processor at block (115) for post-production editing. Post-production editing (115) may include adjusting or modifying colors or brightness in particular areas of an image to enhance the image quality or to achieve a particular appearance for the image, in accordance with the video creator's creative intent. This is sometimes called “color timing” or “color grading.” Other editing (e.g., scene selection and sequencing, image cropping, addition of computer-generated visual special effects, etc.) may be performed at block (115) to yield a final version (117) of the production for distribution. During post-production editing (115), video images are viewed on a reference display (125).

Following post-production (115), the video data of the final production (117) may be delivered to an encoding block (120) for delivery downstream to decoding and playback devices such as television sets, set-top boxes, movie theaters, and the like. In some embodiments, the encoding block (120) may include audio and video encoders, such as those defined by ATSC, DVB, DVD, Blu-ray, and other delivery formats, to generate an encoded bitstream (122). In a receiver, the encoded bitstream (122) is decoded by a decoding unit (130) to generate a decoded signal (132) representing an identical or close approximation of signal (117). The receiver may be attached to a target display (140), which may have completely different characteristics than the reference display (125). In that case, a display management block (135) may be used to map the dynamic range of the decoded signal (132) to the characteristics of the target display (140) by generating a display-mapped signal (137).

IPT-PQ Color Space
In a preferred embodiment, without limitation, part of the processing pipeline, for example encoding (120), decoding (130), and display management (135), may be performed in what will be referred to as the IPT-PQ color space. An example use of the IPT-PQ color space for display management applications can be found in “Display Management for High Dynamic Range Video,” WIPO Publication WO 2014/130343, by R. Atkins et al., which is incorporated herein by reference in its entirety. The IPT color space, as described in “Development and testing of a color space (ipt) with improved hue uniformity,” by F. Ebner and M. D. Fairchild, in Proc. 6th Color Imaging Conference: Color Science, Systems, and Applications, IS&T, Scottsdale, Arizona, Nov. 1998, pp. 8-13 (to be referred to as the Ebner paper), which is incorporated herein by reference in its entirety, is a model of the color difference between cones in the human visual system. In this sense it is like the YCbCr or CIE-Lab color spaces; however, some scientific studies have shown that it mimics human visual processing better than these spaces. Like CIE-Lab, IPT is a space normalized to some reference luminance. In an embodiment, the normalization is based on the maximum luminance of a target display (e.g., 5,000 nits).

The term “PQ” as used herein refers to perceptual quantization. The human visual system responds to increasing light levels in a very non-linear way. A human's ability to see a stimulus is affected by the luminance of that stimulus, the size of the stimulus, the spatial frequencies making up the stimulus, and the luminance level to which the eyes have adapted at the moment of viewing the stimulus. In a preferred embodiment, a perceptual quantizer function maps linear input gray levels to output gray levels that better match the contrast sensitivity thresholds of the human visual system. Examples of PQ mapping functions are described in U.S. Patent Ser. No. 9,077,994 (to be referred to as the '994 patent), by J. S. Miller et al., which is incorporated herein by reference in its entirety, and parts of which have been adopted by the SMPTE ST 2084:2014 specification, titled “High Dynamic Range Electro-optical Transfer Function of Mastering Reference Displays,” Aug. 16, 2014, incorporated herein by reference in its entirety. There, given a fixed stimulus size, for every luminance level (i.e., the stimulus level), a minimum visible contrast step at that luminance level is selected according to the most sensitive adaptation level and the most sensitive spatial frequency (according to HVS models). Compared to the traditional gamma curve, which represents the response curve of a physical cathode ray tube (CRT) device and may coincidentally bear a very rough similarity to the way the human visual system responds, a PQ curve as determined by the '994 patent imitates the true visual response of the human visual system using a relatively simple functional model.
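As a concrete illustration of a PQ mapping, the following sketch implements the transfer function pair standardized in SMPTE ST 2084, using the constants published in that specification. The function names `pq_encode` and `pq_decode` are chosen here for illustration, and the clamping of negative inputs is a simplification; this is not presented as the mapping of any particular embodiment.

```python
# Constants as published in SMPTE ST 2084.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32
PEAK_NITS = 10000.0  # luminance represented by a PQ value of 1.0

def pq_encode(luminance_nits):
    """Linear luminance in cd/m^2 -> non-linear PQ value in [0, 1]."""
    y = max(luminance_nits, 0.0) / PEAK_NITS
    yp = y ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2

def pq_decode(value):
    """Non-linear PQ value in [0, 1] -> linear luminance in cd/m^2."""
    vp = max(value, 0.0) ** (1.0 / M2)
    num = max(vp - C1, 0.0)
    den = C2 - C3 * vp
    return PEAK_NITS * (num / den) ** (1.0 / M1)

# About half of the code range is spent below 100 nits, which reflects
# the higher contrast sensitivity of the eye at low luminance.
code_100 = pq_encode(100.0)
```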

FIG. 2 shows in more detail an example process (200) for color conversion to the IPT-PQ color space according to an embodiment. As shown in FIG. 2, given an input signal (202) in a first color space (e.g., RGB), the color space conversion to the perceptually corrected IPT color space (IPT-PQ) may comprise the following steps:
a) An optional step (210) may normalize the pixel values of the input signal (202) (e.g., 0 to 4095) into pixel values with a dynamic range between 0 and 1.
b) If the input signal (202) is gamma-coded or PQ-coded (e.g., per BT.1886 or SMPTE ST 2084), an optional step (215) may use the signal's electro-optical transfer function (EOTF), as provided by signal metadata, to reverse or undo the source display's conversion from code values to luminance. For example, if the input signal is gamma-coded, this step applies an inverse gamma function. If the input signal is PQ-coded according to SMPTE ST 2084, this step applies an inverse PQ function. In practice, the normalization step (210) and the inverse non-linear encoding (215) may be performed using pre-computed 1-D look-up tables (LUTs) to generate a linear signal 217.
c) In step (220), the linear signal 217 is converted from its original color space (e.g., RGB, XYZ, and the like) into the LMS color space. For example, if the original signal is in RGB, this step may comprise two steps: an RGB to XYZ color transformation and an XYZ to LMS color transformation. In an embodiment, without limitation, the XYZ to LMS transformation may be given by:

[equation (1a) not reproduced: XYZ to LMS matrix]

In another embodiment, as described in U.S. Provisional Patent Application Ser. No. 62/056,093, filed on Sep. 26, 2014, titled “Encoding and decoding perceptually-quantized video content” (also filed as PCT/US2015/051964 on Sep. 24, 2015), which is incorporated herein by reference in its entirety, the overall coding efficiency in the IPT-PQ color space may be further increased by incorporating the following crosstalk matrix as part of the XYZ to LMS transformation:

[equation not reproduced: crosstalk matrix with parameter c]

For example, for c = 0.02, multiplying the crosstalk matrix with the 3 × 3 matrix in equation (1a) yields:

[equation not reproduced]

Similarly, for c = 0.04, in another embodiment, multiplying the crosstalk matrix with the original XYZ to LMS matrix (e.g., equation (1a)) yields:

[equation not reproduced]

d) According to the Ebner paper, the traditional LMS to IPT color space conversion comprises first applying a non-linear power function to the LMS data and then applying a linear transformation matrix. While one can transform the data from LMS to IPT and then apply the PQ function to be in the IPT-PQ domain, in a preferred embodiment, in step (225), the traditional power function for the non-linear encoding of LMS to IPT is replaced with PQ non-linear encoding of each one of the L, M, and S components.
e) Using an LMS to IPT linear transform (e.g., as defined in the Ebner paper), step (230) completes the conversion of signal 222 to the IPT-PQ color space. For example, in an embodiment, the L′M′S′ to IPT-PQ transform may be given by:

[equation (2a) not reproduced: L′M′S′ to IPT-PQ matrix]

In another embodiment, experiments have shown that it may be preferable to derive the I′ component without any dependency on the S′ component; hence equation (2a) may become:

[equation not reproduced]
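Steps a) through e) can be strung together in a short sketch. Because the equations are not reproduced in this text, every numeric matrix below is an assumption: the XYZ to LMS and L′M′S′ to IPT matrices are the values commonly quoted from the Ebner paper, the crosstalk matrix is assumed to have 1 − 2c on its diagonal and c elsewhere, the gamma is a plain 2.4 power law, and linear values are assumed to be normalized so that 1.0 corresponds to the PQ peak. All function names are hypothetical.

```python
# SMPTE ST 2084 constants for the PQ non-linearity.
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_encode(y):
    """Normalized linear light (1.0 = PQ peak) -> PQ value."""
    yp = max(y, 0.0) ** M1
    return ((C1 + C2 * yp) / (1.0 + C3 * yp)) ** M2

# Assumed matrices (values as commonly quoted from the Ebner paper).
XYZ_TO_LMS = [[0.4002, 0.7076, -0.0808],
              [-0.2263, 1.1653, 0.0457],
              [0.0, 0.0, 0.9182]]
LMS_TO_IPT = [[0.4000, 0.4000, 0.2000],
              [4.4550, -4.8510, 0.3960],
              [0.8056, 0.3572, -1.1628]]

def crosstalk(c):
    """Assumed crosstalk matrix: 1 - 2c on the diagonal, c elsewhere."""
    return [[1 - 2 * c if r == k else c for k in range(3)] for r in range(3)]

def mat_mul(a, b):
    return [[sum(a[r][k] * b[k][col] for k in range(3)) for col in range(3)]
            for r in range(3)]

def mat_vec(m, v):
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]

def build_linearization_lut(bit_depth=12, gamma=2.4):
    """Steps a-b: one 1-D LUT that normalizes code values and undoes gamma."""
    top = (1 << bit_depth) - 1
    return [(code / top) ** gamma for code in range(top + 1)]

def xyz_to_ipt_pq(xyz, c=0.02):
    """Steps c-e: linear XYZ -> LMS with crosstalk -> PQ-coded L'M'S' -> IPT-PQ."""
    lms = mat_vec(mat_mul(crosstalk(c), XYZ_TO_LMS), xyz)
    lms_pq = [pq_encode(v) for v in lms]   # PQ replaces the power function
    return mat_vec(LMS_TO_IPT, lms_pq)

lut = build_linearization_lut()
combined = mat_mul(crosstalk(0.02), XYZ_TO_LMS)
# A D65-like white should land near the achromatic axis (P and T near 0).
i_val, p_val, t_val = xyz_to_ipt_pq([0.9505, 1.0, 1.089])
```

Under these assumptions a neutral input yields nearly zero P and T, which is a quick sanity check that the LMS normalization and the opponent-color matrix are consistent with each other.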

IPT-PQ versus YCbCr-gamma
Most of the existing video compression standards, such as MPEG-1, MPEG-2, AVC, HEVC, and the like, have been tested, evaluated, and optimized for gamma-coded images in the YCbCr color space. However, experimental results have shown that the IPT-PQ color space may provide a better representation format for high dynamic range images with 10 or more bits per pixel per color component. Encoding signals in color spaces that are better suited to HDR and wide color gamut signals (e.g., IPT-PQ) may yield better overall picture quality; however, legacy decoders (e.g., set-top boxes and the like) may be unable to perform proper decoding and color conversion. To improve backward compatibility, so that even devices that are not aware of the new color spaces can generate a reasonable picture, new signal reconstruction techniques are needed, as appreciated by the inventors.

FIG. 3 shows an example process for signal reconstruction and encoding according to an embodiment. As shown in FIG. 3, given an input (302), the forward color reconstruction block (305) applies, as needed, color transformation and/or reconstruction functions to generate a signal (307) in a preferred color space (e.g., IPT-PQ-r). Reconstruction-related metadata (309) may also be generated and communicated to subsequent blocks of the coding pipeline, such as the encoder (310), the decoder (315), and backward color reconstruction (320).

A decoder, after receiving the encoded signal (315), applies decoding (315) (such as HEVC decoding) to generate a decoded signal (317). A decoder aware of the preferred HDR-WCG coding color space (e.g., IPT-PQ-r) applies the proper backward or reverse reconstruction (320) to generate a signal (322) in the proper color space (e.g., IPT-PQ). Then, signal (322) may be transformed to YCbCr or RGB for additional post-processing, storage, or display.

A legacy decoder, which is not aware of the preferred HDR-WCG coding space, may treat the HDR-WCG space as a legacy color space (e.g., gamma-coded YCbCr). However, thanks to the forward color reconstruction (305), the output (317) may still have reasonable picture quality, even though no backward reconstruction or other color transformation is applied to the decoder output (317).

Color Reconstruction
Consider, without loss of generality, the IPT-PQ color space. In an embodiment, a linear reconstruction matrix (e.g., a 3 × 3 matrix) is generated to perceptually match the skin tones in an IPT-PQ signal with the skin tones in a YCbCr-gamma signal. Such a color transformation does not affect the performance of most image processing applications in the IPT color space, yet it greatly improves color reproduction on legacy devices. Instead of, or in addition to, skin tones, similar transformation matrices may also be generated to match other important colors, such as foliage and sky. In an embodiment, the reconstruction matrix may be computed as follows:

a) Load a database of skin-tone colors, for example reflectance spectra, and convert them to a device-independent color space, such as XYZ.

b) Convert the database of skin tones from XYZ into the legacy color space format (e.g., YCbCr, Rec. 709). This step may include, for example, the following sub-steps:
b.1) convert the database to RGB (Rec. 709);
b.2) apply a gamma to the RGB values (e.g., per BT.1886) to generate a gamma-coded R′G′B′ signal;
b.3) convert the R′G′B′ signal to YCbCr-gamma values (e.g., per Rec. 709);
b.4) compute the hue values of the YCbCr-gamma signal (example formula not reproduced); and
b.5) compute the saturation values of the YCbCr-gamma signal (example formula not reproduced).

c) Compute the skin-tone values in the database in the preferred color format (e.g., IPT-PQ). This step may include the following sub-steps:
c.1) convert from XYZ to LMS,
c.2) convert LMS to L'M'S' by applying PQ (e.g., per ST 2084), and then to I'P'T',
c.3) compute hue values (e.g., $Hue_{IPT} = \tan^{-1}(T'/P')$), and
c.4) compute saturation values (e.g., $Sat_{IPT} = \sqrt{T'^2 + P'^2}$).
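The hue and saturation sub-steps in b) and c) amount to a polar conversion of the two chroma components. A minimal Python sketch, assuming hue is the four-quadrant arctangent of the second chroma component over the first and saturation is the chroma magnitude (the helper name `hue_sat` is illustrative and does not come from the patent):

```python
import math

def hue_sat(c1, c2):
    """Return (hue, saturation) of a chroma pair (c1, c2).

    For YCbCr-gamma pass (Cb, Cr); for IPT-PQ pass (P', T').
    Assumption: hue = atan2(c2, c1) in radians, saturation = sqrt(c1^2 + c2^2).
    """
    return math.atan2(c2, c1), math.hypot(c1, c2)

# Example: a chroma pair on the diagonal of the first quadrant.
hue, sat = hue_sat(0.1, 0.1)
```

Using `atan2` rather than a plain arctangent keeps the hue inside (−π, π), which is the range the text requires for all inverse tan functions.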

d) Compute a rotation matrix to rotate the IPT values so that the skin tones in the rotated or reconstructed IPT-PQ (e.g., IPT-PQ-r) align with the skin tones in YCbCr-gamma. In one embodiment, this matrix is computed by optimizing a cost function of the hue and saturation values of the samples in the two color spaces. For example, in one embodiment the cost function may represent the mean square error (MSE) between the legacy color space (e.g., YCbCr) and the rotated preferred HDR color space (e.g., IPT-PQ). For example, let

$$Cost_H = \sum_i \left( Hue_{YCbCr}(i) - Hue_{IPT\text{-}PQ\text{-}r}(i) \right)^2 \qquad (3)$$

denote a cost function related to hue, where $Hue_{IPT\text{-}PQ\text{-}r}$ denotes the hue of the reconstructed color (that is, IPT-PQ-r) and can be defined as

$$Hue_{IPT\text{-}PQ\text{-}r}(i) = \tan^{-1}\left( \frac{\sin(a)\,P'(i) + \cos(a)\,T'(i)}{\cos(a)\,P'(i) - \sin(a)\,T'(i)} \right), \qquad (4)$$

where all inverse tan functions are computed in (−π, π).
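To make the hue-matching step concrete, the sketch below evaluates a hue cost of the kind described in step d) and searches for the rotation angle that minimizes it. It is an illustration under stated assumptions: the rotation sign convention (hue increases with the angle), the sample values, and the helper names are all hypothetical, and a plain grid search stands in for a general-purpose optimizer.

```python
import math

def rotated_hue(p, t, a):
    """Hue of (P', T') after rotating the chroma plane by angle a (radians)."""
    p_r = math.cos(a) * p - math.sin(a) * t
    t_r = math.sin(a) * p + math.cos(a) * t
    return math.atan2(t_r, p_r)

def cost_h(a, hues_legacy, chroma_ipt):
    """Sum of squared hue differences between legacy and rotated samples."""
    return sum((h - rotated_hue(p, t, a)) ** 2
               for h, (p, t) in zip(hues_legacy, chroma_ipt))

def best_angle(hues_legacy, chroma_ipt, steps=3600):
    """Grid search over [0, pi/2); a simple stand-in for an iterative optimizer."""
    candidates = [i * (math.pi / 2) / steps for i in range(steps)]
    return min(candidates, key=lambda a: cost_h(a, hues_legacy, chroma_ipt))

# Synthetic check: the legacy hues are the IPT hues shifted by 40 degrees.
chroma = [(0.10, 0.02), (0.08, 0.05), (0.12, 0.01)]
target = [math.atan2(t, p) + math.radians(40) for p, t in chroma]
a_opt = best_angle(target, chroma)
```

With real data the hue differences may wrap around ±π; a production version would need to unwrap them before squaring, which this sketch does not attempt.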

In one embodiment, any optimization method known in the art may be applied to find the value of the angle "a", denoted a', that minimizes the cost function according to a given criterion. For example, one may apply the MATLAB function fminunc(fun, x0) with fun = Cost_H and x0 = 0.1. Given a', the rotation matrix R may be defined as

$$R = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos(a') & \sin(a') \\ 0 & -\sin(a') & \cos(a') \end{bmatrix}. \qquad (5)$$

As an example, based on a sample database, in one embodiment, for a' = 71.74 degrees,

$$R = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0.3133 & 0.9496 \\ 0 & -0.9496 & 0.3133 \end{bmatrix}. \qquad (6)$$
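For the example angle, the entries of R follow directly from the cosine and sine of a'. A small sketch, assuming the intensity row is left untouched and the sine terms are placed as in a standard 2-D rotation embedded in a 3×3 matrix (the sign placement is an assumption; the function name is illustrative):

```python
import math

def chroma_rotation(a_deg):
    """3x3 matrix that keeps I and rotates the (P, T) plane by a_deg degrees."""
    c = math.cos(math.radians(a_deg))
    s = math.sin(math.radians(a_deg))
    return [[1.0, 0.0, 0.0],
            [0.0, c, s],
            [0.0, -s, c]]

R = chroma_rotation(71.74)
```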

Given R and the original L'M'S' to I'P'T' matrix LMS2IPTmat (see, e.g., equation (2)), the conversion to the reconstructed IPT-PQ-r color space may use a new LMS2IPTmat-r matrix defined as

$$\text{LMS2IPTmat-r} = R^T \cdot \text{LMS2IPTmat} = \left( \text{LMS2IPTmat}^T \cdot R \right)^T, \qquad (7)$$

where $A^T$ denotes the transpose of matrix A.
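The two forms in equation (7) are the same product written two ways, which can be checked numerically. In the sketch below, the L'M'S' to IPT matrix is assumed to be the widely published Ebner-Fairchild matrix (equation (2) is not reproduced in this section), and R uses rounded values for a' = 71.74 degrees:

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Assumed values: Ebner-Fairchild L'M'S'-to-IPT matrix.
LMS2IPT = [[0.4000, 0.4000, 0.2000],
           [4.4550, -4.8510, 0.3960],
           [0.8056, 0.3572, -1.1628]]
# Example rotation for a' = 71.74 degrees (rounded to four decimals).
R = [[1.0, 0.0, 0.0],
     [0.0, 0.3133, 0.9496],
     [0.0, -0.9496, 0.3133]]

left = matmul(transpose(R), LMS2IPT)                    # R^T * LMS2IPTmat
right = transpose(matmul(transpose(LMS2IPT), R))        # (LMS2IPTmat^T * R)^T
```

Because the first row and column of R are those of the identity, the intensity row of the combined matrix is unchanged; only the two chroma rows are mixed.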

In one embodiment, in addition to aligning the hues of the skin tones, one may also align the saturation. This may include the following steps:
a) Apply R to the original IPT-PQ data to generate color-rotated chroma values P_R and T_R.
b) Define a saturation cost function, e.g., the MSE between the saturation values in the original and target color spaces:

$$Cost_S = \sum_i \left( Sat_{YCbCr}(i) - Sat_{IPT\text{-}PQ\text{-}r}(i) \right)^2, \qquad (8)$$

where

$$Sat_{IPT\text{-}PQ\text{-}r}(i) = b \sqrt{P_R^2(i) + T_R^2(i)}. \qquad (9)$$

c) Let b' denote the value of b that optimizes Cost_S. Then one may apply the scaling vector

$$S = \begin{bmatrix} 1 \\ b' \\ b' \end{bmatrix} \qquad (10)$$

to the chroma rotation matrix to form a single 3×3 color rotation and scaling matrix

$$R_s = \begin{bmatrix} 1 & 0 & 0 \\ 0 & b'\cos(a') & b'\sin(a') \\ 0 & -b'\sin(a') & b'\cos(a') \end{bmatrix}. \qquad (11)$$
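When the scaled saturation is modeled as b times the magnitude of the rotated chroma (an assumption made for this sketch), the MSE saturation cost is quadratic in b and has a closed-form minimizer, so no iterative search is needed. The function name and sample values below are illustrative:

```python
import math

def best_saturation_scale(sat_legacy, chroma_rotated):
    """Least-squares b minimizing sum((s_legacy - b * |(P_R, T_R)|)^2)."""
    mags = [math.hypot(p, t) for p, t in chroma_rotated]
    return sum(s * m for s, m in zip(sat_legacy, mags)) / sum(m * m for m in mags)

# Synthetic check: legacy saturations are exactly 1.4x the rotated magnitudes.
rotated = [(0.03, 0.04), (0.06, 0.08), (0.05, 0.12)]
legacy_sat = [1.4 * math.hypot(p, t) for p, t in rotated]
b_opt = best_saturation_scale(legacy_sat, rotated)
```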

In some embodiments, the hue and saturation cost functions (e.g., equations (3) and (8)) may be combined into a single hue/saturation cost function and solved simultaneously for both a' and b'. For example, from equation (11), in one embodiment, for

$$R_s = \begin{bmatrix} 1 & 0 & 0 \\ 0 & b_1'\cos(a') & b_2'\sin(a') \\ 0 & -b_3'\sin(a') & b_4'\cos(a') \end{bmatrix}, \qquad (12)$$

equation (4) may be modified as

$$Hue_{IPT\text{-}PQ\text{-}r}(i) = \tan^{-1}\left( \frac{b_2 \sin(a)\,P'(i) + b_4 \cos(a)\,T'(i)}{b_1 \cos(a)\,P'(i) - b_3 \sin(a)\,T'(i)} \right), \qquad (13)$$

and equation (3) may be solved for both the optimal a' and the optimal b_i' (i = 1 to 4) scaling factors.

For example, in one embodiment, for a' = 65 degrees and b1' = 1.4, b2' = 1.0, b3' = 1.4, and b4' = 1.0, equation (12) yields

$$R_s = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0.5917 & 0.9063 \\ 0 & -1.2688 & 0.4226 \end{bmatrix}. \qquad (12b)$$

Tone reconstruction
The proposed rotation matrix R may improve color reproduction; however, the decoded image (317) may still be perceived as having low contrast because of the differences between the non-linear EOTF encoding functions (e.g., ST 2084 versus BT.1886). In one embodiment, the contrast may be improved by applying a 1-D tone-mapping curve to the luminance channel (e.g., I'). This step may include the following sub-steps:
a) Apply a tone-mapping curve (e.g., a sigmoid) to map the original content from the original HDR maximum brightness (e.g., 4,000 nits) to an SDR target brightness (e.g., 100 nits). An example of such a sigmoid function may be found in U.S. Patent No. 8,593,480, "Method and Apparatus for Image Data Transformation," by A. Ballestad and A. Kostin, which is incorporated herein by reference in its entirety. Examples of alternative reconstruction functions are also disclosed in WIPO Publication WO 2014/160705, "Encoding perceptually-quantized video content in multi-layer VDR coding," which is incorporated herein by reference in its entirety. Let I'_T = f(I') denote the output of the tone-mapping function f(); then
b) linearize I'_T (e.g., apply an inverse PQ or gamma function) to generate linear I_T data, and
c) apply legacy EOTF encoding (e.g., BT.1886) to the linearized I_T signal to generate a gamma-coded luminance signal. This signal is compressed and sent to the encoder.
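Sub-steps a) to c) can be sketched as a chain of three scalar functions. The PQ constants below are those of SMPTE ST 2084; everything else is an assumption made for illustration: the tone-mapping function is a hypothetical linear-light rescale rather than a sigmoid, and the gamma stage is a simplified BT.1886 inverse with a zero black level and exponent 2.4.

```python
import math

# SMPTE ST 2084 (PQ) constants
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

def pq_to_nits(e):
    """ST 2084 EOTF: normalized PQ code value in [0, 1] -> luminance in cd/m^2."""
    ep = e ** (1 / M2)
    return 10000.0 * (max(ep - C1, 0.0) / (C2 - C3 * ep)) ** (1 / M1)

def nits_to_pq(y):
    """Inverse ST 2084 EOTF: luminance in cd/m^2 -> normalized PQ code value."""
    yp = (y / 10000.0) ** M1
    return ((C1 + C2 * yp) / (1 + C3 * yp)) ** M2

def placeholder_tone_map(i_pq, src_peak=4000.0, dst_peak=100.0):
    """Hypothetical stand-in for f(): a plain linear-light rescale, not a sigmoid."""
    return nits_to_pq(pq_to_nits(i_pq) * dst_peak / src_peak)

def tone_reconstruct(i_pq, dst_peak=100.0, gamma=2.4):
    i_t = placeholder_tone_map(i_pq, dst_peak=dst_peak)   # a) I'_T = f(I')
    linear = pq_to_nits(i_t)                              # b) linearize I'_T
    return min(linear / dst_peak, 1.0) ** (1 / gamma)     # c) gamma-code the result
```

For instance, `tone_reconstruct(nits_to_pq(4000.0))` maps the assumed 4,000-nit source peak to the top of the gamma-coded range.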

An example of such a mapping between ST 2084 (PQ) and BT.1886 is shown in FIG. 4. The curve has higher mid-tone contrast, lower blacks, and brighter highlights (with less contrast). This aligns the tone scale more closely with standard SDR, so that the picture remains viewable when the input is decoded by a legacy device. In FIG. 4, without limitation, input and output values are normalized to (0, 1).

再構成情報は、エンコーダからパイプラインの残り部分に対して、メタデータとして伝えられ得る。再構成パラメータは、与えられた映像シーケンスについて可能なかぎり最高の性能を得るために、フレーム毎、シーン毎、またはシーケンス毎などの様々な時間関係で決定されてよい。 The reconstruction information can be conveyed as metadata from the encoder to the rest of the pipeline. The reconstruction parameters may be determined in various temporal relationships, such as per frame, per scene, or per sequence, to obtain the best possible performance for a given video sequence.

Although this description refers to the IPT-PQ color space, these techniques are equally applicable to other color spaces and color formats. For example, similar techniques may be applied to improve backward compatibility across different versions of YCbCr (e.g., Rec. 709 YCbCr and Rec. 2020 YCbCr). Thus, in one embodiment, a Rec. 2020 bitstream may be adjusted using the signal reconstruction techniques described herein to provide better hue and saturation output when decoded by a legacy Rec. 709 decoder.

図６は、一実施形態における、色回転・スケーリング行列を生成するための、プロセスフロー例を示す。ある画像データベース（６０５）を与えられると、ステップ（６１０）は、第１の（旧式）色空間（例えば、ＹＣｂＣｒ−ガンマ）において、データベース内の画像についての色相および彩度値を算出する。ステップ（６１５）は、第２の（好適な）色空間（例えば、ＩＰＴ−ＰＱ）において、データベース内の画像についての色相を算出する。 FIG. 6 illustrates an example process flow for generating a color rotation and scaling matrix in one embodiment. Given an image database (605), step (610) calculates hue and saturation values for the images in the database in a first (old) color space (eg, YCbCr-gamma). Step (615) calculates a hue for an image in the database in a second (preferred) color space (eg, IPT-PQ).

色相に関連するコスト関数（例えば式（３））を与えられると、ステップ（６２０）は、ある最小化コスト条件（例えば平均二乗誤差（ＭＳＥ））にしたがって最適である回転角ａ’を求める。この回転角ａ’は、旧式の色空間において算出された色相群と、回転された好適な色空間において算出された色相群との距離を最小にするものである。ステップ（６２５）において、ａ’の値を用いて色回転行列を生成する。 Given a cost function related to hue (eg, equation (3)), step (620) determines a rotation angle a ′ that is optimal according to some minimized cost condition (eg, mean square error (MSE)). This rotation angle a 'minimizes the distance between the hue group calculated in the old color space and the hue group calculated in the rotated preferred color space. In step (625), a color rotation matrix is generated using the value of a '.

Optionally, a saturation scaler may also be computed. Given a saturation cost function (e.g., equation (8)), step (630) optionally determines an optimal scaler b' according to a minimization cost criterion (e.g., the MSE between the saturation of the signals in the first color space and the saturation of the scaled signals in the color-rotated preferred color space) (640, 645).

最後に、ステップ（６３５）において、回転角およびスケーラを組み合わせることにより、色回転・スケーリング行列（例えば式（１１））を生成する。 Finally, in step (635), a color rotation / scaling matrix (for example, Equation (11)) is generated by combining the rotation angle and the scaler.

エンコーダにおいては、色回転・スケーリング行列を、好適な色空間にある入力データに適用することにより、再構成された色空間にあるデータを生成する。データは符号化（圧縮）され、色回転・スケーリング行列に関連する情報とともにデコーダに送信される。 In the encoder, a color rotation / scaling matrix is applied to input data in a suitable color space to generate data in the reconstructed color space. The data is encoded (compressed) and sent to the decoder along with information related to the color rotation / scaling matrix.

デコーダにおいては、旧式のデコーダの場合、旧式の色空間において符号化されたものとしてデータを復号化するであろう。間違った色空間情報を用いているにもかかわらず、ダイナミックレンジは下がるものの、画像は依然として十分な質で視聴可能である。より新しくフル機能のデコーダは、受け取ったメタデータ情報（色回転・スケーリング行列）を利用して、画像データを好適な色空間において復号化することにより、視聴者にデータの持つハイダイナミックレンジをフルに提供し得る。 At the decoder, in the case of an older decoder, the data will be decoded as if it were encoded in an older color space. Despite using the wrong color space information, the dynamic range is reduced, but the image is still viewable with sufficient quality. A newer full-function decoder uses the received metadata information (color rotation / scaling matrix) to decode the image data in a suitable color space, thereby giving the viewer the full high dynamic range of the data. Can be provided to.

SEI message syntax for reconstruction information
As discussed earlier, in one embodiment, the rotation (R) matrix and the scaling vector (S) may be absorbed by the L'M'S' to I'P'T' conversion matrix in (230). The tone reconstruction curve may be part of the forward color reconstruction (305). In both cases, the adaptive reconstruction information (that is, the matrix and the tone-mapping curve) may be transmitted from the encoder to the decoder using the syntax proposed in U.S. Provisional Application Ser. No. 62/193390, filed on July 16, 2015, which is incorporated herein by reference in its entirety.

In another embodiment, as depicted in FIG. 5, a new color rotation and scale block (510) may be added to the encoder (500A). This block may be added after the color transformation (200) (e.g., RGB to IPT-PQ), but preferably before the forward reconstruction (305). In the decoder (500B), a corresponding inverse color rotation and scaling block (515) may be added after the backward reconstruction box (320). As depicted in FIG. 5, optional color format conversion boxes (e.g., 4:4:4 to 4:2:0 (505) or 4:2:0 to 4:4:4 (520)) may be added to the encoding and/or decoding pipeline as needed.

In terms of syntax, one may specify either a 3×3 rotation matrix or just a 2×2 matrix, since the luminance channel (e.g., Y or I) is typically left unchanged. Table 1 provides an example of SEI messaging for communicating a color rotation and scaling matrix. Signaling is not limited to SEI messages, however; the matrix may be inserted in any high-level syntax, such as SPS, PPS, and the like.
Table 1: Example SEI messaging for color rotation and scaling matrix

colour_rotation_scale_matrix_present_flag equal to 1 indicates that the syntax elements colour_rotation_scale_coeffs[c][i], for c and i in the range of 0 to 1, inclusive, are present. colour_rotation_scale_matrix_present_flag equal to 0 indicates that the syntax elements colour_rotation_scale_coeffs[c][i], for c and i in the range of 0 to 1, inclusive, are not present.

colour_rotation_scale_coeffs[c][i] specifies the values of the 2×2 colour rotation and scale matrix coefficients. The value of colour_rotation_scale_coeffs[c][i] is in the range of −2^15 to 2^15 − 1, inclusive. When colour_rotation_scale_coeffs[c][i] is not present, the default colour rotation and scale matrix is used.
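The stated range corresponds to 16-bit signed integers, so an encoder has to convert the floating-point matrix entries to fixed point before signaling them. The sketch below shows one way this could be done; the number of fractional bits (14) is purely an assumption, since the text only gives the integer range, and the function names and matrix values are illustrative.

```python
FRAC_BITS = 14  # assumed fixed-point precision; not specified in the text

def quantize_coeffs(matrix_2x2, frac_bits=FRAC_BITS):
    """Map a 2x2 float matrix to integers clipped to [-2^15, 2^15 - 1]."""
    lo, hi = -(1 << 15), (1 << 15) - 1
    return [[max(lo, min(hi, round(v * (1 << frac_bits)))) for v in row]
            for row in matrix_2x2]

def dequantize_coeffs(coeffs, frac_bits=FRAC_BITS):
    """Recover approximate float coefficients at the decoder."""
    return [[v / (1 << frac_bits) for v in row] for row in coeffs]

# Illustrative chroma rotation-and-scale entries.
coeffs = quantize_coeffs([[0.5917, 0.9063], [-1.2688, 0.4226]])
restored = dequantize_coeffs(coeffs)
```

With 14 fractional bits the representable magnitude is just under 2.0, which covers rotation entries scaled by factors such as 1.4; larger scale factors would need fewer fractional bits.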

一実施形態において、エンコーダおよびデコーダの両方が色回転・スケーリング行列を知っており（例えば新しい色空間を相互に定義することにより）、色回転行列をエンコーダからデコーダに信号送信する必要がない場合がある。別の実施形態においては、色回転・スケーリング行列は、ＩＰＴ−ＰＱとともにＶＵＩ（ビデオユーザビリティ情報）中において参照されてもよい。 In one embodiment, both the encoder and decoder may know the color rotation and scaling matrix (eg, by mutually defining a new color space) and may not need to signal the color rotation matrix from the encoder to the decoder. is there. In another embodiment, the color rotation / scaling matrix may be referenced in a VUI (Video Usability Information) along with IPT-PQ.

Multi-hue and saturation reconstruction
In some embodiments, it may be beneficial to apply the reconstruction to multiple hues. This increases the accuracy with which the reconstructed color space matches the legacy colors, at the cost of additional computations at the decoder. Consider, for example, the problem of optimizing the reconstruction for N hues (e.g., skin tones, sky, greens, and the like). In one embodiment, one may repeat the processes discussed earlier to identify a set of optimal angles and saturations as a function of hue. For example, using database images for a variety of hues, one may generate a set of optimal (rotation angle, saturation scale) values, e.g., {(a_1, b_1), (a_2, b_2), ..., (a_N, b_N)}. More generally, for pixel p, let

$$a(p) = f_H(h(p)), \quad b(p) = f_S(h(p)) \qquad (14)$$

denote the optimal chroma (hue) rotation and saturation scaling values, where h(p) denotes a measure of hue for pixel p. For example, for the IPT-PQ color space, the f_H and f_S functions may be computed in terms of the hue h(p) and saturation s(p) functions:

$$h(p) = \tan^{-1}\left( \frac{T'(p)}{P'(p)} \right), \quad s(p) = \sqrt{T'(p)^2 + P'(p)^2}. \qquad (15)$$

The functions f_H(h(p)) and f_S(h(p)) may be represented and stored in a variety of ways known in the art, for example, as look-up tables or as piecewise linear or non-linear polynomials, and may be signaled from the encoder to the decoder as metadata.
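As one of the storage options mentioned, a hue-dependent function can be held as a short list of knots and evaluated by linear interpolation. The sketch below is an illustration only: the knot values are hypothetical, not taken from the patent or from any database, and values outside the knot range are simply clamped.

```python
import bisect

def piecewise_linear(knots, x):
    """Interpolate (x, y) knots sorted by x; clamp outside the knot range."""
    xs = [k[0] for k in knots]
    if x <= xs[0]:
        return knots[0][1]
    if x >= xs[-1]:
        return knots[-1][1]
    j = bisect.bisect_right(xs, x)
    (x0, y0), (x1, y1) = knots[j - 1], knots[j]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

# Hypothetical tables: hue (radians) -> rotation angle a (radians) and scale b.
F_H = [(-3.14, 1.10), (0.0, 1.25), (3.14, 1.15)]
F_S = [(-3.14, 1.0), (0.0, 1.4), (3.14, 1.2)]

a_p = piecewise_linear(F_H, 1.57)
b_p = piecewise_linear(F_S, 1.57)
```

Only the knots would need to be carried as metadata; the interpolation rule would be fixed on both sides.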

Given f_H(h(p)) and f_S(h(p)), the encoder applies to each pixel the following reconstruction functions:

$$h'(p) = h(p) + a(p), \quad s'(p) = s(p)\,b(p), \qquad (16)$$

to generate the appropriate reconstructed signal. For example, for the IPT-PQ color space, the reconstructed P' and T' color components for pixel p may be derived using

$$P'_r(p) = s'(p)\cos(h'(p)), \quad T'_r(p) = s'(p)\sin(h'(p)). \qquad (17)$$

In a decoder, the process is reversed. For example, given f_H(h(p)) and f_S(h(p)), from equations (14) and (16), the decoder generates

$$h(p) = h'(p) - a(p), \quad s(p) = s'(p)/b(p). \qquad (18)$$

To avoid a division in the decoder, in some embodiments the encoder may signal to the decoder the inverse of f_S(h(p)) (e.g., the 1/b(p) values). For input data in the IPT-PQ space, the original data may be generated as

$$P'(p) = s(p)\cos(h(p)), \quad T'(p) = s(p)\sin(h(p)). \qquad (19)$$
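The encoder and decoder steps can be checked as a round trip on a single pixel. This sketch assumes the encoder adds the rotation angle to the hue and multiplies the saturation by the scale, and that the decoder receives the reciprocal of the scale so that it only multiplies. For simplicity the angle and scale are passed in as constants rather than looked up from the hue; the function names are illustrative.

```python
import math

def reshape_pixel(p, t, a, b):
    """Encoder side: rotate hue by a (radians) and scale saturation by b."""
    h, s = math.atan2(t, p), math.hypot(p, t)
    h_r, s_r = h + a, s * b
    return s_r * math.cos(h_r), s_r * math.sin(h_r)

def restore_pixel(p_r, t_r, a, inv_b):
    """Decoder side: undo the rotation; inv_b = 1/b is signaled, so no division here."""
    h_r, s_r = math.atan2(t_r, p_r), math.hypot(p_r, t_r)
    h, s = h_r - a, s_r * inv_b
    return s * math.cos(h), s * math.sin(h)

a, b = math.radians(65.0), 1.4
p_r, t_r = reshape_pixel(0.10, -0.03, a, b)
p, t = restore_pixel(p_r, t_r, a, 1.0 / b)
```

The decoder still evaluates one arctangent, one cosine, and one sine per pixel, which is the cost that table look-ups are meant to reduce.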

From equation (17), applying inverse reconstruction to recover the data in the preferred color space requires trigonometric operations. In some embodiments, the trigonometric operations may be performed using look-up tables. As an example, from equation (18), equation (19) may be rewritten as

$$P'(p) = s(p)\cos(h'(p) - a(p)), \quad T'(p) = s(p)\sin(h'(p) - a(p)). \qquad (20)$$

These operations may be further simplified by using suitable look-up tables to compute the cosine and sine functions.

図７Ａは、旧式の色空間がＹＣｂＣｒ−ガンマである場合に、再構成されたＩＰＴ−ＰＱ−ｒ（これは旧式の機器にとっては、ＹＣｂＣｒとして見える）から色相を変換してＩＰＴ−ＰＱに戻す、逆方向再構成関数の一例を示す。図７Ｂは、彩度を調整するための、対応する逆方向再構成関数を示す。図８は、好適な色空間ＩＰＴ−ＰＱ（８２０）を調節して、旧式のＹＣｂＣｒ色空間（８１０）の特性にマッチさせる様子を示す。光線（８３０）は、回転およびスケーリングを表している。 FIG. 7A shows the transformation of the hue from the reconstructed IPT-PQ-r (which appears to older equipment as YCbCr) back to IPT-PQ when the old color space is YCbCr-gamma. An example of the backward reconstruction function is shown. FIG. 7B shows the corresponding backward reconstruction function for adjusting the saturation. FIG. 8 shows how the preferred color space IPT-PQ (820) is adjusted to match the characteristics of the old YCbCr color space (810). Ray (830) represents rotation and scaling.

In another embodiment, instead of computing the P and T values in terms of cosine or sine functions of hue, one could construct a simpler decoder with look-up tables generated from some other function of hue (e.g., f(tan^{-1}(h(p)))). For example, given reconstructed pixel value components P'_r(p) and T'_r(p), in one embodiment, the decoder may recover the original pixel values as follows:

where v() and w() denote hue-related functions that were generated so that images in the reconstructed color space match a set of hue and saturation values in the legacy color space. The v() and w() functions, as before, may be communicated from the encoder to the decoder using metadata, or they may be part of an established coding protocol or standard known to both the encoder and the decoder.

The IC_TC_P color space
IC_TC_P, also referred to as ICtCp (or IPT), is a newly proposed color space designed specifically for processing high dynamic range and wide color gamut (WCG) signals. As with IPT-PQ, I (Intensity) denotes the brightness of the PQ-encoded signal; C_T, the Tritan axis, corresponds to blue-yellow perception; and C_P, the Protan axis, corresponds to red-green color perception. In addition to the features of IPT-PQ discussed above, in IC_TC_P:
・as described earlier, chroma is rotated to align skin tones more closely to YCbCr,
・the XYZ to LMS matrix is optimized for better uniformity and linearity for WCG images, and
・the L'M'S' to ICtCp matrix is optimized to improve isoluminance and stability with respect to HDR and WCG images.

本明細書において、用語「等輝度性」は、輝度（例えばＩＣｔＣｐのＩ、またはＹ’Ｃｂ’Ｃｒ’のＹ’）がいかによく輝度Ｙに対応するかの尺度を意味する。間接的にこれは、ある色空間がいかによくルマをクロマから分離するかを測っている。発明者の行った実験から、ＩＣｔＣｐのＩは、Ｙ’Ｃｂ’Ｃｒ’のＹ’よりも、よりルマに近く対応することがわかっている。 As used herein, the term “isoluminous” means a measure of how well the luminance (eg, I of ICtCp or Y ′ of Y′Cb′Cr ′) corresponds to the luminance Y. Indirectly this measures how well a color space separates luma from chroma. From experiments conducted by the inventors, it is known that I of ICtCp corresponds to luma more closely than Y ′ of Y′Cb′Cr ′.

From an implementation point of view, using the IC_TC_P color space requires the same hardware and signal flow as using traditional gamma-coded YCbCr. For example, consider using gamma-corrected YCbCr (Y'Cb'Cr') in a camera pipeline. Starting from XYZ, the process requires the following steps:
a) convert from XYZ to RGB BT.2020 using a 3×3 matrix,
b) apply an inverse EOTF (or OETF) to the output of step a), and
c) apply a 3×3 matrix to the output of step b).

As depicted in FIG. 2, using the IC_TC_P color space requires the following steps:
a) In step (220), convert from XYZ to LMS using, in a preferred embodiment, the following 3×3 matrix:

which corresponds to combining the XYZ to LMS 3×3 matrix of equation (1a) with a cross-talk matrix with c = 0.04 (see also equation (1c)).
b) In step (225), convert signal (222) to L'M'S', as described earlier, by applying the PQ non-linearity.
c) In step (230), convert from L'M'S' to IC_TC_P using a 3×3 matrix, which in a preferred embodiment may be defined as:

Equation (23) corresponds to multiplying the rotation matrix of equation (12b) with the original L'M'S' to I'P'T' matrix of equation (2b).

In another embodiment, steps a) to c) may also be expressed as follows:

where RGB_BT.2020 denotes a triplet of RGB values in BT.2020, and EOTF^{-1}_ST2084 denotes the inverse of the EOTF according to SMPTE ST 2084. In some embodiments, the EOTF^{-1}_ST2084 function may be replaced by another non-linear quantization function, such as the Hybrid Log-Gamma (HLG) function. For complete reference, the appropriate equations are also summarized in Table 2, where the subscript D denotes display light.
Table 2: Color conversion to IC_TC_P
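The three-stage flow (3×3 matrix, inverse EOTF, 3×3 matrix) can be sketched end to end. The patent's own matrices are not reproduced in this text, so the sketch uses the integer matrices published in ITU-R BT.2100 for ICtCp as an assumption; whether they agree digit for digit with the patent's equations is not confirmed here. Function names are illustrative.

```python
# SMPTE ST 2084 (PQ) constants
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def inverse_eotf_st2084(y):
    """Linear light normalized to [0, 1] (1.0 = 10,000 cd/m^2) -> PQ code value."""
    yp = max(y, 0.0) ** M1
    return ((C1 + C2 * yp) / (1 + C3 * yp)) ** M2

# Integer matrices as published in ITU-R BT.2100 (each entry is divided by 4096).
RGB_TO_LMS = [[1688, 2146, 262], [683, 2951, 462], [99, 309, 3688]]
LMS_TO_ICTCP = [[2048, 2048, 0], [6610, -13613, 7003], [17933, -17390, -543]]

def apply(matrix, vec):
    """Multiply a 3x3 integer matrix (scaled by 4096) by a 3-vector."""
    return [sum(m * v for m, v in zip(row, vec)) / 4096 for row in matrix]

def rgb2020_to_ictcp(rgb_linear):
    lms = apply(RGB_TO_LMS, rgb_linear)                 # step a) folded with RGB -> LMS
    lms_pq = [inverse_eotf_st2084(v) for v in lms]      # step b) PQ non-linearity
    return apply(LMS_TO_ICTCP, lms_pq)                  # step c) L'M'S' -> ICtCp

i, ct, cp = rgb2020_to_ictcp([0.01, 0.01, 0.01])        # 100 cd/m^2 gray
```

A neutral gray lands on the intensity axis: both chroma rows of the second matrix sum to zero, so C_T and C_P vanish whenever L', M', and S' are equal.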

The conversion from IC_TC_P back to the original color space follows a similar approach. In one embodiment, it may include the following steps:
a) Convert from IC_TC_P to L'M'S' using the inverse of the matrix in equation (23), i.e.,

b) Convert the L'M'S' signal to LMS using the signal's EOTF function (e.g., as defined in ST 2084).
c) Convert from LMS to XYZ using the inverse of the matrix in equation (22), e.g.,


In one embodiment, the corresponding L'M'S' to RGB and IC_TC_P to L'M'S' matrices are given by:

Reference display management
High dynamic range content may be viewed on displays that have less dynamic range than the reference display used to master the content. To view HDR content on displays with a lower dynamic range, display mapping needs to be performed. This can take the form of an EETF (electrical-electrical transfer function) in the display, which is typically applied before applying the EOTF for the display. This function provides a toe and a shoulder to gracefully roll off the highlights and shadows, providing a balance between preserving the artistic intent and maintaining details. FIG. 9 shows an example EETF mapping from the full 0-10,000 nits dynamic range to a target display capable of 0.1-1,000 nits. The EETF may be introduced into the PQ signal; the plotted points show the effect of the mapping, that is, how the intended light is changed into the actual displayed light.

Below are the mathematical steps that implement this tone-mapping function for displays with various black and white luminance levels. The EETF may be applied in the non-linear domain to either the luma channel in IC_TC_P or Y'C'_BC'_R, or to the RGB channels individually.
Calculating the EETF:

トーンマッピング曲線の中央領域は、ソースからターゲットへの一対一対応として定義される。エルミートスプラインを用いて追加的なトゥ／ショルダーロールオフを計算することにより、ダイナミックレンジをターゲットディスプレイの能力まで減少させる。 The central region of the tone mapping curve is defined as a one-to-one correspondence from source to target. By calculating additional toe / shoulder roll-off using Hermite splines, the dynamic range is reduced to the capability of the target display.

The turning points of the spline (Toe Start (TS) and Shoulder Start (SS)) are defined first. These are the points where the roll-offs begin. Let minLum and maxLum denote the minimum and maximum luminance values of the target display; then


Given E_1, the source input signal in normalized PQ code words, the output E_2 is computed as follows:

Hermite spline equations:

In another embodiment,
Step 1:

Step 2:

Hermite spline equations:

where
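Because the spline equations are not reproduced in this text, the following sketch uses the Hermite-spline EETF published in ITU-R BT.2390 as a stand-in. It is assumed, not confirmed, to resemble the two-step embodiment: a one-to-one region, a Hermite shoulder roll-off toward the target white, and a toe term that lifts blacks to the target black. Here `min_lum` and `max_lum` are the target display's limits expressed as normalized PQ values.

```python
def eetf(e1, min_lum, max_lum):
    """Map a normalized PQ value e1 in [0, 1] into the target display range."""
    ks = 1.5 * max_lum - 0.5              # knee start of the shoulder roll-off
    if e1 < ks:
        e2 = e1                           # one-to-one region
    else:
        t = (e1 - ks) / (1.0 - ks)
        e2 = ((2 * t**3 - 3 * t**2 + 1) * ks
              + (t**3 - 2 * t**2 + t) * (1.0 - ks)
              + (-2 * t**3 + 3 * t**2) * max_lum)
    return e2 + min_lum * (1.0 - e2) ** 4  # toe: lift blacks toward min_lum

# Example: a target whose white sits at PQ 0.75 and black at PQ 0.06.
curve = [eetf(k / 100.0, 0.06, 0.75) for k in range(101)]
```

The spline is continuous with slope one at the knee, so the mapped curve leaves the one-to-one region without a visible kink.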

The resulting EETF curve can be applied to either the intensity I channel of IC_TC_P or the luma Y' channel of Y'C'_BC'_R. Some notable options are:
1) I of IC_TC_P: process the intensity (I) channel of IC_TC_P through the EETF

I_2 = EETF(I_1)
・Adjusts grayscale more accurately
・No color shifts
・Changes in saturation are needed and should be applied to the C_T and C_P channels using the following equation:

2) Y' of Y'C'_BC'_R: process the luma Y' channel of Y'C'_BC'_R through the EETF

Y'_2 = EETF(Y'_1)
・Adjusts grayscale more accurately
・Limited color shifts
・Changes in saturation are needed and should be applied to the C'_B and C'_R channels using the following equation:

Example computer system implementation
Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA) or another configurable or programmable logic device (PLD), a discrete time or digital signal processor (DSP), an application specific IC (ASIC), and/or an apparatus that includes one or more of such systems, devices, or components. The computer and/or IC may perform, control, or execute instructions relating to signal reconstruction and coding of images with enhanced dynamic range, such as those described herein. The computer and/or IC may compute any of a variety of parameters or values that relate to the signal reconstruction and coding processes described herein. The image and video embodiments may be implemented in hardware, software, firmware, and various combinations thereof.

本発明の特定の態様は、本発明の方法をプロセッサに行わせるためのソフトウェア命令を実行するコンピュータプロセッサを含む。例えば、ディスプレイ、エンコーダ、セットトップボックス、トランスコーダなどの中の１つ以上のプロセッサは、そのプロセッサがアクセス可能なプログラムメモリ内にあるソフトウェア命令を実行することによって、上記のようなＨＤＲ画像の信号再構成および符号化に関する方法を実装し得る。本発明は、プログラム製品形態で提供されてもよい。このプログラム製品は、データプロセッサによって実行された時に本発明の方法をデータプロセッサに実行させるための命令を含む１セットの、コンピュータ読み取り可能な信号を格納する任意の非一時的媒体を含み得る。本発明によるプログラム製品は、様々な形態をとり得る。例えば、このプログラム製品は、フロッピーディスク、ハードディスクドライブを含む磁気データ記憶媒体、ＣＤＲＯＭ、ＤＶＤを含む光学データ記憶媒体、ＲＯＭ、フラッシュＲＡＭなどを含む電子データ記憶媒体、などの物理的媒体を含み得る。このプログラム製品上のコンピュータ可読信号は、任意に、圧縮または暗号化されていてもよい。 Particular aspects of the present invention include computer processors that execute software instructions to cause a processor to perform the methods of the present invention. For example, one or more processors in a display, encoder, set-top box, transcoder, etc. may execute a software instruction in a program memory accessible to the processor, thereby causing the HDR image signal as described above. Methods for reconstruction and encoding may be implemented. The present invention may be provided in the form of a program product. The program product may include any non-transitory medium that stores a set of computer-readable signals containing instructions for causing the data processor to perform the methods of the invention when executed by the data processor. The program product according to the present invention may take various forms. For example, the program product may include physical media such as floppy disks, magnetic data storage media including hard disk drives, CD ROM, optical data storage media including DVD, electronic data storage media including ROM, flash RAM, and the like. . The computer readable signal on the program product may optionally be compressed or encrypted.

Where a component (e.g., a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a "means") should be interpreted as including, as equivalents of that component, any component that performs the function of the described component (e.g., that is functionally equivalent), including components that are not structurally equivalent to the disclosed structure that performs the function in the illustrated example embodiments of the invention.

Equivalents, extensions, alternatives, etc.
Example embodiments relating to the efficient signal reconstruction and encoding of HDR images have been described above. In this specification, embodiments of the invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what the invention is, and of what the applicants intend the invention to be, is the set of claims that issue from this application, in the specific form in which such claims issue, including any subsequent correction. Any definitions expressly set forth herein for terms contained in such claims shall govern the meaning of such terms as used in the claims. Hence, no limitation, element, property, feature, advantage, or attribute that is not expressly recited in a claim should limit the scope of that claim in any way. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

1. A method for improving the processing of high dynamic range images, the method comprising:
accessing, by a processor, a high dynamic range input image in an RGB BT.2020 color space;
color-converting the high dynamic range input image into a first image in an XYZ color space;
applying a first 3 × 3 color conversion matrix to the first image to generate a second image in a first color space, wherein the first 3 × 3 color conversion matrix comprises:
[matrix not reproduced]
and wherein the first 3 × 3 color conversion matrix is generated by multiplying an XYZ-to-LMS 3 × 3 color conversion matrix by a crosstalk matrix;
generating a third image by applying a non-linear function to each color component of the second image; and
generating an output image by applying a second 3 × 3 color conversion matrix to the third image, wherein the second 3 × 3 color conversion matrix comprises:
[matrix not reproduced]
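
The steps of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `forward_reshape`, `rgb_to_xyz`, `xyz_to_lms`, `crosstalk`, `nonlinearity`, and `second_matrix` are hypothetical, the matrices are supplied by the caller because their numeric values are not reproduced in this text, and pixels are assumed to be three-component column vectors, so that "multiplying the XYZ-to-LMS matrix by the crosstalk matrix" is taken to mean `crosstalk @ xyz_to_lms`.

```python
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[float]]  # 3 x 3, row-major
Pixel = Sequence[float]             # three color components


def mat_vec(m: Matrix, v: Pixel) -> List[float]:
    """Multiply a 3 x 3 matrix by a 3-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def mat_mul(a: Matrix, b: Matrix) -> List[List[float]]:
    """Return the 3 x 3 matrix product a @ b."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def forward_reshape(rgb: Pixel,
                    rgb_to_xyz: Matrix,
                    xyz_to_lms: Matrix,
                    crosstalk: Matrix,
                    nonlinearity: Callable[[float], float],
                    second_matrix: Matrix) -> List[float]:
    """Apply the four processing steps of claim 1 to a single pixel."""
    # Step 1: RGB BT.2020 -> XYZ (first image).
    xyz = mat_vec(rgb_to_xyz, rgb)
    # Step 2: first 3 x 3 matrix = crosstalk applied on top of XYZ -> LMS
    # (multiplication order is an assumption for column vectors).
    first_matrix = mat_mul(crosstalk, xyz_to_lms)
    second_image = mat_vec(first_matrix, xyz)
    # Step 3: non-linear function applied to each color component.
    third_image = [nonlinearity(c) for c in second_image]
    # Step 4: second 3 x 3 matrix produces the output pixel.
    return mat_vec(second_matrix, third_image)
```

A caller would pass the matrices and the non-linear function defined by the specification; the sketch only fixes the order of operations.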

2. The method of claim 1, wherein the XYZ-to-LMS 3 × 3 color conversion matrix comprises:
[matrix not reproduced]
and the crosstalk matrix comprises:
[matrix not reproduced]

3. The method of claim 2, wherein color-converting the high dynamic range input image into the first image in the XYZ color space and applying the first 3 × 3 color conversion matrix to the first image to generate the second image in the first color space comprise generating a third 3 × 3 color conversion matrix, for conversion from the RGB BT.2020 color space to the first color space, by multiplying the color conversion matrix used for conversion from the RGB BT.2020 color space to the XYZ color space by the XYZ-to-LMS 3 × 3 color conversion matrix and the crosstalk matrix, wherein the third 3 × 3 color conversion matrix comprises:
[matrix not reproduced]
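
The matrix combination described in claim 3 can be sketched as below. This is an illustrative sketch only: the helper names are hypothetical, the multiplication order assumes column-vector pixels (RGB-to-XYZ applied first, crosstalk last), and no numeric matrix values from the claims are used because they are not reproduced in this text.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]  # 3 x 3, row-major


def mat_mul(a: Matrix, b: Matrix) -> List[List[float]]:
    """Return the 3 x 3 matrix product a @ b."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def mat_vec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a 3 x 3 matrix by a 3-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def compose_third_matrix(rgb_to_xyz: Matrix,
                         xyz_to_lms: Matrix,
                         crosstalk: Matrix) -> List[List[float]]:
    """Fold RGB -> XYZ, XYZ -> LMS, and crosstalk into one 3 x 3 matrix."""
    # For column vectors the rightmost factor is applied first.
    return mat_mul(crosstalk, mat_mul(xyz_to_lms, rgb_to_xyz))
```

Because matrix multiplication is associative, applying the composed matrix once per pixel gives the same result as applying the three matrices in sequence, while replacing three matrix-vector products with one.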

4. The method of claim 1, wherein the non-linear function comprises an inverse of an electro-optical transfer function (EOTF).

5. The method of claim 4, wherein the electro-optical transfer function is determined according to the SMPTE ST 2084 standard.
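
As an illustration of the non-linear function named in claims 4 and 5, the inverse EOTF of SMPTE ST 2084 (perceptual quantizer, PQ) can be sketched as below. The constants are those published in ST 2084; the function name `pq_inverse_eotf`, the clamping of out-of-range inputs, and the assumption that each color component is expressed in cd/m² relative to a 10,000 cd/m² peak are choices made for this sketch, not details taken from the claims.

```python
# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875
PEAK_NITS = 10000.0      # luminance that maps to code value 1.0


def pq_inverse_eotf(luminance_nits: float) -> float:
    """Map linear light in cd/m^2 to a PQ code value in [0, 1]."""
    # Normalize to [0, 1]; clamping is an assumption of this sketch.
    y = min(max(luminance_nits / PEAK_NITS, 0.0), 1.0)
    y_m1 = y ** M1
    return ((C1 + C2 * y_m1) / (1.0 + C3 * y_m1)) ** M2
```

In the method of claim 1 such a function would be applied independently to each of the three color components of the second image.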

6. A method for reconstructing, at a receiver, an input image encoded in a reconstructed color space, wherein the reconstructed color space is generated based on minimizing a hue cost function, the hue cost function is generated based on a wide color gamut space using a rotation matrix, and the hue cost function is based on a measure of the difference between a first hue value in a legacy color space and a rotated second hue value in the wide color gamut space, the method comprising:
receiving the input image in the reconstructed color space;
applying a first 3 × 3 color conversion matrix to generate a first image in a first color space, wherein the first 3 × 3 color conversion matrix comprises:
[matrix not reproduced]
generating a second image by applying a non-linear function to each color component of the first image; and
generating an output image in an RGB BT.2020 color space by applying a second 3 × 3 color conversion matrix to the second image, wherein the second 3 × 3 color conversion matrix comprises:
[matrix not reproduced]

7. The method of claim 6, wherein the non-linear function comprises an electro-optical transfer function (EOTF).

8. The method of claim 7, wherein the electro-optical transfer function is determined according to the SMPTE ST 2084 standard.
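
The receiver-side steps of claims 6 to 8 can be sketched in the same spirit. This is a hedged illustration, not the claimed implementation: `pq_eotf` follows the published SMPTE ST 2084 EOTF, while `decode_pixel`, its parameter names, and the input clamping are hypothetical, and both 3 × 3 matrices must be supplied by the caller because their values are not reproduced in this text.

```python
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[float]]  # 3 x 3, row-major

# SMPTE ST 2084 (PQ) constants.
M1 = 2610 / 16384        # 0.1593017578125
M2 = 2523 / 4096 * 128   # 78.84375
C1 = 3424 / 4096         # 0.8359375
C2 = 2413 / 4096 * 32    # 18.8515625
C3 = 2392 / 4096 * 32    # 18.6875
PEAK_NITS = 10000.0      # luminance represented by code value 1.0


def pq_eotf(code_value: float) -> float:
    """Map a PQ code value in [0, 1] to linear light in cd/m^2."""
    n = min(max(code_value, 0.0), 1.0)  # clamping is an assumption
    n_p = n ** (1.0 / M2)
    y = (max(n_p - C1, 0.0) / (C2 - C3 * n_p)) ** (1.0 / M1)
    return PEAK_NITS * y


def mat_vec(m: Matrix, v: Sequence[float]) -> List[float]:
    """Multiply a 3 x 3 matrix by a 3-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def decode_pixel(pixel: Sequence[float],
                 first_matrix: Matrix,
                 second_matrix: Matrix,
                 eotf: Callable[[float], float] = pq_eotf) -> List[float]:
    """Apply first matrix, per-component EOTF, then second matrix."""
    first_image = mat_vec(first_matrix, pixel)
    second_image = [eotf(c) for c in first_image]
    return mat_vec(second_matrix, second_image)
```

Passing a different callable as `eotf` covers other choices of non-linear function without changing the order of operations.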

9. The method of claim 6, wherein the non-linear function comprises a hybrid log-gamma (HLG) EOTF.

10. A method for reconstructing, at a receiver, an input image encoded in a reconstructed color space, wherein the reconstructed color space is generated based on minimizing a hue cost function, the hue cost function is generated based on a wide color gamut space using a rotation matrix, and the hue cost function is based on a measure of the difference between a first hue value in a legacy color space and a rotated second hue value in the wide color gamut space, the method comprising:
receiving the input image in the reconstructed color space;
applying a first 3 × 3 color conversion matrix to generate a first image in a first color space, wherein the first 3 × 3 color conversion matrix comprises:
[matrix not reproduced]
generating a second image by applying a non-linear function to each color component of the first image;
applying a second 3 × 3 color conversion matrix to the second image to generate a third image in an XYZ color space, wherein the second 3 × 3 color conversion matrix comprises:
[matrix not reproduced]
and generating an output image in a target color space by color-converting the third image from the XYZ color space to the target color space.

11. The method of claim 10, wherein the non-linear function comprises an electro-optical transfer function (EOTF).

12. The method of claim 11, wherein the electro-optical transfer function is determined according to the SMPTE ST 2084 standard.

13. The method of claim 6 or claim 10, wherein the reconstructed color space comprises an ICtCp color space.

14. An apparatus comprising a processor and configured to perform the method of any one of claims 1 to 13.

15. A non-transitory computer-readable storage medium storing computer-executable instructions for performing, with one or more processors, the method of any one of claims 1 to 13.