KR20090037475A - Multi-pass video encoding

Info

Publication number: KR20090037475A
Application number: KR1020097003420A
Authority: KR
Inventors: Xin Tong; Hsi-Jung Wu; Thomas Pun; Adriana Dumitras; Barin Haskell; Jim Normile
Original assignee: Apple Inc.
Priority date: 2004-06-27
Filing date: 2005-06-24
Publication date: 2009-04-15
Also published as: KR100909541B1; JP2008504750A; KR100988402B1; JP4988567B2; CN102833538A; JP5318134B2; KR20090034992A; WO2006004605B1; KR100997298B1; CN1926863A; KR20070011294A; CN102833538B; CN102833539A; EP1762093A2; JP2011151838A; EP1762093A4; WO2006004605A2; WO2006004605A3; CN1926863B; CN102833539B

Abstract

Some embodiments of the invention provide a multi-pass encoding method that encodes several images (e.g., several frames of a video sequence). The method iteratively performs an encoding operation that encodes these images. The encoding operation is based on a nominal quantization parameter, which the method uses to compute quantization parameters for the images. During several different iterations of the encoding operation, the method uses several different nominal quantization parameters. The method stops its iterations when it reaches a terminating criterion (e.g., it identifies an acceptable encoding of the images).

Description

Multi-pass video encoding method

Video encoders encode a sequence of video images (e.g., video frames) by using a variety of encoding schemes. Video encoding schemes typically encode video frames or portions of video frames (e.g., sets of pixels in a video frame) as intraframes or interframes. An intraframe-encoded frame or pixel set is encoded independently of other frames or of pixel sets in other frames. An interframe-encoded frame or pixel set is encoded by reference to one or more other frames or pixel sets in other frames.

When compressing video frames, some encoders implement a "rate controller," which provides a "bit budget" for a video frame or a set of video frames that are to be encoded. The bit budget specifies the number of bits allocated to encode the video frame or set of video frames. By allocating the bit budgets efficiently, the rate controller attempts to generate the highest-quality compressed video stream in view of certain constraints (e.g., a target bitrate).

To date, a variety of single-pass and multi-pass rate controllers have been proposed. A single-pass rate controller provides bit budgets for an encoding scheme that encodes a series of video images in one pass, whereas a multi-pass rate controller provides bit budgets for an encoding scheme that encodes a series of video images in multiple passes.

Single-pass rate controllers are useful in real-time encoding situations. Multi-pass rate controllers, on the other hand, optimize the encoding for a particular bitrate based on a set of constraints. To date, only a few rate controllers consider the spatial or temporal complexity of frames, or of pixel sets within frames, when controlling the bitrate of their encodings. Also, most multi-pass rate controllers do not adequately search the solution space for encoding solutions that use optimal quantization parameters for frames and/or pixel sets within frames in view of a desired bitrate.

Therefore, there is a need in the art for a rate controller that uses novel techniques to consider the spatial or temporal complexity of video images and/or portions of video images while controlling the bitrate for encoding a set of video images. There is also a need in the art for a multi-pass rate controller that adequately examines the encoding solutions to identify one that uses an optimal set of quantization parameters for video images and/or portions of video images.

Some embodiments of the invention provide a multi-pass encoding method that encodes several images (e.g., several frames of a video sequence). The method iteratively performs an encoding operation that encodes these images. The encoding operation is based on a nominal quantization parameter, which the method uses to compute quantization parameters for the images. During several different iterations of the encoding operation, the method uses several different nominal quantization parameters. The method stops its iterations when it reaches a terminating criterion (e.g., when it identifies an acceptable encoding of the images).

Some embodiments of the invention provide a method for encoding a video sequence. The method identifies a first attribute that quantifies the complexity of a first image in the video. It also identifies a quantization parameter for encoding the first image based on the identified first attribute. The method then encodes the first image based on the identified quantization parameter. In some embodiments, the method performs these three operations for several images in the video.

Some embodiments of the invention encode a sequence of video images based on "visual masking" attributes of the video images and/or portions of the video images. The visual masking of an image or image portion indicates how much coding artifact can be tolerated in that image or image portion. To express the visual masking attribute of an image or image portion, some embodiments compute a visual masking strength that quantifies the brightness energy of the image or image portion. In some embodiments, the brightness energy is measured as a function of the average luma or pixel energy of the image or image portion.

Instead of, or in conjunction with, the brightness energy, the visual masking strength of an image or image portion might also quantify the activity energy of the image or image portion. The activity energy expresses the complexity of the image or image portion. In some embodiments, the activity energy includes a spatial component that quantifies the spatial complexity of the image or image portion, and/or a motion component that quantifies the amount of distortion that can be tolerated or masked due to motion between images.

Some embodiments of the invention provide a method for encoding a video sequence. The method identifies a visual-masking attribute of a first image in the video. It also identifies a quantization parameter for encoding the first image based on the identified visual-masking attribute. The method then encodes the first image based on the identified quantization parameter.

The novel features of the invention are set forth in the appended claims. However, for purposes of explanation, several embodiments of the invention are set forth in the following figures.

In the following detailed description of the invention, numerous details, examples, and embodiments of the invention are set forth and described. However, it will be clear and apparent to one skilled in the art that the invention is not limited to the embodiments set forth and that the invention may be practiced without some of the specific details and examples discussed.

I. Definitions

This section provides definitions for several symbols that are used in this document.

R_T represents the target bitrate, which is the desired bitrate for encoding a sequence of frames. Typically, this bitrate is expressed in bits per second (bit/s) and is calculated from the desired final file size, the number of frames in the sequence, and the frame rate.

R_p represents the bitrate of the encoded bit stream at the end of pass p.

E_p represents the percentage of error in the bitrate at the end of pass p. In some cases, this percentage is calculated as 100 × (R_T − R_p) / R_T.

ε represents the error tolerance in the final bitrate.

ε_C represents the error tolerance in the bitrate for the first QP search stage.

QP represents the quantization parameter.

QP_Nom(p) represents the nominal quantization parameter that is used in the pass p encoding of a sequence of frames. The value of QP_Nom(p) is adjusted by the multi-pass encoder of the invention in a first QP adjustment stage to reach the target bitrate.

MQP_p(k) represents the masked frame QP, which is the quantization parameter (QP) for frame k in pass p. Some embodiments compute this value by using the nominal QP and frame-level visual masking.

MQP_MB(p)(k, m) represents the masked macroblock QP, which is the quantization parameter (QP) for an individual macroblock (with macroblock index m) in frame k and pass p. Some embodiments compute MQP_MB(p)(k, m) by using MQP_p(k) and macroblock-level visual masking.

φ_F(k) represents a value referred to as the masking strength for frame k. The masking strength φ_F(k) is a measure of complexity for the frame. In some embodiments, this value is used to determine how visible coding artifacts/noise would appear and to compute the MQP_p(k) of frame k.

φ_R(p) represents the reference masking strength in pass p. The reference masking strength is used to compute the MQP_p(k) of frame k, and it is adjusted by the multi-pass encoder of the invention in a second stage to reach the target bitrate.

φ_MB(k, m) represents the masking strength for the macroblock with index m in frame k. The masking strength φ_MB(k, m) is a measure of complexity for the macroblock. In some embodiments, this value is used to determine how visible coding artifacts/noise would appear and to compute MQP_MB(p)(k, m).

AMQP_p represents the average masked QP over the frames in pass p. In some embodiments, this value is computed as the average MQP_p(k) over all frames in pass p.
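As a small worked illustration of the bitrate error defined in this section, the following Python sketch computes the error percentage for one pass and compares it against a tolerance. It assumes the error is computed as 100 × (target − achieved) / target; the function names and the sample numbers are hypothetical and chosen only for illustration.

```python
def bitrate_error_percent(target_bitrate: float, pass_bitrate: float) -> float:
    """Return the signed bitrate error of a pass, in percent of the target."""
    if target_bitrate <= 0:
        raise ValueError("target bitrate must be positive")
    return 100.0 * (target_bitrate - pass_bitrate) / target_bitrate


def within_tolerance(error_percent: float, tolerance_percent: float) -> bool:
    """True when the absolute bitrate error does not exceed the tolerance."""
    return abs(error_percent) <= tolerance_percent


# Hypothetical numbers: a 2 Mbit/s target and a pass that produced 2.1 Mbit/s.
e_p = bitrate_error_percent(2_000_000, 2_100_000)
```

A negative value means the pass overshot the target; whether that sign convention matches a given encoder is an assumption to verify.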

II. Overview

Some embodiments of the invention provide an encoding method that achieves the best visual quality for encoding a sequence of frames at a given bitrate. In some embodiments, this method uses a visual masking process that assigns a quantization parameter QP to every macroblock. This assignment is based on the observation that coding artifacts/noise in brighter or spatially complex areas of an image or video frame are less visible than those in darker or flat areas.

In some embodiments, this visual masking process is performed as part of the multi-pass encoding process of the invention. To make the final encoded bit stream reach the target bitrate, this encoding process adjusts a nominal quantization parameter and controls the visual masking process through a reference masking strength parameter φ_R. As further described below, adjusting the nominal quantization parameter and controlling the masking algorithm adjusts the QP values for each picture (i.e., each frame in typical video encoding schemes) and for each macroblock within each picture.

In some embodiments, the multi-pass encoding process globally adjusts the nominal QP and φ_R for the entire sequence. In other embodiments, this process divides the video sequence into segments and adjusts the nominal QP and φ_R for each segment. The description below refers to a sequence of frames on which the multi-pass encoding process is employed. One of ordinary skill in the art will realize that this sequence includes the entire sequence in some embodiments, while it includes only a segment of a sequence in other embodiments.

In some embodiments, the method has three stages of encoding: (1) an initial analysis stage that is performed in pass 0, (2) a first search stage that is performed in pass 1 through pass N_1, and (3) a second search stage that is performed in pass N_1 + 1 through pass N_1 + N_2.

In the initial analysis stage (i.e., during pass 0), the method identifies an initial value for the nominal QP (QP_Nom(1), to be used in pass 1 of the encoding). During the initial analysis stage, the method also identifies a value of the reference masking strength φ_R, which is used in all the passes of the first search stage.

In the first search stage, the method performs N_1 iterations (i.e., N_1 passes) of an encoding process. For each frame k during each pass p, the process encodes the frame by using a particular quantization parameter MQP_p(k) and particular quantization parameters MQP_MB(p)(k, m) for the individual macroblocks m within frame k, where MQP_MB(p)(k, m) is computed using MQP_p(k).

In the first search stage, the quantization parameter MQP_p(k) changes between passes because it is derived from a nominal quantization parameter QP_Nom(p) that changes between passes. In other words, at the end of each pass p during the first search stage, the process computes a nominal QP_Nom(p+1) for pass p + 1. In some embodiments, the nominal QP_Nom(p+1) is based on the nominal QP value(s) and bitrate error(s) from the previous pass(es). In other embodiments, the nominal QP_Nom(p+1) value is computed differently at the end of each pass in the second search stage.

In the second search stage, the method performs N_2 iterations (i.e., N_2 passes) of the encoding process. As in the first search stage, the process encodes each frame k during each pass p by using a particular quantization parameter MQP_p(k) and particular quantization parameters MQP_MB(p)(k, m) for the individual macroblocks m within frame k, where MQP_MB(p)(k, m) is derived from MQP_p(k).

Also, as in the first search stage, the quantization parameter MQP_p(k) changes between passes. During the second search stage, however, this parameter changes because it is computed using a reference masking strength φ_R(p) that changes between passes. In some embodiments, the reference masking strength φ_R(p) is computed based on the value(s) of the reference masking strength and the error in the bitrate(s) from the previous pass(es). In other embodiments, this reference masking strength is computed to be a different value at the end of each pass in the second search stage.
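The control flow of the two search stages can be sketched as follows. This is a minimal Python sketch under loud assumptions: the update rules (step the nominal QP by one, step the reference masking strength by a fixed amount) are placeholders, since this section only says the new values depend on previous values and bitrate errors; `encode` is a hypothetical callback standing in for a full encoding pass; and the toy encoder assumes that a larger reference masking strength lowers the bitrate, which this section does not state.

```python
from typing import Callable, Tuple


def multi_pass_search(encode: Callable[[int, float], float],
                      target_bitrate: float, qp_nom: int, phi_r: float,
                      n1: int, n2: int, eps_c: float, eps: float,
                      phi_step: float = 0.02) -> Tuple[int, float, float]:
    """Return (qp_nom, phi_r, bitrate) reached by the two search stages."""
    def error_percent(bitrate: float) -> float:
        return 100.0 * (target_bitrate - bitrate) / target_bitrate

    # First search stage: phi_r stays fixed, the nominal QP changes per pass.
    bitrate = encode(qp_nom, phi_r)                 # pass 1
    for _ in range(n1 - 1):                         # at most passes 2..N_1
        err = error_percent(bitrate)
        if abs(err) <= eps_c:
            break
        # Placeholder rule: too many bits -> coarser quantization, else finer.
        qp_nom += 1 if err < 0 else -1
        bitrate = encode(qp_nom, phi_r)

    # Second search stage: the nominal QP stays fixed, phi_r changes per pass.
    for _ in range(n2):                             # at most N_2 more passes
        err = error_percent(bitrate)
        if abs(err) <= eps:
            break
        phi_r += phi_step if err < 0 else -phi_step  # placeholder rule
        bitrate = encode(qp_nom, phi_r)
    return qp_nom, phi_r, bitrate


def toy_encode(qp_nom: int, phi_r: float) -> float:
    """Stand-in for an encoding pass; returns a made-up bitrate in bit/s."""
    return 60000.0 / (qp_nom * (1.0 + phi_r))


qp, phi, rate = multi_pass_search(toy_encode, 2000.0, qp_nom=26, phi_r=0.0,
                                  n1=6, n2=5, eps_c=5.0, eps=1.0)
```

The coarse tolerance ends the first stage early, and the tighter final tolerance is then approached by varying only the reference masking strength, which mirrors the split between the two stages described above.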

Although the multi-pass encoding process is described in conjunction with the visual masking process, one of ordinary skill in the art will realize that an encoder does not need to use both of these processes together. For instance, in some embodiments, the multi-pass encoding process is used to encode a bitstream near a given target bitrate without visual masking, by ignoring φ_R and omitting the second search stage described above.

The visual masking and multi-pass encoding processes are further described in Sections III and IV of this document.

III. Visual masking

Given a nominal quantization parameter, the visual masking process first computes a masked frame quantization parameter (MQP) for each frame by using the reference masking strength (φ_R) and the frame masking strength (φ_F). The process then computes a masked macroblock quantization parameter (MQP_MB) for each macroblock, based on the frame-level and macroblock-level masking strengths (φ_F and φ_MB). When the visual masking process is employed in a multi-pass encoding process, the reference masking strength (φ_R) in some embodiments is identified during the first encoding pass, as mentioned above and further described below.

A. Computing the frame-level masking strength

1. First approach

To compute the frame-level masking strength φ_F(k), some embodiments use the following Equation 1:

φ_F(k) = C * power(E * avgFrameLuma(k), β) * power(D * avgFrameSAD(k), α_F)    (1)

where

ㆍ avgFrameLuma(k) is the average pixel intensity in frame k computed using b×b regions, where b is an integer greater than or equal to 1 (e.g., b = 1 or b = 4);

ㆍ avgFrameSAD(k) is the average of MbSAD(k, m) over all macroblocks in frame k;

ㆍ MbSAD(k, m) is the sum of the values given by the function Calc4x4MeanRemovedSAD(4x4_block_pixel_values) for all 4x4 blocks in the macroblock with index m;

ㆍ α_F, C, D, and E are constants and/or are adapted to the local statistics; and

ㆍ power(a, b) means a^b.

The function Calc4x4MeanRemovedSAD computes the mean of the pixel values in the given 4x4 block, subtracts that mean from each pixel value, and returns the sum of the absolute values of the resulting differences.
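The mean-removed SAD and the frame-level masking strength can be sketched in Python as follows. This is a minimal sketch, assuming Equation 1 has the form φ_F(k) = C * power(E * avgFrameLuma(k), β) * power(D * avgFrameSAD(k), α_F) and that a macroblock is a square array of luma samples whose side is a multiple of 4. The function names and the constant values used in the example are illustrative assumptions, not values taken from the source.

```python
from typing import Sequence

Block = Sequence[Sequence[float]]  # rows of luma samples


def calc_4x4_mean_removed_sad(block: Block) -> float:
    """Sum of absolute deviations of a 4x4 block from its own mean."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)


def mb_sad(macroblock: Block) -> float:
    """Sum of the mean-removed SAD over every 4x4 block of a macroblock."""
    size = len(macroblock)
    total = 0.0
    for top in range(0, size, 4):
        for left in range(0, size, 4):
            sub = [row[left:left + 4] for row in macroblock[top:top + 4]]
            total += calc_4x4_mean_removed_sad(sub)
    return total


def avg_frame_sad(macroblocks: Sequence[Block]) -> float:
    """Average macroblock SAD over all macroblocks of a frame."""
    return sum(mb_sad(mb) for mb in macroblocks) / len(macroblocks)


def frame_masking_strength(avg_frame_luma: float, frame_sad: float,
                           c: float, d: float, e: float,
                           alpha_f: float, beta: float) -> float:
    """phi_F = C * (E * avgFrameLuma)^beta * (D * avgFrameSAD)^alpha_F."""
    return c * (e * avg_frame_luma) ** beta * (d * frame_sad) ** alpha_f


# A 16x16 checkerboard of 0s and 2s: every 4x4 block has mean 1 and SAD 16.
checker = [[2 * ((r + c) % 2) for c in range(16)] for r in range(16)]
flat = [[128] * 16 for _ in range(16)]
```

A flat macroblock yields a SAD of zero, so with a positive exponent its spatial term contributes no masking, while the textured checkerboard yields a large SAD.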

2. Second approach

Other embodiments compute the frame-level masking strength differently. For instance, Equation 1 above essentially computes the frame masking strength as follows:

φ_F(k) = C * power(E * Brightness_Attribute, exponent0) * power(scalar * Spatial_Activity_Attribute, exponent1)

In this form of Equation 1, the frame's Brightness_Attribute equals avgFrameLuma(k), and the Spatial_Activity_Attribute equals avgFrameSAD(k), which is the average macroblock SAD (MbSAD(k, m)) value over all macroblocks in the frame. The macroblock SAD, in turn, equals the sum of the absolute values of the mean-removed 4x4 pixel variations (as given by Calc4x4MeanRemovedSAD) over all 4x4 blocks in the macroblock. This Spatial_Activity_Attribute measures the amount of spatial innovation in a region of pixels within the frame that is being coded.

Other embodiments expand the activity measure to include the amount of temporal innovation in a region of pixels across a number of successive frames. Specifically, these embodiments compute the frame masking strength as follows:

φ_F(k) = C * power(E * Brightness_Attribute, exponent0) * power(scalar * Activity_Attribute, exponent1)    (2)

In this equation, the Activity_Attribute is given by the following Equation 3:

Activity_Attribute = G * power(D * Spatial_Activity_Attribute, exponent_beta) + E * power(F * Temporal_Activity_Attribute, exponent_delta)    (3)

In some embodiments, the Temporal_Activity_Attribute quantifies the amount of distortion that can be tolerated (i.e., masked) due to motion between frames. In some of these embodiments, the Temporal_Activity_Attribute of a frame equals an integer multiple of the sum of the absolute values of the motion-compensated error signal of the pixel regions defined within the frame. In other embodiments, the Temporal_Activity_Attribute is provided by Equation 4 below:

Temporal_Activity_Attribute = Σ_{j=-N}^{-1} (W_j * avgFrameSAD(j)) + Σ_{j=1}^{M} (W_j * avgFrameSAD(j)) + avgFrameSAD(0)    (4)

In Equation 4, "avgFrameSAD" expresses (as described above) the average macroblock SAD (MbSAD(k, m)) value in a frame, avgFrameSAD(0) is the avgFrameSAD for the current frame, negative j indexes time instants before the current frame, and positive j indexes time instants after the current frame. Hence, avgFrameSAD(j = -2) expresses the average frame SAD of the frame two frames before the current frame, and avgFrameSAD(j = 3) expresses the average frame SAD of the frame three frames after the current frame.

Also, in Equation 4, the variables N and M refer to the number of frames before and after the current frame, respectively. Instead of simply selecting the values N and M based on a particular number of frames, some embodiments compute N and M based on particular durations of time before and after the time of the current frame. Correlating the motion masking to temporal durations is more advantageous than correlating it to a set number of frames, because the correlation with temporal durations directly matches the viewer's time-based visual perception. Correlating such masking to a number of frames, on the other hand, suffers from variable display durations, since different displays present video at different frame rates.

In Equation 4, "W" refers to a weighting factor, which in some embodiments decreases as frame j gets farther from the current frame. Also, in this equation, the first summation expresses the amount of motion that can be masked before the current frame, the second summation expresses the amount of motion that can be masked after the current frame, and the last expression (avgFrameSAD(0)) expresses the frame SAD of the current frame.

In some embodiments, the weighting factors are adjusted to account for scene changes. For instance, some embodiments account for an upcoming scene change within the look-ahead range (i.e., within the M frames), but not for any frames after the scene change. These embodiments might set the weighting factors to zero for frames within the look-ahead range that come after a scene change. Also, some embodiments do not account for frames prior to or on a scene change within the look-behind range (i.e., within the N frames). These embodiments might set the weighting factors to zero for frames within the look-behind range that relate to a previous scene or fall before the previous scene change.
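The weighted sum of Equation 4 and the scene-change handling can be sketched as follows. This is a minimal Python sketch, assuming the temporal activity is the current frame's average SAD plus weighted sums over N past and M future frames. The weight shape 1 / (1 + |j|), the parameter names, and the helper that turns a time interval into a frame count are all hypothetical choices made for illustration.

```python
from typing import Callable, Optional, Sequence


def frames_in_interval(seconds: float, frame_rate: float) -> int:
    """Whole number of frames covered by a time interval (for N or M)."""
    return int(seconds * frame_rate)


def scene_aware_weight(j: int, next_scene_start: Optional[int] = None,
                       prev_scene_end: Optional[int] = None) -> float:
    """Hypothetical W_j: decays with |j|, zero across a scene change.

    next_scene_start: smallest j > 0 that already belongs to the next scene.
    prev_scene_end:   largest j < 0 that still belongs to the previous scene.
    """
    if next_scene_start is not None and j >= next_scene_start:
        return 0.0
    if prev_scene_end is not None and j <= prev_scene_end:
        return 0.0
    return 1.0 / (1 + abs(j))


def temporal_activity(past_sad: Sequence[float], future_sad: Sequence[float],
                      current_sad: float,
                      weight: Callable[[int], float]) -> float:
    """past_sad[0] is frame j = -1, past_sad[1] is j = -2; future_sad[0] is j = +1."""
    total = current_sad
    for i, sad in enumerate(past_sad, start=1):
        total += weight(-i) * sad
    for i, sad in enumerate(future_sad, start=1):
        total += weight(i) * sad
    return total


plain = temporal_activity([10.0, 20.0], [30.0], 40.0, scene_aware_weight)
after_cut = temporal_activity(
    [10.0, 20.0], [30.0], 40.0,
    lambda j: scene_aware_weight(j, prev_scene_end=-2))
```

In the second call the frame at j = -2 belongs to a previous scene, so its weight is zero and it no longer contributes to the sum.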

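3. Variations on the second approach

a) Limiting the influence of past or future frames on the Temporal_Activity_Attribute

Equation 4 above essentially expresses the Temporal_Activity_Attribute in the following terms:

Temporal_Activity_Attribute = Past_Frame_Activity + Future_Frame_Activity + Current_Frame_Activity,

where Past_Frame_Activity (PFA) equals Σ_{j=-N}^{-1} (W_j * avgFrameSAD(j)), Future_Frame_Activity (FFA) equals Σ_{j=1}^{M} (W_j * avgFrameSAD(j)), and Current_Frame_Activity (CFA) equals avgFrameSAD(0).

Some embodiments modify the calculation of the Temporal_Activity_Attribute so that neither the Past_Frame_Activity nor the Future_Frame_Activity unduly controls the value of the Temporal_Activity_Attribute. For instance, some embodiments initially define PFA to equal Σ_{j=-N}^{-1} (W_j * avgFrameSAD(j)), and FFA to equal Σ_{j=1}^{M} (W_j * avgFrameSAD(j)).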

These embodiments then determine whether PFA is larger than a scalar multiple of FFA. If so, they set PFA equal to an upper PFA limit value (e.g., a scalar multiple of FFA). In addition to setting PFA equal to the upper PFA limit value, some embodiments may also set FFA to zero, set CFA to zero, or both. Other embodiments may set either or both of PFA and CFA to a weighted combination of PFA, CFA, and FFA.

Analogously, after initially defining the PFA and FFA values based on the weighted sums, some embodiments also determine whether the FFA value is larger than a scalar multiple of PFA. If so, they set FFA equal to an upper FFA limit value (e.g., a scalar multiple of PFA). In addition to setting FFA equal to the upper FFA limit value, some embodiments may also set PFA to zero, set CFA to zero, or both. Other embodiments may set either or both of FFA and CFA to a weighted combination of FFA, CFA, and PFA.

This potential subsequent adjustment of the PFA and FFA values (after the initial computation of these values based on the weighted sums) prevents either of these values from unduly controlling the Temporal_Activity_Attribute.
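The capping of the past and future frame activities can be sketched as follows. This is a minimal Python sketch of only the simplest variant described above (each value is capped at a scalar multiple of the other); the optional zeroing of other terms and the weighted-combination variants are not shown. The default multiplier of 2.0 and the function names are illustrative assumptions.

```python
from typing import Tuple


def limit_past_future(pfa: float, ffa: float,
                      scalar: float = 2.0) -> Tuple[float, float]:
    """Cap PFA at scalar * FFA and FFA at scalar * PFA.

    Both caps are computed from the original values so that they do not
    interact; with scalar >= 1 at most one of the caps takes effect.
    """
    pfa_cap = scalar * ffa
    ffa_cap = scalar * pfa
    return min(pfa, pfa_cap), min(ffa, ffa_cap)


def limited_temporal_activity(pfa: float, ffa: float, cfa: float,
                              scalar: float = 2.0) -> float:
    """Temporal activity = limited PFA + limited FFA + CFA."""
    pfa, ffa = limit_past_future(pfa, ffa, scalar)
    return pfa + ffa + cfa
```

For example, with a multiplier of 2, a past activity of 10 next to a future activity of 2 is capped at 4, so a burst of motion before the current frame cannot dominate the result.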

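b) Limiting the influence of the Spatial_Activity_Attribute and the Temporal_Activity_Attribute on the Activity_Attribute

Equation 3 above essentially expresses the Activity_Attribute in the following terms:

Activity_Attribute = Spatial_Activity + Temporal_Activity,

where Spatial_Activity (SA) equals scalar * (scalar * Spatial_Activity_Attribute)^β, and Temporal_Activity (TA) equals scalar * (scalar * Temporal_Activity_Attribute)^Δ.

Some embodiments modify the calculation of the Activity_Attribute so that neither the Spatial_Activity nor the Temporal_Activity unduly controls the value of the Activity_Attribute. For instance, some embodiments initially define SA to equal scalar * (scalar * Spatial_Activity_Attribute)^β, and define TA to equal scalar * (scalar * Temporal_Activity_Attribute)^Δ.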

These embodiments then determine whether SA is larger than a scalar multiple of TA. If so, they set SA equal to an upper SA limit value (e.g., a scalar multiple of TA). In addition to setting SA equal to the upper SA limit, some embodiments in such a case may also set the TA value to zero or to a weighted combination of TA and SA.

Analogously, after initially defining the SA and TA values based on the exponential expressions, some embodiments also determine whether the TA value is larger than a scalar multiple of SA. If so, they set TA equal to an upper TA limit value (e.g., a scalar multiple of SA). In addition to setting TA equal to the upper TA limit, some embodiments in such a case may also set the SA value to zero or to a weighted combination of SA and TA.

This potential subsequent adjustment of the SA and TA values (after the initial computation of these values based on the exponential expressions) prevents either of these values from unduly controlling the Activity_Attribute.
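The combined activity measure with its optional limit can be sketched as follows. This is a minimal Python sketch, assuming the Activity_Attribute is the sum of a spatial term scalar * (scalar * Spatial_Activity_Attribute)^β and a temporal term scalar * (scalar * Temporal_Activity_Attribute)^Δ. The scalar names g, d, e, f, their default values, and the `limit` parameter are illustrative assumptions, and only the simple capping variant is shown.

```python
from typing import Optional


def activity_attribute(spatial_attr: float, temporal_attr: float,
                       beta: float, delta: float,
                       g: float = 1.0, d: float = 1.0,
                       e: float = 1.0, f: float = 1.0,
                       limit: Optional[float] = None) -> float:
    """Return SA + TA, optionally capping each term at limit * the other."""
    sa = g * (d * spatial_attr) ** beta      # Spatial_Activity
    ta = e * (f * temporal_attr) ** delta    # Temporal_Activity
    if limit is not None:
        # Compute both caps from the uncapped values before applying them.
        sa_capped = min(sa, limit * ta)
        ta_capped = min(ta, limit * sa)
        sa, ta = sa_capped, ta_capped
    return sa + ta
```

With spatial and temporal attributes of 16 and 4 and both exponents at 0.5, the uncapped terms are 4 and 2; a limit of 1.5 reduces the spatial term to 3.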

B. 매크로블록-레벨 마스킹 강도의 계산B. Calculation of Macroblock-Level Masking Intensity

1. 제1 접근법1. The first approach

In some embodiments, the macroblock-level masking strength is calculated as follows:

[equation not reproduced in this text]

where:

ㆍ one term of the equation is the average pixel intensity in macroblock m of frame k;

ㆍ the remaining parameters, including C, are constants and/or are adapted to local statistics.

2. Second Approach

Equation 5 above essentially computes the macroblock masking strength as follows:

[equation not reproduced in this text]

In Equation 5, two attributes of the macroblock are each set equal to a term of the equation (the symbols are not reproduced in this text). One of these, the macroblock's spatial activity attribute, measures the amount of spatial innovation in the region of pixels within the macroblock being coded.

As in the case of the frame masking strength, some embodiments may extend the activity measure in the macroblock masking strength to include the amount of temporal innovation in a region of pixels across a number of successive frames. Specifically, these embodiments compute the macroblock masking strength as follows:

[equation not reproduced in this text]

where Mb_Activity_Attribute is given by Equation 7:

[equation not reproduced in this text]

The calculation of Mb_Temporal_Activity_Attribute for a macroblock can be analogous to the calculation of the frame-level Temporal_Activity_Attribute described above. For example, in some of these embodiments, Mb_Temporal_Activity_Attribute is given by Equation 8:

[equation not reproduced in this text]

The variables in Equation 8 were defined in Section III.A. In Equation 5, macroblock m in frame i or j can be the macroblock at the same location as macroblock m in the current frame, or it can be the macroblock in frame i or j that was initially predicted to correspond to macroblock m in the current frame.

The Mb_Temporal_Activity_Attribute provided by Equation 8 can be modified in a manner analogous to the modifications (discussed in Section III.A.3 above) of the frame Temporal_Activity_Attribute provided by Equation 4. Specifically, it can be modified to limit the undue influence of macroblocks in past or future frames.

Similarly, the Mb_Activity_Attribute provided by Equation 7 can be modified in a manner analogous to the modifications (discussed in Section III.A.3 above) of the frame Activity_Attribute provided by Equation 3. Specifically, it can be modified to limit the undue influence of Mb_Spatial_Activity_Attribute and Mb_Temporal_Activity_Attribute.

C. Calculation of Masked QP Values

Based on the values of the masking strength (at the frame level and at the macroblock level) and the value of the reference masking strength φ_R, the visual masking process can compute masked QP values at the frame level and at the macroblock level by using two functions. The pseudo code for these two functions is as follows:

[pseudo code not reproduced in this text]

In these functions, two of the parameters can be predetermined constants or can be adapted to local statistics.

Ⅳ. Multi-Pass Encoding

FIG. 1 presents a process 100 that conceptually illustrates the multi-pass encoding method of some embodiments of the invention. As shown in this figure, process 100 has three stages, which are described in the following three subsections.

A. Analysis and Initial QP Selection

As shown in FIG. 1, process 100 first computes (at 105) the initial value of the reference masking strength, φ_R(1), and the initial value of the nominal quantization parameter, QP_Nom(1), during the initial analysis stage of the multi-pass encoding process (i.e., during pass 0). The initial reference masking strength φ_R(1) is used during the first search stage, while the initial nominal quantization parameter QP_Nom(1) is used during the first pass of the first search stage (i.e., during pass 1 of the multi-pass encoding process).

At the beginning of pass 0, φ_R(0) can be some arbitrary value or a value selected based on experimental results (e.g., the middle value of a typical range of φ_R values). During the analysis of the sequence, a masking strength is computed for each frame k, and the reference masking strength φ_R(1) is then set, at the end of pass 0, equal to the average of these per-frame masking strengths. Other choices for the reference masking strength φ_R are also possible. For example, it can be computed as the median of the per-frame masking strengths, or as another arithmetic function of them, such as a weighted average.
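The three choices named above (average, median, weighted average) can be sketched as follows. This is an illustrative helper, not the patent's code: the function name `reference_masking_strength`, the `method` switch, and the `weights` argument are assumptions introduced only for this example.

```python
from statistics import mean, median


def reference_masking_strength(frame_strengths, method="mean", weights=None):
    """Reduce per-frame masking strengths to one reference value."""
    if not frame_strengths:
        raise ValueError("at least one frame masking strength is required")
    if method == "mean":
        return mean(frame_strengths)
    if method == "median":
        return median(frame_strengths)
    if method == "weighted":
        if weights is None or len(weights) != len(frame_strengths):
            raise ValueError("one weight per frame is required")
        # Weighted average: sum(w_k * strength_k) / sum(w_k)
        return sum(w * s for w, s in zip(weights, frame_strengths)) / sum(weights)
    raise ValueError(f"unknown method: {method}")
```

The median variant is less sensitive to a few frames with unusually high masking strength than the plain average.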

There are several approaches to initial QP selection, of varying complexity. For example, the initial nominal QP can be set to an arbitrary value (e.g., 26). Alternatively, a value can be selected that is known, from coding experiments, to produce acceptable quality at the target bitrate.

The initial nominal QP value can also be selected from a look-up table based on spatial resolution, frame rate, spatial/temporal complexity, and target bitrate. In some embodiments, this initial nominal QP value is selected from the table using a distance measure that depends on each of these parameters, or using a weighted distance measure of these parameters.

The initial nominal QP value can also be set to an adjusted average of the frame QP values selected during a fast encoding with a rate controller (without masking), where the average is adjusted based on the bitrate percentage error E_0 for pass 0. Similarly, the initial nominal QP can be set to a weighted and adjusted average of the frame QP values, where the weight for each frame is determined by the percentage of macroblocks in that frame that are not coded as skipped macroblocks. Alternatively, the initial nominal QP can be set to an adjusted average, or an adjusted and weighted average, of the frame QP values selected during a fast encoding with a rate controller (with masking), as long as the effect of changing the reference masking strength from φ_R(0) to φ_R(1) is taken into account.
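As a rough sketch of the weighted and adjusted average, the example below weights each frame QP by the fraction of its macroblocks that are not skipped and then shifts the result by the pass-0 bitrate error. The text does not state the form of the adjustment, so the additive term `gain * error_pct`, the sign convention (positive error meaning too many bits), and all names here are assumptions.

```python
def initial_nominal_qp(frame_qps, coded_fractions, error_pct=0.0, gain=0.1):
    """Weighted average of frame QPs, shifted by the pass-0 bitrate error.

    coded_fractions[k] is the fraction of non-skipped macroblocks in frame k.
    The additive adjustment gain * error_pct is a hypothetical choice.
    """
    total_weight = sum(coded_fractions)
    if total_weight == 0:
        raise ValueError("all frames consist of skipped macroblocks")
    weighted = sum(q * w for q, w in zip(frame_qps, coded_fractions)) / total_weight
    return weighted + gain * error_pct
```

Frames made up mostly of skipped macroblocks contribute little to the average, so their QP values have little effect on the starting point of the search.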

B. First Search Stage: Nominal QP Adjustment

After 105, the multi-pass encoding process 100 enters the first search stage. In the first search stage, process 100 performs N_1 encodings of the sequence, where N_1 is the number of passes in the first search stage. During each pass of the first stage, the process uses a changing nominal quantization parameter with a constant reference masking strength.

Specifically, during each pass p of the first search stage, process 100 computes (at 107) a particular quantization parameter for each frame k and a particular quantization parameter for each individual macroblock m within frame k. The calculation of these frame-level and macroblock-level parameters for a given nominal quantization parameter QP_Nom(p) and reference masking strength φ_R(p) was described in Section III (they are computed by using the two functions described in that section). In the first pass through 107 (i.e., pass 1), the nominal quantization parameter and the first-stage reference masking strength are the parameter QP_Nom(1) and the reference masking strength φ_R(1) that were computed during the initial analysis stage 105.

After 107, the process encodes (at 110) the sequence based on the quantization parameter values computed at 107. The encoding process 100 then determines (at 115) whether it should terminate. Different embodiments have different criteria for terminating the overall encoding process. Examples of exit conditions that completely terminate the multi-pass encoding process include:

ㆍ |E_p| < ε, where E_p is the bitrate error of pass p and ε is the error tolerance in the final bitrate.

ㆍ QP_Nom(p) is at the upper or lower bound of the valid range of QP values.

ㆍ The number of passes has exceeded the maximum number of allowable passes, P_MAX.

Some embodiments might use all of these exit conditions, while other embodiments might use only some of them. Yet other embodiments might use other exit conditions for terminating the encoding process.
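The three example exit conditions can be combined into a single check, as sketched below. The function name, the default tolerance, the pass limit, and the QP range of 0 to 51 (the range used by H.264) are assumptions chosen for illustration; an embodiment may use only some of these tests.

```python
def should_terminate(error, qp_nom, passes_done,
                     eps=0.01, qp_min=0, qp_max=51, max_passes=10):
    """Return True when any of the example exit conditions holds."""
    if abs(error) < eps:                      # bitrate error within tolerance
        return True
    if qp_nom <= qp_min or qp_nom >= qp_max:  # nominal QP at a bound of its range
        return True
    return passes_done > max_passes           # pass budget exceeded
```

Checking the error first means that an acceptable encoding ends the process even when the nominal QP is still well inside its valid range.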

When the multi-pass encoding process decides (at 115) to terminate, process 100 skips the second search stage and transitions to 145. At 145, the process saves the bitstream from the last pass p as the final result and then terminates.

On the other hand, when the process determines (at 115) that it should not terminate, it then determines (at 120) whether it should terminate the first search stage. Again, different embodiments have different criteria for terminating the first search stage. Examples of exit conditions that terminate the first search stage of the multi-pass encoding process include:

ㆍ The nominal QP computed for the next pass is the same as the nominal QP of an earlier pass (in this case, the bitrate error cannot be lowered any further by modifying the nominal QP).

ㆍ |E_p| < ε_C, with ε_C > ε, where ε_C is the error tolerance in the bitrate for the first search stage.

ㆍ The number of passes has exceeded P_1, where P_1 is less than P_MAX.

ㆍ The number of passes has exceeded P_2, where P_2 is less than P_1, and a further condition holds [expression not reproduced in this text].

Some embodiments might use all of these exit conditions, while other embodiments might use only some of them. Yet other embodiments might use other exit conditions for terminating the first search stage.

When the multi-pass encoding process decides (at 120) to terminate the first search stage, process 100 proceeds to the second search stage, which is described in the next subsection. On the other hand, when the process determines (at 120) that it should not terminate the first search stage, it updates (at 125) the nominal QP for the next pass of the first search stage (i.e., it defines QP_Nom(p+1)). In some embodiments, the nominal QP is updated as follows. At the end of pass 1, these embodiments define the next nominal QP by an equation that involves a constant:

[equation not reproduced in this text]

Then, at the end of each pass from pass 2 to pass N_1, these embodiments define the next nominal QP by using the InterpExtrap function, which is described further below:

[equation not reproduced in this text]

In this equation, q1 and q2 are the pass numbers with the lowest corresponding bitrate errors among all passes up to pass p, and q1, q2, and p have the following relationship:

[relationship not reproduced in this text]

The following is the pseudo code for the InterpExtrap function. Note that when x is not between x1 and x2, this function is an extrapolation function; otherwise, it is an interpolation function.
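The pseudo code itself does not appear in this text. A minimal sketch, assuming the usual two-point linear form, is given below; the names x, x1, and x2 come from the description, while y1, y2, the snake-case function name, and the handling of the degenerate case x1 == x2 are assumptions.

```python
def interp_extrap(x, x1, x2, y1, y2):
    """Evaluate at x the straight line through (x1, y1) and (x2, y2).

    Interpolates when x lies between x1 and x2 and extrapolates otherwise.
    """
    if x2 == x1:
        return y1  # degenerate case: the two points share one abscissa
    return y1 + (x - x1) * (y2 - y1) / (x2 - x1)
```

In the search stages, x would be the target bitrate error (zero), x1 and x2 the errors of two earlier passes, and y1 and y2 the parameter values used in those passes.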

The nominal QP value is typically rounded to an integer value and clipped to lie within the valid range of QP values. One of ordinary skill in the art will realize that other embodiments might compute the nominal QP value differently from the approaches described above.
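The update, rounding, and clipping steps can be put together roughly as follows. This sketch assumes that the two passes with the smallest absolute bitrate error are used, that a positive error means too many bits, and that the valid QP range is 0 to 51; it does not enforce any ordering between the two selected passes, and all names are hypothetical.

```python
def next_nominal_qp(history, qp_min=0, qp_max=51):
    """Estimate the nominal QP that would give zero bitrate error.

    history holds one (qp_nom, bitrate_error) tuple per completed pass
    and must contain at least two entries.
    """
    # Pick the two passes with the smallest absolute bitrate error.
    (qp1, e1), (qp2, e2) = sorted(history, key=lambda h: abs(h[1]))[:2]
    if e1 == e2:
        estimate = qp1
    else:
        # Line through (e1, qp1) and (e2, qp2), evaluated at error = 0.
        estimate = qp1 + (0.0 - e1) * (qp2 - qp1) / (e2 - e1)
    # Round to an integer, then clip to the valid QP range.
    return max(qp_min, min(qp_max, round(estimate)))
```

Note that Python's `round` rounds exact halves to the nearest even integer; an encoder that needs a different tie-breaking rule would replace that call.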

After 125, the process transitions back to 107 to start the next pass (i.e., p := p+1), in which it computes (at 107), for the current pass p, a particular quantization parameter for each frame k and a particular quantization parameter for each individual macroblock within frame k. The process then encodes (at 110) the sequence of frames based on these newly computed quantization parameters. From 110, the process transitions to 115, which was described above.

C. Second Search Stage: Reference Masking Strength Adjustment

When process 100 determines (at 120) that it should terminate the first search stage, it transitions to 130. In the second search stage, process 100 performs N_2 encodings of the sequence, where N_2 is the number of passes in the second search stage. During each pass, the process uses the same nominal quantization parameter and a changing reference masking strength.

At 130, process 100 computes a reference masking strength φ_R(p+1) for the next pass, i.e., pass p+1, which is pass N_1+1. In pass N_1+1, process 100 encodes the sequence of frames at 135. Different embodiments compute (at 130) the reference masking strength φ_R(p+1) at the end of a pass p in different ways. Two alternative approaches are described below.

Some embodiments compute the reference masking strength based on the error(s) in bitrate and the value(s) of φ_R from the previous pass(es). For example, at the end of pass N_1, some embodiments define the reference masking strength for the next pass as follows:

[equation not reproduced in this text]

At the end of pass N_1+m (where m is an integer greater than 1), some embodiments define:

[equation not reproduced in this text]

Alternatively, some embodiments define:

[equation not reproduced in this text]

where q1 and q2 are the previous passes that gave the best errors.

Other embodiments compute the reference masking strength at the end of each pass of the second search stage by using AMQP, which was defined in Section I. One way of computing the AMQP for a given nominal QP and some value of φ_R is described below by reference to the pseudo code of a function:

[pseudo code not reproduced in this text]

Some embodiments that use AMQP compute a desired AMQP for pass p+1 based on the error(s) in bitrate and the value(s) of AMQP from the previous pass(es). The reference masking strength corresponding to this AMQP is then found through a search procedure given by the Search function, whose pseudo code is given at the end of this subsection.

For example, at the end of pass N_1, some embodiments compute the AMQP for the next pass from a two-case definition:

[equation and its two conditions not reproduced in this text]

These embodiments then define the reference masking strength for the next pass:

[equation not reproduced in this text]

At the end of pass N_1+m (where m is an integer greater than 1), some embodiments define two further quantities:

[equations not reproduced in this text]

Given a desired AMQP and some default value of φ_R, the φ_R corresponding to the desired AMQP can be found, in some embodiments, by using a Search function that has the following pseudo code:

[pseudo code not reproduced in this text]

In the pseudo code above, the numbers 10, 12, and 0.05 can be replaced with suitably chosen thresholds.

After computing the reference masking strength for the next pass (pass p+1), process 100 transitions to 132 and starts the next pass (i.e., p := p+1). During each encoding pass p, the process computes (at 132) a particular quantization parameter for each frame k and a particular quantization parameter for each individual macroblock m within frame k. The calculation of these parameters for a given nominal quantization parameter and reference masking strength was described in Section III (they are computed by using the two functions described in that section). During the first pass through 132, the reference masking strength is the one that was just computed at 130. Also, the nominal QP remains constant throughout the second search stage. In some embodiments, the nominal QP throughout the second search stage is the nominal QP that produced the best encoding solution (i.e., the encoding solution with the lowest bitrate error) during the first search stage.

After 132, the process encodes (at 135) the frame sequence using the quantization parameters computed at 132. After 135, the process determines (at 140) whether it should terminate the second search stage. Different embodiments use different criteria for terminating the second search stage at the end of pass p.

Examples of such criteria are:

ㆍ |E_p| < ε, where E_p is the bitrate error of pass p and ε is the error tolerance in the final bitrate.

ㆍ The number of passes exceeds the maximum number of passes allowed.

Some embodiments might use all of these exit conditions, while other embodiments might use only some of them. Yet other embodiments might use other exit conditions for terminating the second search stage.

When process 100 determines (at 140) that it should not terminate the second search stage, it returns to 130 to recompute the reference masking strength for the next encoding pass. From 130, the process transitions to 132 to compute the quantization parameters and then to 135 to encode the video sequence using the newly computed quantization parameters.

On the other hand, when the process decides (at 140) to terminate the second search stage, it transitions to 145. At 145, process 100 saves the bitstream from the last pass p as the final result and then terminates.

V. Decoder Input Buffer Underflow Control

Some embodiments of the invention provide a multi-pass encoding process that examines various encodings of a video sequence for a target bitrate, in order to identify an optimal encoding solution with respect to the usage of the input buffer used by the decoder. In some embodiments, this multi-pass process follows the multi-pass encoding process 100 of FIG. 1.

The usage of the decoder input buffer ("decoder buffer") will fluctuate to some degree during the decoding of an encoded sequence of images (e.g., frames), because of several factors, such as fluctuations in the size of the encoded images, the rate at which the decoder receives encoded data, the size of the decoder buffer, and the speed of the decoding process.

Decoder buffer underflow refers to the situation in which the decoder is ready to decode the next image before that image has completely arrived at the decoder side. The multi-pass encoder of some embodiments simulates the decoder buffer and re-encodes selected segments of the sequence to prevent decoder buffer underflow.

FIG. 2 illustrates a codec system 200 of some embodiments of the invention. This system includes a decoder 205 and an encoder 210. In this figure, the encoder 210 has several components that enable it to simulate the operation of the corresponding components of the decoder 205.

Specifically, the decoder 205 includes an input buffer 215, a decoding process 220, and an output buffer 225. The encoder 210 simulates these modules by maintaining a simulated decoder input buffer 230, a simulated decoding process 235, and a simulated decoder output buffer 240. To avoid obscuring the description of the invention, FIG. 2 is simplified to show the decoding process 220 and the encoding process 245 as single blocks. Also, in some embodiments, the simulated decoding process 235 and the simulated decoder output buffer 240 are not used for buffer underflow management, and are therefore shown in this figure for illustration only.

The decoder maintains the input buffer 215 to smooth out variations in the rate and arrival time of the incoming encoded images. If the decoder runs out of data (underflow) or fills up the input buffer (overflow), there will be visible decoding discontinuities, because picture decoding halts or incoming data is discarded. Both of these cases are undesirable.

To eliminate the underflow condition, the encoder 210 in some embodiments first encodes a sequence of images and stores the encoded sequence in a storage 255. For example, the encoder 210 uses the multi-pass encoding process 100 to obtain a first encoding of the sequence of images. It then simulates the decoder input buffer 215 and re-encodes the images that would cause buffer underflow. After all buffer underflow conditions are removed, the re-encoded images are supplied to the decoder 205 through a connection 255, which may be a network connection (Internet, cable, PSTN lines, etc.), a non-network direct connection, media (DVD, etc.), and so on.

FIG. 3 illustrates an encoding process 300 of the encoder of some embodiments. This process tries to find an optimal encoding solution that does not cause the decoder buffer to underflow. As shown in FIG. 3, process 300 identifies (at 302) a first encoding of the sequence of images that meets a desired target bitrate (e.g., the average bitrate for each image in the sequence meets a desired average target bitrate). For example, process 300 may use (at 302) the multi-pass encoding process 100 to obtain the first encoding of the sequence of images.

After 302, the encoding process 300 simulates (at 305) the decoder input buffer 215 by considering a variety of factors, such as the connection speed (i.e., the speed at which the decoder receives encoded data), the size of the decoder input buffer, the size of the encoded images, and the decoding process speed. At 310, process 300 determines whether any segment of the encoded images will cause the decoder input buffer to underflow. The techniques that the encoder uses to determine (and subsequently eliminate) the underflow condition are described further below.
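A highly simplified version of such a buffer simulation is sketched below. It assumes a constant-rate channel, removal of one frame per frame period after a fixed initial delay, and an unbounded buffer (so overflow is ignored); the function name and all parameter names are hypothetical, and a real simulation would also model the other factors listed above.

```python
def first_underflow(frame_bits, rate_bps, fps, initial_delay_s):
    """Return the index of the first frame that underflows the buffer, or None.

    Frame n underflows if its last bit arrives after the time at which
    the decoder removes frame n from the input buffer.
    """
    arrived_bits = 0
    for n, bits in enumerate(frame_bits):
        arrived_bits += bits
        arrival_time = arrived_bits / rate_bps   # when frame n has fully arrived
        removal_time = initial_delay_s + n / fps  # when the decoder needs frame n
        if arrival_time > removal_time:
            return n
    return None
```

For example, at 8000 bits per second, 4 frames per second, and a 0.5 second initial delay, a 6000-bit second frame following a 2000-bit first frame arrives at 1.0 s but is needed at 0.75 s, so the call reports frame index 1.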

When process 300 determines (at 310) that the encoded images do not cause an underflow condition, the process ends. On the other hand, when process 300 determines (at 310) that a buffer underflow condition exists in any segment of the encoded images, it refines (at 315) the encoding parameters based on the values of these parameters from the previous encoding passes. The process then re-encodes (at 320) the segment with underflow to reduce the bit size of the segment. After re-encoding the segment, process 300 examines (at 325) the segment to determine whether the underflow condition has been eliminated.

When the process determines (at 325) that the segment still causes underflow, process 300 transitions to 315 to further refine the encoding parameters in order to eliminate the underflow. Otherwise, when the process determines (at 325) that the segment will not cause any underflow, it specifies (at 330) the starting point for re-examining and re-encoding the video sequence as the frame after the end of the segment re-encoded in the last iteration at 320. Next, at 335, the process re-encodes the portion of the video sequence specified at 330, up to (and excluding) the first IDR frame that follows the underflow segment specified at 315 and 320. After 335, the process transitions back to 305 to simulate the decoder buffer and determine whether the rest of the video sequence still causes buffer underflow after the re-encoding. The flow of process 300 from 305 was described above.

A. Determining Underflow Segments in a Sequence of Encoded Images

As described above, the encoder simulates the decoder buffer state to determine whether any segment in the sequence of encoded or re-encoded images causes an underflow in the decoder buffer. In some embodiments, the encoder uses a simulation model that takes into account the sizes of the encoded images, network conditions such as bandwidth, and decoder factors (e.g., input buffer size, initial and nominal times for removing images, decoding process time, display time of each image, etc.).

In some embodiments, the MPEG-4 AVC Coded Picture Buffer (CPB) model is used to simulate the decoder input buffer state. CPB is the term used in the MPEG-4 H.264 standard to refer to the simulated input buffer of the Hypothetical Reference Decoder (HRD). The HRD is a hypothetical decoder model that specifies constraints on the variability of the conforming streams that an encoding process may produce. The CPB model is well known and is described in Section 1 below for convenience. A more detailed description of the CPB and HRD can be found in "Draft ITU-T Recommendation and Final Draft International Standard of Joint Video Specification (ITU-T Rec. H.264/ISO/IEC 14496-10 AVC)".

1. Using the CPB Model to Simulate a Decoder Buffer

The following paragraphs describe how the decoder input buffer is simulated using the CPB model in some embodiments. The time at which the first bit of image n begins to enter the CPB is referred to as the initial arrival time $t_{ai}(n)$, which is derived as follows:

ㆍ If the image is the first image (i.e., image 0), $t_{ai}(0) = 0$.

ㆍ If the image is not the first image in the sequence being encoded or re-encoded (i.e., $n > 0$), $t_{ai}(n) = \max(t_{af}(n-1),\ t_{ai,earliest}(n))$.

In the above equation,

ㆍ $t_{ai,earliest}(n) = t_{r,n}(n) - \text{initial\_cpb\_removal\_delay}$,

where $t_{r,n}(n)$ is the nominal removal time of image n from the CPB, as specified below, and initial_cpb_removal_delay is the initial buffering period.

The final arrival time for image n is derived by

$t_{af}(n) = t_{ai}(n) + b(n) / \mathit{BitRate}$,

where $b(n)$ is the size in bits of image n.

In some embodiments, the encoder computes the nominal removal times itself, as described below, instead of reading them from the optional part of the bit stream as in the H.264 specification. For image 0, the nominal removal time of the image from the CPB is specified by

$t_{r,n}(0) = \text{initial\_cpb\_removal\_delay}$.

For image n ($n > 0$), the nominal removal time of the image from the CPB is specified by

$t_{r,n}(n) = t_{r,n}(0) + \sum_{i=0}^{n-1} t_i$,

where $t_{r,n}(n)$ is the nominal removal time of image n, and $t_i$ is the display duration of picture i.

The removal time of image n is specified as follows:

ㆍ If $t_{r,n}(n) \geq t_{af}(n)$, then $t_r(n) = t_{r,n}(n)$.

ㆍ If $t_{r,n}(n) < t_{af}(n)$, then $t_r(n) = t_{af}(n)$.

The latter case indicates that the size of image n, $b(n)$, is so large that it prevents removal at the nominal removal time.
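
The timing rules of this section can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the encoder's actual implementation: the names `CpbTiming`, `simulate_cpb`, `bits`, `durations`, `bit_rate`, and `initial_delay` are hypothetical, all times are in seconds, and the nominal removal time of image 0 is assumed to equal the initial buffering period.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CpbTiming:
    t_ai: float  # initial arrival time of the image
    t_af: float  # final arrival time of the image
    t_rn: float  # nominal removal time of the image
    t_r: float   # actual removal time of the image


def simulate_cpb(bits: Sequence[int], durations: Sequence[float],
                 bit_rate: float, initial_delay: float) -> List[CpbTiming]:
    """bits[n] is b(n); durations[i] is the display duration t_i (hypothetical inputs)."""
    timings: List[CpbTiming] = []
    t_rn = initial_delay  # assumption: t_rn(0) equals the initial buffering period
    prev_af = 0.0
    for n, b in enumerate(bits):
        if n == 0:
            t_ai = 0.0
        else:
            t_rn += durations[n - 1]  # t_rn(n) = t_rn(0) + sum of t_i for i < n
            t_ai = max(prev_af, t_rn - initial_delay)
        t_af = t_ai + b / bit_rate
        # An image that arrives late is removed at its final arrival time.
        t_r = t_rn if t_rn >= t_af else t_af
        timings.append(CpbTiming(t_ai, t_af, t_rn, t_r))
        prev_af = t_af
    return timings
```

For instance, calling `simulate_cpb([1000, 4000, 500], [1.0, 1.0, 1.0], 1000.0, 2.0)` models a large second image whose final arrival time falls after its nominal removal time, which also delays the arrival of the third image.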

2. Detection of Underflow Segments

As described in the previous section, the encoder can simulate the decoder input buffer state and obtain the number of bits in the buffer at a given instant. Alternatively, the encoder can track how each individual image changes the decoder input buffer state through the difference between its nominal removal time and its final arrival time (i.e., $t_b(n) = t_{r,n}(n) - t_{af}(n)$). When $t_b(n)$ is less than 0, the buffer is suffering from underflow between the instants $t_{r,n}(n)$ and $t_{af}(n)$, and possibly before $t_{r,n}(n)$ and after $t_{af}(n)$.

The images directly involved in an underflow can easily be found by testing whether $t_b(n)$ is less than 0. However, an image with $t_b(n)$ less than 0 does not necessarily cause the underflow, and conversely, an image that causes the underflow might not have $t_b(n)$ less than 0. Some embodiments define an underflow segment as a stretch of consecutive images (in decoding order) that cause underflow by continuously depleting the decoder input buffer until the underflow reaches its worst point.
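
As a small illustration of the per-image test just described, the sketch below computes $t_b(n)$ from lists of nominal removal times and final arrival times and flags the images whose value is negative. The function names `buffer_margin` and `images_below_zero` are hypothetical, and, as the text notes, the flagged images are only those directly involved in an underflow, not necessarily the ones that cause it.

```python
from typing import List, Sequence


def buffer_margin(t_rn: Sequence[float], t_af: Sequence[float]) -> List[float]:
    """Return t_b(n) = t_rn(n) - t_af(n) for every image n."""
    return [r - a for r, a in zip(t_rn, t_af)]


def images_below_zero(t_b: Sequence[float]) -> List[int]:
    """Return the indices of images whose t_b(n) is negative."""
    return [n for n, value in enumerate(t_b) if value < 0]
```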

FIG. 4 is a plot of the difference $t_b(n)$ between the nominal removal time and the final arrival time of the images versus image number in some embodiments. The plot is drawn for a sequence of 1500 encoded images. FIG. 4a shows an underflow segment with arrows marking its beginning and end. Note that FIG. 4a contains another underflow segment that occurs after the first one; for simplicity, it is not explicitly marked by arrows.

FIG. 5 illustrates a process 500 that the encoder uses to perform the underflow detection operation at 305. Process 500 first determines (at 505) the final arrival time, $t_{af}$, and the nominal removal time, $t_{r,n}$, of each image by simulating the decoder input buffer state as described above. Note that because this process may be called several times during the iterative process of buffer underflow management, it receives an image number as its starting point and examines the sequence of images from this given starting image. In the first iteration, the starting point is the first image in the sequence.

At 510, process 500 compares the final arrival time of each image at the decoder input buffer with the nominal removal time of that image by the decoder. If the process determines that no image has a final arrival time after its nominal removal time (i.e., no underflow condition exists), the process ends. Otherwise, when an image is found whose final arrival time is after its nominal removal time, the process determines that there is an underflow and proceeds to 515 to identify the underflow segment.

At 515, process 500 identifies the underflow segment as the segment of images in which the decoder buffer starts to be continuously depleted, up to the next global minimum, where the underflow condition starts to improve (i.e., $t_b(n)$ does not become more negative over a stretch of images). Process 500 then ends. In some embodiments, the beginning of the underflow segment is further adjusted to start with an I-frame, which is an intra-encoded image that marks the start of a set of related inter-encoded images. Once one or more segments that cause underflow are identified, the encoder proceeds to eliminate the underflow. Section B below describes underflow elimination in the single-segment case (i.e., when the entire sequence of encoded images contains only a single underflow segment). Section C then describes underflow elimination for the multiple-segment case.

B. Single-Segment Underflow Elimination

Referring to FIG. 4a, if the $t_b(n)$ versus n curve crosses the n-axis only once with a descending slope, there is only one underflow segment in the entire sequence. The underflow segment begins at the nearest local maximum preceding the zero-crossing point, and ends at the next global minimum between the zero-crossing point and the end of the sequence. The end point of the segment may be followed by another zero-crossing point, with the curve taking an ascending slope, if the buffer recovers from the underflow.
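
The segment boundaries described above can be located with a short search over the $t_b(n)$ values. The following Python sketch is one possible reading of that rule, with hypothetical names (`find_underflow_segment`, `t_b`); it treats the first negative value as the descending zero crossing and does not perform the optional adjustment of the segment start to an I-frame.

```python
from typing import Optional, Sequence, Tuple


def find_underflow_segment(t_b: Sequence[float]) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) indices of the single underflow segment, or None."""
    # Descending zero crossing: the first image whose t_b(n) is below zero.
    crossing = next((n for n, value in enumerate(t_b) if value < 0), None)
    if crossing is None:
        return None
    # Walk back while the curve is still descending to reach the nearest local maximum.
    start = crossing
    while start > 0 and t_b[start - 1] > t_b[start]:
        start -= 1
    # The segment ends at the global minimum between the crossing and the sequence end.
    end = min(range(crossing, len(t_b)), key=lambda n: t_b[n])
    return start, end
```

For a curve such as `[0.5, 0.8, 0.6, 0.1, -0.4, -0.9, -0.3, 0.2]`, the local maximum at index 1 starts the segment and the minimum at index 5 ends it.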

FIG. 6 illustrates a process 600 that the encoder uses (at 315, 320, and 325) to eliminate the underflow condition in a single segment of images in some embodiments. At 605, process 600 estimates the total number of bits to remove from the underflow segment ($\Delta B$) by computing the product of the input bit rate into the buffer and the longest delay (e.g., the minimum $t_b(n)$) found at the end of the segment.
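
The estimate at 605 is a single multiplication, sketched below under the assumption that delays are in seconds and the bit rate is in bits per second. The function name `bits_to_remove` and its parameters are hypothetical.

```python
from typing import Sequence


def bits_to_remove(t_b_segment: Sequence[float], bit_rate: float) -> float:
    """Estimate the bit reduction as (longest delay in the segment) x (input bit rate)."""
    # The longest delay is the magnitude of the most negative t_b(n) in the segment.
    longest_delay = max(0.0, -min(t_b_segment))
    return longest_delay * bit_rate
```

For example, a segment whose worst $t_b(n)$ is -0.5 seconds on a 2000 bit/sec link would need roughly 1000 fewer bits.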

Next, at 610, process 600 uses the average masked frame QP (AMQP) and the total number of bits of the current segment from the last encoding pass (or passes) to estimate a desired AMQP for achieving a desired number of bits for the segment, $B_T = B - \Delta B_p$, where $p$ is the current iteration number of process 600 for the segment. If this iteration is the first iteration of process 600 for the particular segment, the AMQP and total number of bits are those of this segment derived from the initial encoding solution identified at 302. Otherwise, when this iteration is not the first iteration of process 600, these parameters can be derived from the encoding solution or solutions obtained in the last pass or the last several passes of process 600.

Next, at 615, process 600 uses the desired AMQP to modify the average masked frame QP, MQP(n), based on the masking strength $\phi_F(n)$, so that images that can tolerate more masking receive more bit reduction. The process then re-encodes (at 620) the video segment based on the parameters defined at 315. The process then examines (at 625) the segment to determine whether the underflow condition has been eliminated. FIG. 4b illustrates the elimination of the underflow condition of FIG. 4a after process 600 has been applied to the underflow segment to re-encode it. When the underflow condition is eliminated, the process ends. Otherwise, it returns to 605 to further adjust the encoding parameters to reduce the total bit size.

C. Underflow Elimination with Multiple Underflow Segments

When there are multiple underflow segments in a sequence, re-encoding a segment changes the buffer fullness time, $t_b(n)$, for all the ensuing frames. To account for the modified buffer condition, the encoder searches for one underflow segment at a time, starting from the first zero-crossing point (i.e., at the lowest n) with a descending slope.

The underflow segment begins at the nearest local maximum preceding this zero-crossing point, and ends at the next global minimum between this zero-crossing point and the next zero-crossing point (or the end of the sequence if there are no more zero crossings). After finding one segment, the encoder hypothetically removes the underflow in this segment and estimates the updated buffer fullness by setting $t_b(n)$ to 0 at the end of the segment and redoing the buffer simulation for all subsequent frames.

The encoder then continues searching for the next segment using the modified buffer fullness. Once all underflow segments have been identified as described above, the encoder derives the AMQPs and modifies the masked frame QPs for each segment independently of the others, just as in the single-segment case.
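
The one-segment-at-a-time search can be sketched as follows. This Python illustration makes several assumptions that are not stated in the text: the names `final_arrivals` and `find_all_underflow_segments` are hypothetical, the "hypothetical removal" is modeled by forcing the final arrival time at the segment end to equal the nominal removal time (so that $t_b$ is 0 there), and the buffer simulation for later frames is redone with the CPB arrival-time recursion.

```python
from typing import List, Sequence, Tuple


def final_arrivals(bits: Sequence[int], t_rn: Sequence[float], bit_rate: float,
                   initial_delay: float, start: int = 0,
                   prev_af: float = 0.0) -> List[float]:
    """Run the arrival-time recursion from image `start`; prev_af is t_af(start - 1)."""
    out = []
    for n in range(start, len(bits)):
        t_ai = 0.0 if n == 0 else max(prev_af, t_rn[n] - initial_delay)
        prev_af = t_ai + bits[n] / bit_rate
        out.append(prev_af)
    return out


def find_all_underflow_segments(bits: Sequence[int], t_rn: Sequence[float],
                                bit_rate: float,
                                initial_delay: float) -> List[Tuple[int, int]]:
    t_af = final_arrivals(bits, t_rn, bit_rate, initial_delay)
    segments: List[Tuple[int, int]] = []
    search_from = 0
    while True:
        t_b = [r - a for r, a in zip(t_rn, t_af)]
        # First descending zero crossing at or after the current search position.
        crossing = next((n for n in range(search_from, len(t_b)) if t_b[n] < 0), None)
        if crossing is None:
            return segments
        start = crossing
        while start > search_from and t_b[start - 1] > t_b[start]:
            start -= 1
        # Next zero crossing (buffer recovery), or the end of the sequence.
        recovery = next((m for m in range(crossing + 1, len(t_b)) if t_b[m] >= 0),
                        len(t_b))
        end = min(range(crossing, recovery), key=lambda n: t_b[n])
        segments.append((start, end))
        # Hypothetically remove the underflow: t_b(end) = 0, then re-simulate the rest.
        t_af[end] = t_rn[end]
        t_af[end + 1:] = final_arrivals(bits, t_rn, bit_rate, initial_delay,
                                        start=end + 1, prev_af=t_af[end])
        search_from = end + 1
```

Re-simulating after each hypothetical removal matters because fixing an early segment shifts the arrival times of every later image, which can shrink or remove later segments.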

One of ordinary skill in the art will recognize that other embodiments might be implemented differently. For instance, some embodiments would not identify multiple segments that cause underflow of the decoder's input buffer. Instead, some embodiments would perform the buffer simulation as described above to identify the first segment that causes underflow. After identifying such a segment, these embodiments correct the segment to rectify the underflow condition in that segment and then resume encoding after the corrected portion. After encoding the remainder of the sequence, these embodiments repeat this process for the next underflow segment.

D. Applications of Buffer Underflow Management

The decoder buffer underflow techniques described above apply to numerous encoding and decoding systems. Several examples of such systems are described below.

FIG. 7 illustrates a network 705 that connects a video streaming server 710 and several client decoders 715-725. The clients are connected to the network 705 via links with different bandwidths, such as 300 Kb/sec and 3 Mb/sec. The video streaming server 710 controls the streaming of encoded video images from an encoder 730 to the client decoders 715-725.

The streaming video server may decide to stream the encoded video images using the slowest bandwidth in the network (i.e., 300 Kb/sec) and the smallest client buffer size. In this case, the streaming server 710 needs only one set of encoded images, optimized for a target bit rate of 300 Kb/sec. Alternatively, the server may generate and store different encodings that are optimized for different bandwidths and different client buffer conditions.

FIG. 8 illustrates another example of an application for decoder underflow management. In this example, an HD-DVD player 805 receives encoded video images from an HD-DVD 840 that stores encoded video data from a video encoder 810. The HD-DVD player 805 has an input buffer 815, a set of decoding modules shown as one block 820 for simplicity, and an output buffer 825.

The output of the player 805 is sent to a display device such as a TV 830 or a computer display terminal 835. The HD-DVD player can have a very high bandwidth, e.g., 29.4 Mb/sec. To maintain high-quality images on the display device, the encoder ensures that the video images are encoded such that no segment in the sequence of images is too large to be delivered to the decoder input buffer on time.

Ⅵ. Computer System

FIG. 9 presents a computer system with which some embodiments of the invention may be implemented. Computer system 900 includes a bus 905, a processor 910, a system memory 915, a read-only memory 920, a permanent storage device 925, input devices 930, and output devices 935. The bus 905 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of the computer system 900. For instance, the bus 905 communicatively connects the processor 910 with the read-only memory 920, the system memory 915, and the permanent storage device 925.

From these various memory units, the processor 910 retrieves instructions to execute and data to process in order to execute the processes of the invention. The read-only memory (ROM) 920 stores static data and instructions that are needed by the processor 910 and other modules of the computer system.

The permanent storage device 925, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when the computer system 900 is powered off. Some embodiments of the invention use a mass-storage device (such as a magnetic or optical disk and its corresponding disk drive) as the permanent storage device 925.

Other embodiments use a removable storage device (such as a floppy disk or zip® disk, and its corresponding disk drive) as the permanent storage device. Like the permanent storage device 925, the system memory 915 is a read-and-write memory device. However, unlike storage device 925, the system memory is a volatile read-and-write memory, such as RAM. The system memory stores some of the instructions and data that a process needs at run time. In some embodiments, the invention's processes are stored in the system memory 915, the permanent storage device 925, and/or the ROM 920.

The bus 905 also connects to the input devices 930 and the output devices 935. The input devices enable the user to communicate information and select commands to the computer system. The input devices 930 include alphanumeric keyboards and cursor controllers. The output devices 935 display images generated by the computer system. The output devices include printers and display devices, such as CRTs or LCDs.

Finally, as shown in FIG. 9, the bus 905 also couples the computer 900 to a network 965 through a network adapter (not shown). In this manner, the computer can be part of a network of computers (such as a LAN, a WAN, or the Internet) or a network of networks (such as the Internet). Any or all of the components of computer system 900 may be used in conjunction with the invention. However, one of ordinary skill in the art will recognize that any other system configuration may also be used in conjunction with the invention.

Although the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. For instance, instead of using the H.264 method of simulating the decoder input buffer, other simulation methods may be used that consider the buffer size, the arrival and removal times of images in the buffer, and the decoding and display times of the images.

Several embodiments described above compute the mean-removed SAD to obtain an indication of the image variation in a macroblock. Other embodiments, however, might identify the image variation differently. For example, some embodiments might predict an expected image value for the pixels of a macroblock. These embodiments then generate a macroblock SAD by subtracting this predicted value from the luminance values of the pixels of the macroblock and summing the absolute values of the differences. In some embodiments, the predicted value is based not only on the values of the pixels in the macroblock but also on the values of the pixels in one or more neighboring macroblocks.
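
The macroblock SAD described above can be illustrated with a few lines of Python. This is a sketch only: the function names and the list-of-rows block layout are hypothetical, and `mean_removed_sad` reflects one reading of "mean-removed SAD", namely that the block's own mean luminance serves as the predicted value.

```python
from typing import Sequence


def macroblock_sad(luma: Sequence[Sequence[float]],
                   predicted: Sequence[Sequence[float]]) -> float:
    """Sum of absolute differences between luminance values and predicted values."""
    return sum(abs(value - guess)
               for luma_row, pred_row in zip(luma, predicted)
               for value, guess in zip(luma_row, pred_row))


def mean_removed_sad(luma: Sequence[Sequence[float]]) -> float:
    """SAD in which every pixel is predicted by the block's mean luminance (assumption)."""
    pixels = [value for row in luma for value in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(value - mean) for value in pixels)
```

A flat block yields a SAD of zero under the mean predictor, while a textured block yields a larger value, which is why the measure can serve as an indication of image variation.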

Also, the embodiments described above use the derived spatial and temporal masking values directly. Other embodiments apply a smoothing filter to successive spatial masking values and/or to successive temporal masking values before using them, in order to pick out the general trend of these values across the video images. Thus, one of ordinary skill in the art will understand that the invention is not limited by the foregoing illustrative details.
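
The text does not specify which smoothing filter is used. As one hedged possibility, the sketch below applies a centered moving average to a list of successive masking values; the function name `smooth_masking_values`, the default window of three, and the truncated windows at the sequence edges are all assumptions made for illustration.

```python
from typing import List, Sequence


def smooth_masking_values(values: Sequence[float], window: int = 3) -> List[float]:
    """Centered moving average; windows are truncated at the sequence boundaries."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed
```

Averaging neighboring values damps frame-to-frame spikes so that the general trend of the masking values, rather than isolated outliers, drives the quantization decisions.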

FIG. 1 presents a process that conceptually illustrates the encoding method of some embodiments of the invention.

FIG. 2 conceptually illustrates a codec system of some embodiments.

FIG. 3 is a flow chart illustrating the encoding process of some embodiments.

FIG. 4a is a plot of the difference between the nominal removal time and the final arrival time of images versus image number, illustrating an underflow condition in some embodiments.

FIG. 4b is a plot of the difference between the nominal removal time and the final arrival time of images versus image number for the same images shown in FIG. 4a, after the underflow condition has been eliminated.

FIG. 5 illustrates a process that the encoder uses to perform underflow detection in some embodiments.

FIG. 6 illustrates a process that the encoder uses to eliminate an underflow condition in a single segment of images in some embodiments.

FIG. 7 illustrates an application of buffer underflow management in a video streaming application.

FIG. 8 illustrates an application of buffer underflow management in an HD-DVD system.

FIG. 9 presents a computer system with which one embodiment of the invention is implemented.

Claims

A method of encoding a plurality of images, the method comprising:

a) defining a nominal quantization parameter for encoding the images;

b) deriving at least one image-specific quantization parameter for at least one image based on the nominal quantization parameter;

c) encoding the images based on the image-specific quantization parameter; and

d) iteratively repeating the defining, deriving, and encoding operations to optimize the encoding.

The method of claim 1, further comprising:

a) deriving a plurality of image-specific quantization parameters for a plurality of images based on the nominal quantization parameter;

b) encoding the images based on the image-specific quantization parameters; and

c) repeating the defining, deriving, and encoding operations to optimize the encoding.

The method of claim 1, further comprising stopping the iterations when an encoding operation satisfies a set of terminating criteria.

The method of claim 3, wherein the set of terminating criteria includes the identification of an acceptable encoding of the images.

The method of claim 4, wherein the acceptable encoding of the images is an encoding of the images that is within a particular range of a target bit rate.

A method of encoding a plurality of images, the method comprising:

a) identifying a plurality of image attributes, each particular image attribute quantifying the complexity of at least a particular portion of a particular image;

b) identifying a reference attribute that quantifies the complexity of the plurality of images;

c) identifying quantization parameters for encoding the plurality of images based on the identified image attributes, the reference attribute, and a nominal quantization parameter;

d) encoding the plurality of images based on the identified quantization parameters; and

e) iteratively performing the identifying and encoding operations to optimize the encoding, wherein a plurality of different iterations use a plurality of different reference attributes.

The method of claim 6, wherein the plurality of attributes are visual masking strengths of at least a portion of each image, the visual masking strengths estimating the amount of coding artifacts that are not perceptible to a viewer of the video sequence after the video sequence has been encoded according to the method and then decoded.

The method of claim 6, wherein the plurality of attributes are visual masking strengths of at least a portion of each image, the visual masking strength for a portion of an image quantifying the complexity of that portion of the image, wherein, in quantifying the complexity of a portion of an image, the visual masking strength provides an indication of the amount of compression artifacts that can result from the encoding without visible distortion in the encoded image after the image is decoded.

A computer readable medium storing a computer program for encoding a plurality of images, the computer program comprising sets of instructions for:

a) defining a nominal quantization parameter for encoding the images;

b) deriving at least one image-specific quantization parameter for at least one image based on the nominal quantization parameter;

c) encoding the images based on the image-specific quantization parameter; and

d) iteratively repeating the defining, deriving, and encoding operations to optimize the encoding.

The computer readable medium of claim 9, wherein the computer program further comprises sets of instructions for:

a) deriving a plurality of image-specific quantization parameters for a plurality of images based on the nominal quantization parameter;

b) encoding the images based on the image-specific quantization parameters; and

c) repeating the defining, deriving, and encoding operations to optimize the encoding.

The computer readable medium of claim 9, wherein the computer program further comprises a set of instructions for stopping the iterations when an encoding operation satisfies a set of terminating criteria.

The computer readable medium of claim 11, wherein the set of terminating criteria includes the identification of an acceptable encoding of the images.

The computer readable medium of claim 12, wherein the acceptable encoding of the images is an encoding of the images that is within a particular range of a target bit rate.