CN102833539B - Multi-pass video encoding - Google Patents


Info

Publication number
CN102833539B
CN102833539B (application CN201210271659.1A)
Authority
CN
China
Prior art keywords
image
underflow
decoder
frame
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210271659.1A
Other languages
Chinese (zh)
Other versions
CN102833539A (en)
Inventor
Xin Tong (童歆)
Hsi-Jung Wu (吴锡荣)
Thomas Pun (托马斯·彭)
Adriana Dumitras (安德里亚那·杜米特拉)
Barin Haskell (巴林·哈斯凯尔)
Jim Normile (吉姆·诺米勒)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apple Inc
Original Assignee
Apple Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US 11/118,616 (US 8,406,293 B2)
Priority claimed from US 11/118,604 (US 8,005,139 B2)
Application filed by Apple Inc
Publication of CN102833539A
Application granted
Publication of CN102833539B
Legal status: Active


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/124: Quantisation
    • H04N19/126: Details of normalisation or weighting functions, e.g. normalisation matrices or variable uniform quantisers
    • H04N19/137: Motion inside a coding unit, e.g. average field, frame or block difference
    • H04N19/14: Coding unit complexity, e.g. amount of activity or edge presence estimation
    • H04N19/142: Detection of scene cut or scene change
    • H04N19/15: Data rate or code amount at the encoder output by monitoring actual compressed data size at the memory before deciding storage at the transmission buffer
    • H04N19/152: Data rate or code amount at the encoder output by measuring the fullness of the transmission buffer
    • H04N19/154: Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • H04N19/172: Adaptive coding where the coding unit is a picture, frame or field
    • H04N19/176: Adaptive coding where the coding unit is a block, e.g. a macroblock
    • H04N19/177: Adaptive coding where the coding unit is a group of pictures [GOP]
    • H04N19/192: Adaptive coding where the adaptation method, tool or type is iterative or recursive

Abstract

Some embodiments of the invention provide a multi-pass encoding method that encodes several images (e.g., several frames of a video sequence). The method iteratively performs an encoding operation that encodes these images. The encoding operation is based on a nominal quantization parameter, which the method uses to compute quantization parameters for the images. During several different iterations of the encoding operation, the method uses several different nominal quantization parameters. The method stops iterations when a terminating criterion is reached (e.g., an acceptable encoding of the images is identified).

Description

Multi-pass video encoding
This application is a divisional of Chinese patent application No. 200580006363.5, filed June 24, 2005, entitled "Multi-pass video encoding".
Background
Video encoders encode sequences of video images (e.g., video frames) using a variety of encoding schemes. Video encoding schemes typically encode video frames, or portions of video frames (e.g., sets of pixels in a frame), in intraframe or interframe mode. An intra-coded frame or set of pixels is encoded independently of other frames or of sets of pixels in other frames. An inter-coded frame or set of pixels is encoded by reference to one or more other frames or to sets of pixels in other frames.
When compressing video frames, some encoders implement a "rate controller" that provides a "bit budget" for a video frame or a set of video frames to be encoded. The bit budget specifies the number of bits allocated for encoding that frame or set of frames. By allocating bit budgets efficiently, the rate controller attempts to produce the highest-quality compressed video stream subject to certain constraints (e.g., a target bit rate).
To date, a number of single-pass and multi-pass rate controllers have been proposed. A single-pass rate controller provides bit budgets for an encoding scheme that encodes a sequence of video images in a single pass, while a multi-pass rate controller provides bit budgets for an encoding scheme that encodes the sequence over multiple passes.
Single-pass rate controllers are useful under real-time encoding conditions. Multi-pass rate controllers, on the other hand, optimize the encoding for a particular bit rate under a given set of constraints. To date, few rate controllers account for the spatial or temporal complexity of frames, or of sets of pixels within frames, when controlling their bit rate. Likewise, most multi-pass rate controllers do not adequately search the solution space for an encoding scheme that uses optimal quantization parameters for frames and/or sets of pixels within frames while meeting the desired bit rate.
Therefore, there is a need in the art for a rate controller that uses innovative techniques to account for the spatial or temporal complexity of video images, and/or of portions of those images, while controlling the bit rate of an encoded set of video images. There is also a need in the art for a multi-pass rate controller that adequately examines different encoding schemes in order to identify one that uses an optimal set of quantization parameters for the video images and/or their portions.
Summary of the invention
Some embodiments of the present invention provide a multi-pass encoding method that encodes several images (e.g., several frames of a video sequence). The method iteratively performs an encoding operation on these images. The encoding operation is based on a nominal quantization parameter, which the method uses to compute quantization parameters for the images. Across several different iterations of the encoding operation, the method uses several different nominal quantization parameters. The method stops iterating when a terminating criterion is reached (e.g., when it identifies an acceptable encoding of the images).
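The iterative loop just described can be sketched as follows. This is a minimal illustration rather than the patent's algorithm: the toy `encode_with_nominal_qp` rate model, the starting QP of 26, and the one-step QP update rule are all assumptions made for the sketch.

```python
def encode_with_nominal_qp(frames, qp):
    # Toy stand-in for a full encoding pass: bit cost falls as QP rises.
    return sum(f["complexity"] for f in frames) * 1000 // qp

def multi_pass_encode(frames, target_bits, max_passes=10, tolerance=0.02):
    # Iteratively re-encode, adjusting the nominal QP between passes until
    # the bit-rate error falls inside the tolerance (the terminating criterion).
    nominal_qp = 26  # assumed starting point, not specified by the patent
    best_qp, best_err = nominal_qp, None
    for _ in range(max_passes):
        bits = encode_with_nominal_qp(frames, nominal_qp)
        err = (bits - target_bits) / target_bits
        if best_err is None or abs(err) < abs(best_err):
            best_qp, best_err = nominal_qp, err
        if abs(err) <= tolerance:
            break
        # Too many bits -> coarser quantization (raise QP); too few -> lower it.
        nominal_qp += 1 if err > 0 else -1
    return best_qp, best_err
```

The loop keeps the best pass seen so far, so even if the tolerance is never met within `max_passes`, the closest encoding is returned.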
Some embodiments of the present invention provide a method for encoding a video sequence. The method identifies a first attribute that quantifies the complexity of a first image in the video. Based on the identified attribute, it identifies a quantization parameter for encoding the first image, and then encodes the first image based on that quantization parameter. In some embodiments, the method performs these three operations for several images in the video.
Some embodiments of the present invention encode a sequence of video images based on a "visual masking" attribute of the images and/or portions of the images. The visual masking of an image, or of a portion of an image, indicates how many coding artifacts the image or image portion can tolerate. To express this attribute, some embodiments compute a visual masking strength that quantifies the luminance energy of the image or image portion. In some embodiments, the luminance energy is measured as a function of the average luma or pixel energy of the image or image portion.
Instead of, or in conjunction with, the luminance energy, the visual masking strength can also quantify the activity energy of the image or image portion. The activity energy represents the complexity of the image or image portion. In some embodiments, the activity energy includes a spatial component that quantifies the spatial complexity of the image or image portion, and/or a motion component that quantifies the amount of distortion that can be tolerated (masked) due to motion between images.
Some embodiments of the present invention provide a method for encoding a video sequence. The method identifies the visual masking attribute of a first image in the video, identifies a quantization parameter for encoding the first image based on the identified attribute, and then encodes the first image based on that quantization parameter.
Brief description of the drawings
The novel features of the invention are set forth in the appended claims. For purposes of explanation, however, several embodiments of the invention are set forth in the following figures.
Fig. 1 conceptually illustrates the encoding method of some embodiments of the invention;
Fig. 2 conceptually illustrates the encoding/decoding system of some embodiments;
Fig. 3 is a flow chart illustrating the encoding process of some embodiments;
Fig. 4a is a graph of the difference between the nominal removal time and the final arrival time of images versus the number of images, illustrating an underflow condition in some embodiments;
Fig. 4b graphs the same difference for the same images as Fig. 4a after the underflow condition has been eliminated;
Fig. 5 illustrates a process that the encoder of some embodiments performs to detect underflow;
Fig. 6 illustrates a process that the encoder of some embodiments uses to remove an underflow condition in an individual segment of images;
Fig. 7 illustrates the application of buffer underflow management in a video streaming application;
Fig. 8 illustrates the application of buffer underflow management in an HD-DVD system;
Fig. 9 illustrates a computer system with which one embodiment of the invention is implemented.
Detailed description
In the following detailed description of the invention, numerous details, examples, and embodiments of the invention are set forth and described. It will be clear and apparent to one skilled in the art, however, that the invention is not limited to the embodiments described, and that it may be practiced without some of the specific details and examples discussed.
I. Definitions
This section defines several symbols used throughout this document.
R_T represents the target bit rate, i.e., the desired bit rate for the encoded frame sequence. This rate is usually expressed in bits per second and is computed from the desired final file size, the number of frames in the sequence, and the frame rate.
R_p represents the bit rate of the encoded bitstream at the end of pass p.
E_p represents the error percentage in the bit rate at the end of pass p, measured relative to the target bit rate R_T.
ε represents the error tolerance for the final bit rate.
ε_C represents the error tolerance for the bit rate during the first (QP) search stage.
QP represents a quantization parameter.
QP_Nom(p) represents the nominal quantization parameter used in pass p of encoding the frame sequence. The multi-pass encoder of the invention adjusts the value of QP_Nom(p) during a first QP adjustment stage in order to reach the target bit rate.
MQP_p(k) represents the masked frame QP, which is the quantization parameter (QP) of frame k in pass p. Some embodiments compute this value using the nominal QP and frame-level visual masking.
MQP_MB(p)(k, m) represents the masked macroblock QP, which is the quantization parameter (QP) of an individual macroblock (with macroblock index m) of frame k in pass p. Some embodiments compute MQP_MB(p)(k, m) using MQP_p(k) and macroblock-level visual masking.
φ_F(k) represents the masking strength of frame k. The masking strength φ_F(k) is a complexity measure of the frame; in some embodiments, it is used to determine how visible coding artifacts/noise will be and to compute the frame's MQP_p(k).
φ_R(p) represents the reference masking strength in pass p. The reference masking strength is used to compute MQP_p(k), and the multi-pass encoder of the invention adjusts it during the second stage in order to reach the target bit rate.
φ_MB(k, m) represents the masking strength of the macroblock with index m in frame k. The masking strength φ_MB(k, m) is a measure of the macroblock's complexity; in some embodiments, it is used to determine how visible coding artifacts/noise will be and to compute MQP_MB(p)(k, m).
AMQP_p represents the average masked QP over the frames in pass p. In some embodiments, this value is computed as the average of MQP_p(k) over all frames in pass p.
II. Overview
Some embodiments of the present invention provide an encoding method that achieves the best visual quality when encoding a frame sequence at a given bit rate. In some embodiments, the method uses a visual masking process to assign a quantization parameter QP to each macroblock. This assignment is based on the observation that coding artifacts/noise are much less noticeable in brighter or spatially complex regions of an image or video frame than in darker or flat regions.
In some embodiments, this visual masking process is performed as part of the inventive multi-pass encoding process. To make the final encoded bitstream reach the target bit rate, the encoding process adjusts the nominal quantization parameter and controls the visual masking process through a reference masking strength parameter φ_R. As described further below, adjusting the nominal quantization parameter and controlling the masking algorithm adjusts the QP value of each picture (i.e., typically each frame in a video encoding scheme) and of each macroblock within each picture.
In some embodiments, the multi-pass encoding process adapts the nominal QP and φ_R globally for the entire sequence. In other embodiments, the process divides the video sequence into segments and adapts the nominal QP and φ_R for each segment. The description below refers to the frame sequence to which the multi-pass encoding is applied. One of ordinary skill will recognize that in some embodiments this sequence is the entire sequence, while in other embodiments it is only one segment of the sequence.
In some embodiments, the method has three encoding stages: (1) an initial analysis stage performed in pass 0, (2) a first search stage performed in passes 1 through N1, and (3) a second search stage performed in passes N1+1 through N1+N2.
In the initial analysis stage (i.e., during pass 0), the method identifies an initial value for the nominal QP (QP_Nom(1), to be used in encoding pass 1). During the initial analysis stage, the method also identifies the value of the reference masking strength φ_R, which is used in all passes of the first search stage.
In the first search stage, the method performs N1 iterations (i.e., N1 passes) of the encoding process. For each frame k in pass p, the process encodes the frame using a frame-specific quantization parameter MQP_p(k) and, for each macroblock m in frame k, a macroblock-specific quantization parameter MQP_MB(p)(k, m), where MQP_MB(p)(k, m) is computed from MQP_p(k).
In the first search stage, the quantization parameter MQP_p(k) changes between passes because it is derived from the nominal quantization parameter QP_Nom(p), which changes between passes. In other words, at the end of each pass p during the first search stage, the process computes the nominal QP_Nom(p+1) for pass p+1. In some embodiments, QP_Nom(p+1) is computed from the nominal QP values and bit-rate errors of the previous passes. As described below, the nominal QP value is computed differently at the end of each pass in the second search stage.
In the second search stage, the method performs N2 iterations (i.e., N2 passes) of the encoding process. As in the first search stage, during each pass p the process encodes each frame k using the frame-specific quantization parameter MQP_p(k) and the macroblock-specific quantization parameters MQP_MB(p)(k, m), which are derived from MQP_p(k).
Also as in the first search stage, the quantization parameter MQP_p(k) changes between passes. During the second search stage, however, this parameter changes because it is computed using the reference masking strength φ_R(p), which changes between passes. In some embodiments, φ_R(p) is computed from the φ_R values and bit-rate errors of the previous passes, so that the reference masking strength is set to a different value at the end of each pass in the second search stage.
Although the multi-pass encoding process is described here in conjunction with the visual masking process, one of ordinary skill in the art will appreciate that an encoder need not use the two processes together. For example, in some embodiments, by ignoring φ_R and omitting the second search stage described above, the multi-pass encoding process is used to encode a bitstream close to a given target bit rate without visual masking.
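The two search stages can be sketched as a pair of loops, the first varying the nominal QP and the second varying the reference masking strength. The one-step QP update, the proportional φ_R update, and the toy rate model below are assumptions for illustration, not the patent's formulas.

```python
def two_phase_search(encode, target, qp0, phi_r0, n1=3, n2=3, eps=0.01):
    # Stage 1: vary the nominal QP between passes until the bit rate is close.
    qp, phi_r = qp0, phi_r0
    for _ in range(n1):
        err = (encode(qp, phi_r) - target) / target
        if abs(err) <= eps:
            return qp, phi_r
        qp += 1 if err > 0 else -1      # assumed one-step update rule
    # Stage 2: hold QP fixed, refine the reference masking strength instead.
    for _ in range(n2):
        err = (encode(qp, phi_r) - target) / target
        if abs(err) <= eps:
            break
        phi_r *= 1 + 0.5 * err          # assumed proportional update rule
    return qp, phi_r

def toy_encode(qp, phi_r):
    # Toy rate model: more quantization or more masking -> fewer bits.
    return int(100000 / (qp * phi_r))
```

The coarse integer QP steps of stage 1 motivate stage 2: once the nearest integer QP is found, the continuous-valued φ_R gives a finer knob for closing the remaining bit-rate gap.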
The visual masking and multi-pass encoding processes are further described in Sections III and IV of this application.
III. Visual masking
Given a nominal quantization parameter, the visual masking process first computes a masked frame quantization parameter (MQP) for each frame using the reference masking strength (φ_R) and the frame's masking strength (φ_F). The process then computes a masked macroblock quantization parameter (MQP_MB) for each macroblock based on the frame- and macroblock-level masking strengths (φ_F and φ_MB). When the visual masking process is applied within the multi-pass encoding process, the reference masking strength (φ_R) of some embodiments is identified during the first encoding passes, as mentioned above and described further below.
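The text does not give the MQP formulas at this point, so the sketch below only illustrates the stated structure: a frame whose masking strength exceeds the reference gets a higher (coarser) QP than the nominal QP, and a macroblock's QP is offset from its frame's QP in the same way. The logarithmic form and the `strength` constant are assumptions, not the patent's equations.

```python
import math

def masked_frame_qp(qp_nom, phi_f, phi_r, strength=6.0):
    # Assumed shape: frames with more masking than the reference tolerate
    # coarser quantization, so their QP is raised; less masking lowers it.
    return qp_nom + strength * math.log(phi_f / phi_r)

def masked_macroblock_qp(mqp_frame, phi_mb, phi_f, strength=6.0):
    # Same idea one level down: compare the macroblock's masking strength
    # against its frame's masking strength.
    return mqp_frame + strength * math.log(phi_mb / phi_f)
```

When φ_F equals φ_R the frame simply receives the nominal QP, which matches the role of φ_R as the neutral reference point that the second search stage tunes.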
A. Computing the frame-level masking strength
1. First method
To compute the frame-level masking strength φ_F(k), some embodiments use the following equation (A):
φ_F(k) = C * power(E * avgFrameLuma(k), β) * power(D * avgFrameSAD(k), α_F),    (A)
where:
● avgFrameLuma(k) is the average pixel luminance in frame k, computed over b×b regions, where b is an integer greater than or equal to 1 (e.g., b = 1 or b = 4);
● avgFrameSAD(k) is the average of MbSAD(k, m) over all macroblocks in frame k;
● MbSAD(k, m) is the sum of the values returned by the function Calc4x4MeanRemovedSAD(4x4_block_pixel_values) over all 4x4 blocks in the macroblock with index m;
● α_F, C, D, and E are constants and/or are tuned according to local statistics; and
● power(a, b) means a raised to the power b.
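As a hedged illustration, equation (A) can be transcribed directly. The constant values below are placeholders, since the text only says that C, D, E, β, and α_F are tuned constants.

```python
def frame_masking_strength(avg_frame_luma, avg_frame_sad,
                           C=1.0, D=1.0, E=1.0, beta=0.5, alpha_f=0.5):
    # Direct transcription of equation (A): masking strength grows with both
    # average luminance (brightness masking) and average SAD (activity masking).
    return C * (E * avg_frame_luma) ** beta * (D * avg_frame_sad) ** alpha_f
```

With these placeholder exponents, quadrupling either the luminance or the activity doubles the masking strength, which conveys the intended monotone behavior without claiming the patent's tuning.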
Pseudo-code for function Calc4x4MeanRemovedSAD is as follows:
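The pseudo-code listing itself did not survive in this copy of the text. The following Python sketch reconstructs the function from the surrounding description (the sum of absolute differences of a 4x4 block after removing the block mean), so its details are inferred rather than quoted.

```python
def calc_4x4_mean_removed_sad(block):
    # Mean-removed SAD of one 4x4 block: sum of |pixel - block mean|.
    # Reconstructed from the surrounding description; the original
    # pseudo-code listing is missing from this copy of the document.
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum(abs(p - mean) for p in pixels)

def mb_sad(blocks_4x4):
    # MbSAD(k, m): sum of the mean-removed SADs of all 4x4 blocks
    # in one macroblock.
    return sum(calc_4x4_mean_removed_sad(b) for b in blocks_4x4)
```

Removing the mean makes the measure insensitive to the block's DC level, so a flat block scores zero no matter how bright it is; only pixel variation contributes.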
2. Second method
Other embodiments compute the frame-level masking strength differently. For example, equation (A) above essentially computes the masking strength as follows:
φ_F(k) = C * power(E * Brightness_Attribute, exponent0) * power(scalar * Spatial_Activity_Attribute, exponent1).
In equation (A), the Brightness_Attribute of a frame equals avgFrameLuma(k), and the Spatial_Activity_Attribute equals avgFrameSAD(k), which is the average macroblock SAD (MbSAD(k, m)) over all macroblocks in the frame, where each macroblock SAD equals the sum, over all 4x4 blocks in the macroblock, of the mean-removed sums of absolute 4x4 pixel differences (as provided by Calc4x4MeanRemovedSAD). The Spatial_Activity_Attribute measures the amount of spatial variation within the pixel regions of the frame being encoded.
Other embodiments extend the activity measure to include the amount of temporal variation in pixel regions across several successive frames. In particular, these embodiments compute the frame masking strength as follows:
φ_F(k) = C * power(E * Brightness_Attribute, exponent0) * power(scalar * Activity_Attribute, exponent1)    (B)
In this equation, the Activity_Attribute is given by the following expression (C):
E * power(F * Temporal_Activity_Attribute, exponent_delta)    (C)
In some embodiments, the Temporal_Activity_Attribute quantifies the amount of distortion between frames that can be tolerated (i.e., masked) due to motion. In some of these embodiments, the Temporal_Activity_Attribute of a frame equals a constant times the sum of the absolute values of the motion-compensated error signal over a defined pixel region of the frame. In other embodiments, the Temporal_Activity_Attribute is given by the following equation (D):
Temporal_Activity_Attribute = Σ (j = -1 to -N) [W_j · avgFrameSAD(j)] + Σ (j = 1 to M) [W_j · avgFrameSAD(j)] + W_0 · avgFrameSAD(0)    (D)
In formula (D), average macroblock SAD(MbSAD(k in " avgFrameSAD " representative (as mentioned above) frame, m)) value, avgFrameSAD(0) be the avgFrameSAD of present frame, and negative j points to the time instance before present frame, and positive j points to the time instance after present frame.Thus, avgFrameSAD(j=-2) represent present frame before average frame SAD, the avgFrameSAD(j=3 of two frames) represent present frame after the average frame SAD of three frames.
Equally, in formula (D), variable N and M refers to the quantity of the frame before and after present frame respectively.Replace simply based on frame selective value N and M of specific quantity, some embodiments based on current time frame time before or after special time computation of Period value N and M.Mobile shielding is associated with temporal durations than by mobile shield to be associated with the frame of one group of quantity have more advantage.This is because movement shielding was associated with the time cycle directly meet the time-based visual sense feeling of observer.On the other hand, such shielding is associated because different display unit presents video with different frame rates from the quantity of frame and the variable display duration will be stood.
In formula (D), W refers to weighting factors, which in some embodiments decrease as frame j gets farther from the current frame. Also in this formula, the first summation represents the amount of motion that can mask artifacts before the current frame, the second summation represents the amount of motion that can mask artifacts after the current frame, and the last term (W_0 · avgFrameSAD(0)) represents the frame SAD of the current frame.
In some embodiments, the weighting factors are adjusted to account for scene changes. For example, some embodiments do not account for any frame in the look-ahead range (i.e., within the M frames) that is at or after an upcoming scene change; these embodiments may set the weighting factors of such frames in the look-ahead range to zero. Likewise, some embodiments do not account for any frame in the look-back range (i.e., within the N frames) that is at or before a previous scene change; these embodiments may set the weighting factors of such frames in the look-back range to zero.
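As a concrete illustration, formula (D) with scene-change-aware weights can be sketched as follows. This is a minimal sketch: the 1/(1+|offset|) weight profile and the scene-cut handling are assumptions for illustration, not the patent's exact values.

```python
def temporal_activity(avg_frame_sad, current, n_past, m_future, scene_cuts=()):
    """Weighted sum of average frame SADs around `current` (formula (D) sketch).

    avg_frame_sad: list of per-frame average SAD values.
    scene_cuts: frame indices where a new scene begins; frames separated from
    `current` by a cut get zero weight (assumed handling).
    """
    def weight(offset):
        # Assumed profile: weights fall off with distance from the current frame.
        return 1.0 / (1.0 + abs(offset))

    def crosses_cut(j):
        lo, hi = sorted((current, j))
        return any(lo < cut <= hi for cut in scene_cuts)

    total = weight(0) * avg_frame_sad[current]          # W_0 * avgFrameSAD(0)
    for j in range(current - n_past, current):          # past frames (j = -N..-1)
        if 0 <= j and not crosses_cut(j):
            total += weight(j - current) * avg_frame_sad[j]
    for j in range(current + 1, current + m_future + 1):  # future frames (j = 1..M)
        if j < len(avg_frame_sad) and not crosses_cut(j):
            total += weight(j - current) * avg_frame_sad[j]
    return total
```

Passing a scene-cut index zeroes out the contribution of every frame on the far side of the cut, matching the weight adjustment described above.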
3. Variations of the Second Method
a) Limiting the Influence of Past and Future Frames on Temporal_Activity_Attribute
Formula (D) above essentially states Temporal_Activity_Attribute as the following sum:

Temporal_Activity_Attribute = Past_Frame_Activity + Future_Frame_Activity + Current_Frame_Activity,

where Past_Frame_Activity (PFA) equals Σ_{j=-N..-1} (W_j · avgFrameSAD(j)), Future_Frame_Activity (FFA) equals Σ_{j=1..M} (W_j · avgFrameSAD(j)), and Current_Frame_Activity (CFA) equals W_0 · avgFrameSAD(0).
Some embodiments modify the computation of Temporal_Activity_Attribute so that neither Past_Frame_Activity nor Future_Frame_Activity can dominate the value of Temporal_Activity_Attribute. For example, some embodiments initially define PFA as Σ_{j=-N..-1} (W_j · avgFrameSAD(j)), and FFA as Σ_{j=1..M} (W_j · avgFrameSAD(j)).

These embodiments then determine whether PFA is greater than a scalar times FFA. If so, they set PFA to a PFA upper limit (e.g., a scalar times FFA). In addition to capping PFA in this case, some embodiments may also set FFA to zero, set CFA to zero, or both. Other embodiments may set PFA, CFA, or both to a weighted combination of PFA, CFA, and FFA.

Similarly, after initially defining the PFA and FFA values as weighted sums, some embodiments also determine whether the FFA value is greater than a scalar times PFA. If so, they set FFA to an FFA upper limit (e.g., a scalar times PFA). In addition to capping FFA in this case, some embodiments may also set PFA to zero, set CFA to zero, or both. Other embodiments may set FFA, CFA, or both to a weighted combination of FFA, CFA, and PFA.

This potential follow-up adjustment of the PFA and FFA values (after their initial estimation as weighted sums) prevents either of these values from dominating Temporal_Activity_Attribute.
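The capping variant described above can be sketched as follows. This is a minimal sketch of one of the listed alternatives (clamp whichever of PFA/FFA exceeds a scalar times the other); the scalar value 2.0 is an assumed illustrative choice.

```python
def capped_temporal_activity(pfa, ffa, cfa, scalar=2.0):
    """Limit PFA and FFA so that neither dominates Temporal_Activity_Attribute.

    Whichever of PFA/FFA exceeds `scalar` times the other is clamped to that
    upper limit; the capped values are then summed with CFA.
    """
    if pfa > scalar * ffa:
        pfa = scalar * ffa      # PFA upper limit: scalar times FFA
    elif ffa > scalar * pfa:
        ffa = scalar * pfa      # FFA upper limit: scalar times PFA
    return pfa + ffa + cfa
```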
b) Limiting the Influence of Spatial_Activity_Attribute and Temporal_Activity_Attribute on Activity_Attribute
Formula (C) above essentially states Activity_Attribute as the following sum:

Activity_Attribute = Spatial_Activity + Temporal_Activity,

where Spatial_Activity equals scalar*(scalar*Spatial_Activity_Attribute)^β, and Temporal_Activity equals scalar*(scalar*Temporal_Activity_Attribute)^Δ.
Some embodiments modify the computation of Activity_Attribute so that neither Spatial_Activity nor Temporal_Activity can dominate the value of Activity_Attribute. For example, some embodiments initially define Spatial_Activity (SA) as scalar*(scalar*Spatial_Activity_Attribute)^β, and define Temporal_Activity (TA) as scalar*(scalar*Temporal_Activity_Attribute)^Δ.

These embodiments then determine whether SA is greater than a scalar times TA. If so, they set SA to an SA upper limit (e.g., a scalar times TA). In addition to capping SA in this case, some embodiments may also set the TA value to zero or to a weighted combination of TA and SA.

Similarly, after initially defining the SA and TA values through the exponential equations, some embodiments also determine whether the TA value is greater than a scalar times SA. If so, they set TA to a TA upper limit (e.g., a scalar times SA). In addition to capping TA in this case, some embodiments may also set the SA value to zero or to a weighted combination of SA and TA.

This potential follow-up adjustment of the SA and TA values (after their initial computation through the exponential equations) prevents either of these values from dominating Activity_Attribute.
B. Computing Macroblock-Level Masking Strength
1. First Method
In some embodiments, the macroblock-level masking strength φ_MB(k,m) is computed as follows:

φ_MB(k,m) = A*power(C*avgMbLuma(k,m), β) * power(B*MbSAD(k,m), α_MB),    (F)

where:

avgMbLuma(k,m) is the average pixel luminance of macroblock m in frame k;

α_MB, β, A, B, and C are constants and/or are adapted to local statistics.
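Formula (F) can be sketched directly. This is a minimal sketch; the default constants below are placeholders, since the patent leaves them as constants and/or values adapted to local statistics.

```python
def mb_masking_strength(avg_mb_luma, mb_sad,
                        A=1.0, B=1.0, C=1.0, beta=0.5, alpha_mb=0.5):
    """Macroblock-level masking strength per formula (F):
    phi_MB = A * (C * avgMbLuma)^beta * (B * MbSAD)^alpha_MB.

    Brighter and more spatially active macroblocks mask more distortion,
    so they receive a larger masking strength.
    """
    return A * (C * avg_mb_luma) ** beta * (B * mb_sad) ** alpha_mb
```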
2. Second Method

Formula (F) above essentially computes the macroblock masking strength as:

φ_MB(k,m) = D*power(E*Mb_Brightness_Attribute, exponent0) * power(scalar*Mb_Spatial_Activity_Attribute, exponent1)

In this formula, the Mb_Brightness_Attribute of the macroblock equals avgMbLuma(k,m), and the Mb_Spatial_Activity_Attribute equals avgMbSAD(k). The Mb_Spatial_Activity_Attribute measures the amount of spatial variation in the pixel region of the macroblock being encoded.
As with the frame-level masking strength, some embodiments extend the activity measure in the macroblock masking strength to include the amount of temporal variation in the pixel region across several successive frames. In particular, these embodiments compute the macroblock masking strength as follows:

φ_MB(k,m) = D*power(E*Mb_Brightness_Attribute, exponent0) * power(scalar*Mb_Activity_Attribute, exponent1),    (G)

where Mb_Activity_Attribute is given by the following formula (H):

Mb_Activity_Attribute = F*power(D*Mb_Spatial_Activity_Attribute, exponent_beta) + G*power(F*Mb_Temporal_Activity_Attribute, exponent_delta)    (H)
The computation of a macroblock's Mb_Temporal_Activity_Attribute can be analogous to the computation of a frame's Temporal_Activity_Attribute described above. For example, in some of these embodiments, Mb_Temporal_Activity_Attribute is given by the following formula (I):

Mb_Temporal_Activity_Attribute = Σ_{i=1..N} (W_i · MbSAD(i,m)) + Σ_{j=1..M} (W_j · MbSAD(j,m)) + MbSAD(m)    (I)
The variables in formula (I) are defined in Section III. In formula (I), the macroblock m in frame i or j can be the macroblock at the same position as macroblock m in the current frame, or it can be the macroblock in frame i or j from which macroblock m in the current frame was initially predicted.
The Mb_Temporal_Activity_Attribute given by formula (I) can be modified in a manner similar to the modification of the frame-level Temporal_Activity_Attribute given by formula (D) (discussed in Section III.A.3 above). In particular, the Mb_Temporal_Activity_Attribute given by formula (I) can be modified to limit the excessive influence of the macroblocks in past and future frames.

Similarly, the Mb_Activity_Attribute given by formula (H) can be modified in a manner similar to the modification of the frame-level Activity_Attribute given by formula (C) (discussed in Section III.A.3 above). In particular, the Mb_Activity_Attribute given by formula (H) can be modified to limit the excessive influence of the Mb_Spatial_Activity_Attribute and the Mb_Temporal_Activity_Attribute.
C. Calculating Masked QP Values

Based on the masking strength values (φ_F and φ_MB) and the reference masking strength value (φ_R), the visual-masking process calculates masked QP values at the frame level and at the macroblock level by using the two functions CalcMQP and CalcMQPforMB. The pseudo-code of these two functions is as follows:

In the above functions, β_F and β_MB can be preset constants or can be adapted to local statistics.
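The pseudo-code for CalcMQP is not reproduced here. A plausible sketch, assuming the masked QP is the nominal QP offset in proportion to the frame's masking strength relative to the reference strength and then clipped to the valid QP range (this per-frame formula is an assumption, not the patent's exact function), is:

```python
def calc_mqp(qp_nom, phi, phi_ref, beta_f=0.05, qp_min=0, qp_max=51):
    """Sketch of CalcMQP: frames that mask more than the reference strength
    can tolerate a higher QP; frames that mask less get a lower QP.
    beta_f is a placeholder strength-to-QP scale; the result is clipped to
    the valid H.264 QP range [0, 51]."""
    mqp = qp_nom + beta_f * (phi - phi_ref)
    return max(qp_min, min(qp_max, round(mqp)))
```

A macroblock-level CalcMQPforMB would have the same shape, using φ_MB(k,m) and β_MB in place of φ_F(k) and β_F.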
IV. Multi-Pass Encoding

Figure 1 illustrates a process 100 that conceptually represents the multi-pass encoding method of some embodiments of the invention. As shown in that figure, process 100 has three stages, which are described in the following three sub-sections.

A. Analysis and Initial QP Selection
As shown in Figure 1, process 100 initially computes (at step 105), during the initial analysis stage of the multi-pass encoding process (i.e., during pass 0), an initial value of the reference masking strength (φ_R(1)) and an initial value of the nominal quantization parameter (QP_Nom(1)). The initial reference masking strength φ_R(1) is used during the first search stage, and the initial nominal quantization parameter QP_Nom(1) is used during the first pass of the first search stage (i.e., during pass 1 of the multi-pass encoding process).

At the beginning of pass 0, the value of φ_R(0) can be some arbitrary value or a value selected based on experimental results (e.g., the median of the typical range of φ_R values). During the analysis of the sequence, the masking strength φ_F(k) is computed for every frame; at the end of pass 0, the reference masking strength φ_R(1) is then set equal to avg(φ_F(k)). Other choices for the reference masking strength φ_R are also possible. For example, it can be computed as the median of the values φ_F(k) or as some other arithmetic function of them, such as a weighted average of the values φ_F(k).
There are several methods, of varying complexity, for performing the initial QP selection. For example, the initial nominal QP can be chosen as an arbitrary value (e.g., 26). Alternatively, a value known from coding experiments to produce acceptable quality at the target bit rate can be selected.

The initial nominal QP value can also be selected from a lookup table based on spatial resolution, frame rate, spatial/temporal complexity, and target bit rate. In some embodiments, this initial nominal QP value is selected from the table using a distance metric that depends on each of these parameters, or using a weighted distance metric over these parameters.

The initial nominal QP value can also be set to the adjusted mean of the frame QP values as selected by a rate controller during a fast (unmasked) encoding, where the mean is adjusted based on the bit-rate percentage error E_0 of pass 0. Similarly, the initial nominal QP can be set to a weighted adjusted mean of the frame QP values, where the weight of each frame is determined by the percentage of macroblocks in that frame that are not encoded as skipped macroblocks. Alternatively, the initial nominal QP can be set to the adjusted mean or the adjusted weighted mean of the frame QP values as selected by the rate controller during a fast (masked) encoding, while accounting for the effect of changing the reference masking strength from φ_R(0) to φ_R(1).
B. The Fast-Search Stage: Nominal QP Adjustment
After step 105, the multi-pass encoding process 100 enters the first search stage. In the first search stage, process 100 performs N_1 encodings of the sequence, where N_1 represents the number of passes through the first search stage. During each pass of the first stage, the process uses a varying nominal quantization parameter with a constant reference masking strength.

In particular, during each pass p of the first search stage, process 100 computes (at step 107) the masked quantization parameter MQP_p(k) of each frame k, and the masked quantization parameter MQP_MB(p)(k,m) of each individual macroblock m in frame k. The computation of the parameters MQP_p(k) and MQP_MB(p)(k,m) given a nominal quantization parameter QP_Nom(p) and a reference masking strength φ_R(p) is described in Section III (where MQP_p(k) and MQP_MB(p)(k,m) are computed by using the functions CalcMQP and CalcMQPforMB, which are described in Section III above). In the first pass through step 107 (i.e., pass 1), the nominal quantization parameter and the reference masking strength of the first stage are the parameters QP_Nom(1) and φ_R(1), which were computed during the initial analysis stage 105.
After step 107, the process encodes the sequence (at step 110) based on the quantization parameter values computed at step 107. Next, the encoding process 100 determines (at step 115) whether it should terminate. Different embodiments have different conditions for terminating the entire encoding process. Examples of exit conditions for completely terminating the multi-pass encoding process include:

● |E_p| < ε, where ε is the error tolerance on the final bit rate.

● QP_Nom(p) is at the upper or lower boundary of the valid range of QP values.

● The number of passes has exceeded the maximum allowed number of passes P_MAX.

Some embodiments may use all of these exit conditions, while other embodiments may use only some of them. Yet other embodiments may use other exit conditions for terminating the encoding process.
When the multi-pass encoding process determines (at step 115) that it should terminate, process 100 skips the second search stage and transitions to step 145. At step 145, the process saves the bitstream from the last pass p as the final result, and then terminates.
On the other hand, when the process determines (at step 115) that it should not terminate, it then determines (at step 120) whether the first search stage should terminate. Again, different embodiments have different conditions for terminating the first search stage. Examples of exit conditions for terminating the first search stage of the multi-pass encoding process include:

● QP_Nom(p+1) is identical to QP_Nom(q) for some q ≤ p (in this case, the error in the bit rate cannot be reduced any further by modifying the nominal QP).

● |E_p| < ε_C, with ε_C > ε, where ε_C is the error tolerance on the bit rate for the first search stage.

● The number of passes has exceeded P_1, where P_1 is less than P_MAX.

● The number of passes has exceeded P_2, which is less than P_1, and |E_p| < ε_2, with ε_2 > ε_C.

Some embodiments may use all of these exit conditions, while other embodiments may use only some of them. Yet other embodiments may use other exit conditions for terminating the first search stage.
When the multi-pass encoding process determines (at step 120) that the first search stage should terminate, process 100 proceeds to the second search stage, which is described in the next sub-section. On the other hand, when the process determines (at step 120) that the first search stage should not terminate, it updates (at step 125) the nominal QP for the next pass in the first search stage (i.e., it defines QP_Nom(p+1)). In some embodiments, QP_Nom(p+1) is updated as follows. At the end of pass 1, these embodiments define:

QP_Nom(p+1) = QP_Nom(p) + χ·E_p,

where χ is a constant. At the end of each pass from pass 2 to pass N_1, these embodiments then define:

QP_Nom(p+1) = InterpExtrap(0, E_q1, E_q2, QP_Nom(q1), QP_Nom(q2)),

where InterpExtrap is a function that is further described below. Also, in the above formula, q1 and q2 are the passes with the smallest bit-rate errors among all passes up to pass p, and q1, q2, and p satisfy the following relation:

1 ≤ q1 < q2 ≤ p
The pseudo-code of the InterpExtrap function is given below. Note that if x does not lie between x1 and x2, the function performs extrapolation; otherwise, it performs interpolation.
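The InterpExtrap pseudo-code is not reproduced above. Based on its usage (estimating the value at x = 0 from two (error, QP) samples), a straightforward linear interpolation/extrapolation sketch is:

```python
def interp_extrap(x, x1, x2, y1, y2):
    """Linear interpolation when x lies between x1 and x2, and linear
    extrapolation otherwise. Degenerate samples (x1 == x2) fall back to y2
    (an assumed guard; the patent's exact handling is not shown)."""
    if x1 == x2:
        return y2
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)
```

In the update step above, QP_Nom(p+1) = InterpExtrap(0, E_q1, E_q2, QP_Nom(q1), QP_Nom(q2)) estimates the nominal QP at which the bit-rate error would be zero.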
The nominal QP value is typically rounded to an integer value and limited to the valid range of QP values. One of ordinary skill will realize that other embodiments may compute QP_Nom(p+1) differently from the method described above.

After step 125, the process transitions back to step 107 to start the next pass (i.e., p := p+1). For this pass, it computes (at step 107) the masked quantization parameter MQP_p(k) of each frame k for the current pass p, and the masked quantization parameter MQP_MB(p)(k,m) of each individual macroblock m in frame k. Next, the process encodes the frame sequence (at step 110) based on these newly computed quantization parameters. The process then transitions from step 110 to step 115, which was described above.
C. The Second Search Stage: Reference Masking Strength Adjustment
When process 100 determines (at step 120) that the first search stage should terminate, it transitions to step 130. In the second search stage, process 100 performs N_2 encodings of the sequence, where N_2 represents the number of passes through the second search stage. During each pass, the process uses the same nominal quantization parameter and a changing reference masking strength.
At step 130, process 100 computes the reference masking strength φ_R(p+1) for the next pass, i.e., pass p+1, which is pass N_1+1. In pass N_1+1, process 100 encodes the frame sequence at step 135. Different embodiments compute the reference masking strength φ_R(p+1) at the end of pass p (step 130) in different ways. Two alternative approaches are described below.

Some embodiments compute the reference masking strength φ_R(p+1) based on the bit-rate error and the φ_R value from the previous pass. For example, at the end of pass N_1, some embodiments define:

φ_R(N1+1) = φ_R(N1) + φ_R(N1) × Konst × E_N1.

At the end of pass N_1+m, where m is an integer greater than 1, some embodiments define:

φ_R(N1+m) = InterpExtrap(0, E_N1+m-2, E_N1+m-1, φ_R(N1+m-2), φ_R(N1+m-1)),

or some embodiments define:

φ_R(N1+m) = InterpExtrap(0, E_N1+m-q2, E_N1+m-q1, φ_R(N1+m-q2), φ_R(N1+m-q1)),

where q1 and q2 are the previous passes that produced the best errors.
Other embodiments compute the reference masking strength at the end of each pass in the second search stage by using AMQP, which is defined in Section I. The pseudo-code below for the function GetAvgMaskedQP describes one way of computing AMQP given a nominal QP and some value of φ_R:
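The GetAvgMaskedQP pseudo-code is not reproduced above. A sketch consistent with the surrounding description — AMQP as the average of the per-frame masked QP values for a given nominal QP and φ_R — is shown below; the per-frame masked-QP formula inside it is an assumed stand-in for the patent's CalcMQP function.

```python
def get_avg_masked_qp(qp_nom, phi_ref, frame_strengths, beta_f=0.05):
    """Average masked QP (AMQP) over all frames, for a given nominal QP and
    reference masking strength phi_ref.

    frame_strengths: per-frame masking strengths phi_F(k).
    The per-frame masked QP (nominal QP offset by beta_f * (phi - phi_ref))
    is an assumed stand-in for CalcMQP."""
    masked = [qp_nom + beta_f * (phi - phi_ref) for phi in frame_strengths]
    return sum(masked) / len(masked)
```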
Some embodiments that use AMQP compute the desired AMQP for pass p+1 based on the bit-rate errors and the AMQP values from previous passes. The φ_R(p+1) corresponding to this AMQP is then found through a search process provided by the function Search(AMQP_(p+1), φ_R(p)), whose pseudo-code is given at the end of this sub-section.
For example, some embodiments compute AMQP_N1+1 at the end of pass N_1, where:

AMQP_N1+1 = InterpExtrap(0, E_N1-1, E_N1, AMQP_N1-1, AMQP_N1), when N_1 > 1,

and

AMQP_N1+1 = AMQP_N1, when N_1 = 1.

These embodiments then define:

φ_R(N1+1) = Search(AMQP_N1+1, φ_R(N1))

At the end of pass N_1+m (where m is an integer greater than 1), some embodiments define:

AMQP_N1+m = InterpExtrap(0, E_N1+m-2, E_N1+m-1, AMQP_N1+m-2, AMQP_N1+m-1),

and

φ_R(N1+m) = Search(AMQP_N1+m, φ_R(N1+m-1))
Given a desired AMQP and some default value of φ_R, the φ_R corresponding to the desired AMQP can be found by using the Search function, which in some embodiments has the following pseudo-code:
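The Search pseudo-code is not reproduced above. Based on the surrounding description (iteratively adjusting φ_R until GetAvgMaskedQP returns approximately the desired AMQP), one plausible sketch uses bracketing plus bisection; the bracketing strategy is an assumption, while the tolerance and iteration caps echo the thresholds mentioned in the text (0.05, 10, 12).

```python
def search_phi_r(target_amqp, phi_r0, avg_masked_qp, tol=0.05, max_iter=12):
    """Find a phi_R whose AMQP is within `tol` of target_amqp by bisection.

    avg_masked_qp: callable phi_R -> AMQP, assumed monotonically decreasing
    in phi_R (a larger reference strength lowers the masked QPs)."""
    lo, hi = phi_r0 / 2.0, phi_r0 * 2.0
    # Grow the bracket until the target lies inside it (at most 10 doublings).
    for _ in range(10):
        if avg_masked_qp(lo) >= target_amqp >= avg_masked_qp(hi):
            break
        lo, hi = lo / 2.0, hi * 2.0
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        amqp = avg_masked_qp(mid)
        if abs(amqp - target_amqp) < tol:
            return mid
        if amqp > target_amqp:
            lo = mid    # AMQP still too high -> need a larger phi_R
        else:
            hi = mid
    return (lo + hi) / 2.0
```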
In the above pseudo-code, the numbers 10, 12, and 0.05 can be replaced with appropriately chosen thresholds.
After computing (at step 130) the reference masking strength for the next pass (pass p+1) with which to encode the frame sequence, process 100 transitions to step 132 and starts the next pass (i.e., p := p+1). During each encoding pass p, the process computes (at step 132), for each frame k and each macroblock m, the masked quantization parameter MQP_p(k) of frame k and the masked quantization parameter MQP_MB(p)(k,m) of each individual macroblock m in frame k. The computation of the parameters MQP_p(k) and MQP_MB(p)(k,m) given a nominal quantization parameter QP_Nom(p) and a reference masking strength φ_R(p) is described in Section III (where MQP_p(k) and MQP_MB(p)(k,m) are computed by using the functions CalcMQP and CalcMQPforMB, which are described in Section III above). During the first pass through step 132, the reference masking strength is the value just computed at step 130. Also, during the second search stage, the nominal QP remains constant throughout the stage. In some embodiments, the nominal QP used in the second search stage is the nominal QP obtained from the best encoding solution during the first search stage (i.e., from the encoding scheme with the lowest bit-rate error).
After step 132, the process encodes the frame sequence (at step 135) using the quantization parameters computed at step 132. After step 135, the process determines (at step 140) whether the second search stage should terminate. Different embodiments use different conditions for terminating the second search stage at the end of pass p. Examples of such conditions are:

● |E_p| < ε, where ε is the error tolerance on the final bit rate.

● The number of passes has exceeded the maximum allowed number of passes P_MAX.

Some embodiments may use all of these exit conditions, while other embodiments may use only some of them. Yet other embodiments may use other exit conditions for terminating the second search stage.
When process 100 determines (at step 140) that the second search stage should not terminate, it returns to step 130 to recompute the reference masking strength for the next encoding pass. From step 130, the process transitions to step 132 to compute the quantization parameters, and then to step 135 to encode the video sequence using the newly computed quantization parameters.

On the other hand, when the process determines (at step 140) that the second search stage should terminate, it transitions to step 145. At step 145, process 100 saves the bitstream from the last pass p as the final result, and then terminates.
V. Decoder Input Buffer Underflow Control
Some embodiments of the invention provide a multi-pass encoding process that examines various encodings of a video sequence at a target bit rate in order to identify an optimal encoding scheme with respect to the use of the input buffer of a decoder. In some embodiments, this multi-pass process follows the multi-pass encoding process 100 of Figure 1.

The usage of the decoder input buffer ("decoder buffer") varies to some extent during the decoding of a sequence of encoded pictures (e.g., frames), due to variations in factors such as the sizes of the encoded pictures, the rate at which the decoder receives the encoded data, the size of the decoder buffer, and the speed of the decoding process.

The decoder buffer underflows when the decoder is ready to decode the next picture before that picture has completely arrived at the decoder. The multi-pass encoder of some embodiments models the decoder buffer and re-encodes selected segments of the sequence to prevent decoder buffer underflow.
Figure 2 conceptually illustrates an encoding system 200 of some embodiments of the invention. The system includes a decoder 205 and an encoder 210. In this figure, the encoder 210 has several components that simulate the operation of corresponding components of the decoder 205.

In particular, the decoder 205 has an input buffer 215, a decoding process 220, and an output buffer 225. The encoder 210 simulates these modules by maintaining a simulated decoder input buffer 230, a simulated decoding process 235, and a simulated decoder output buffer 240. So as not to obscure the description of the invention, Figure 2 is simplified to show the decoding process 220 and the encoding process 245 each as a single block. Also, in some embodiments the simulated decoding process 235 and the simulated decoder output buffer 240 are not used for buffer underflow management, and thus are shown in this figure only for illustration.
The decoder maintains the input buffer 215 to smooth out variations in the rate and arrival times of the incoming encoded pictures. If the decoder runs out of data (underflow) or fills up the input buffer (overflow), there will be visible decoding discontinuities, such as stalls in picture decoding or dropped input data. Both situations are undesirable.

To eliminate underflow conditions, the encoder 210 in some embodiments first encodes the video sequence and stores it in memory. For example, the encoder 210 uses the multi-pass encoding process 100 to obtain a first encoding of the picture sequence. It then simulates the decoder input buffer 215 and re-encodes the pictures that may cause buffer underflow. After all buffer underflow conditions have been eliminated, the re-encoded pictures are supplied to the decoder 205 through the connection 255, which can be a network connection (Internet, cable, PSTN line, etc.), a direct non-network connection, media (a DVD, etc.), and so on.
Figure 3 illustrates an encoding process 300 of the encoder of some embodiments. This process tries to find an optimal encoding scheme that does not cause decoder buffer underflow. As shown in Figure 3, process 300 identifies (at step 302) a first encoding of the picture sequence that meets a desired target bit rate (e.g., an average bit rate over the pictures of the sequence that meets a desired average target bit rate). For example, process 300 can use (at step 302) the multi-pass encoding process 100 to obtain the first encoding of the picture sequence.

After step 302, the encoding process 300 models the decoder input buffer 215 (at step 305) by considering various factors, such as the connection speed (i.e., the rate at which the decoder receives the encoded data), the size of the decoder input buffer, the sizes of the encoded pictures, the decoding process speed, and so on. At step 310, process 300 determines whether any segment of the encoded pictures would cause the decoder input buffer to underflow. The techniques that the encoder uses to determine (and subsequently eliminate) underflow conditions are further described below.

If process 300 determines (at step 310) that the encoded pictures do not cause an underflow condition, the process ends. On the other hand, if process 300 determines (at step 310) that a buffer underflow condition exists in some segment of the encoded pictures, it modifies the encoding parameters (at step 315) based on the values of these parameters from the previous encoding pass. The process then re-encodes (at step 320) the segment that has the underflow, in order to reduce the bit size of this segment. After re-encoding this segment, process 300 examines (at step 325) the segment to determine whether the underflow condition has been eliminated.

When the process determines (at step 325) that the segment would still cause an underflow, process 300 transitions back to step 315 to modify the encoding parameters further in order to eliminate the underflow. Alternatively, when the process determines (at step 325) that the segment would not cause any underflow, the process designates (at step 330) the starting point for re-examining and re-encoding the video sequence as the frame after the end of the segment that was re-encoded in the last iteration of step 320. Next, at step 335, the process re-encodes the portion of the video sequence designated at step 330, from the underflow segment addressed (and eliminated) at steps 315 and 320 up to the first subsequent IDR frame. After step 335, the process transitions back to step 305 to simulate the decoder buffer in order to determine whether the remaining video sequence would still cause buffer underflow after the re-encoding. The flow of process 300 from step 305 onward was described above.
A. Identifying Underflow Segments in an Encoded Picture Sequence
As mentioned above, the encoder models the decoder buffer conditions to determine whether any segment in the sequence of encoded or re-encoded pictures would cause an underflow of the decoder buffer. In some embodiments, the encoder uses a simulation model that takes into account the sizes of the encoded pictures, network conditions such as bandwidth, and decoder factors (e.g., the input buffer size, the initial and nominal removal times of the pictures, the decoding time and display time of each picture, etc.).

In some embodiments, the decoder input buffer conditions are modeled by using the MPEG-4 AVC Coded Picture Buffer (CPB). CPB is a term used in the MPEG-4 H.264 standard that refers to the simulated input buffer of the Hypothetical Reference Decoder (HRD). The HRD is a hypothetical decoder model that specifies constraints on the variability of conforming data streams that the encoding process may produce. The CPB model is well known, and is described in sub-section 1 below for convenience. A detailed description of the CPB and the HRD can be found in the ITU-T Recommendation draft and Final Draft International Standard of the Joint Video Specification (ITU-T Rec. H.264 / ISO/IEC 14496-10 AVC).
1. Using the CPB to Model the Decoder Buffer

The following paragraphs describe how the CPB is used to model the decoder input buffer in some embodiments. The time at which the first bit of picture n begins to enter the CPB is called the initial arrival time t_ai(n), which is derived as follows:
● t_ai(0) = 0, when the picture is the first picture (i.e., picture 0);

● t_ai(n) = Max(t_af(n-1), t_ai,earliest(n)), when the picture is not the first picture of the sequence being encoded or re-encoded (i.e., n > 0).

In the above formula:

● t_ai,earliest(n) = t_r,n(n) − initial_cpb_removal_delay,

where t_r,n(n) is the nominal removal time of picture n from the CPB, as specified below, and initial_cpb_removal_delay is the initial buffering period.
The final arrival time of picture n is derived as:

t_af(n) = t_ai(n) + b(n) / BitRate,

where b(n) is the size of picture n in bits.
In some embodiments, the encoder computes the nominal removal times itself, as described below, rather than reading them from the optional sections of the bitstream as in the H.264 specification. For picture 0, the nominal time at which the picture is removed from the CPB is specified as:

t_r,n(0) = initial_cpb_removal_delay

For picture n (n > 0), the nominal time at which the picture is removed from the CPB is specified as:

t_r,n(n) = t_r,n(0) + sum_{i=0 to n-1}(t_i),

where t_r,n(n) is the nominal removal time of picture n, and t_i is the display duration of picture i.
The removal time of picture n is specified as follows:

● t_r(n) = t_r,n(n), when t_r,n(n) ≥ t_af(n);

● t_r(n) = t_af(n), when t_r,n(n) < t_af(n).

The latter case indicates that the size b(n) of picture n is so large that it prevents removal at the nominal removal time.
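The arrival-time and removal-time recurrences above can be sketched directly. This is a simplified simulation, assuming a constant bit rate and given per-picture display durations.

```python
def cpb_times(sizes_bits, durations, bit_rate, initial_cpb_removal_delay):
    """Compute (t_ai, t_af, t_rn) for each picture per the CPB model above.

    sizes_bits: b(n), coded size of each picture in bits.
    durations:  t_i, display duration of each picture in seconds.
    Returns three lists: initial arrival, final arrival, and nominal
    removal times."""
    t_ai, t_af, t_rn = [], [], []
    for n, b in enumerate(sizes_bits):
        # Nominal removal time: t_rn(0) plus the display durations so far.
        rn = initial_cpb_removal_delay + sum(durations[:n])
        t_rn.append(rn)
        earliest = rn - initial_cpb_removal_delay        # t_ai,earliest(n)
        ai = 0.0 if n == 0 else max(t_af[n - 1], earliest)
        t_ai.append(ai)
        t_af.append(ai + b / bit_rate)                   # t_af(n)
    return t_ai, t_af, t_rn
```

A picture whose final arrival time exceeds its nominal removal time (t_rn(n) < t_af(n)) is exactly the late-arrival case discussed above.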
2. Detection of Underflow Segments
As described in the previous sub-section, the encoder can simulate the state of the decoder input buffer to obtain the number of bits in the buffer at any given time instant. Alternatively, the encoder can track how each individual picture changes the decoder input buffer state through the difference between its nominal removal time and its final arrival time (i.e., t_b(n) = t_r,n(n) − t_af(n)). When t_b(n) is less than 0, the buffer will be in underflow between time instants t_r,n(n) and t_af(n), and may encounter underflow before t_r,n(n) and after t_af(n).

Pictures that are directly in underflow can easily be found by testing whether t_b(n) is less than 0. However, a picture whose t_b(n) is less than 0 does not necessarily cause the underflow, and conversely, the picture that causes an underflow does not necessarily have t_b(n) less than 0. Some embodiments therefore define an underflow segment as a stretch of consecutive pictures (in decoding order) that continuously drains the decoder input buffer until the underflow reaches its lowest point.
Figure 4 is a plot of t_b(n), the difference between the nominal removal time and the final arrival time, versus picture number, in some embodiments. The curve is plotted for a sequence of 1500 encoded pictures. Figure 4a marks the start and end of an underflow segment with arrows. Note that after the first underflow segment, another underflow segment also occurs in Figure 4a; for simplicity, it is not explicitly marked with arrows.
Fig. 5 illustrates that encoder is for performing the process 500 of the underflow detection operation at step 305 place.First process 500 determines the t final time of advent of (step 505) each image by interpretive simulation decoder input buffer described above condition afwith nominal removal time t r, n.Note, may the some time be called as due in the iterative process that this process manage at buffer underflow, its reception image number check image sequence as starting point and from this given starting point.It is evident that, for first time iteration, this starting point is first image in sequence.
At step 510, process 500 compares the final arrival time of each image at the decoder input buffer with the nominal removal time of that image. If the process finds no image whose final arrival time falls after its nominal removal time (i.e., no underflow condition exists), the process exits. On the other hand, when it finds an image whose final arrival time falls after its nominal removal time, the process determines that an underflow exists and transitions to step 515 to identify the underflow segment.
At step 515, process 500 identifies the underflow segment as the stretch of images over which the decoder buffer depletes continuously until the next global minimum, after which the underflow condition starts to improve (i.e., t_b(n) does not become more negative over the remainder of the segment). Process 500 then exits. In some embodiments, the start of the underflow segment is further adjusted to begin with an I frame, which is the intra-coded image that marks the beginning of a group of related images. Once one or more segments that cause the underflow are identified, the encoder proceeds to eliminate the underflow. Section B below describes the elimination of underflow in the single-segment case (i.e., when the entire encoded image sequence contains only a single underflow segment). Section C then describes underflow elimination for multiple underflow segments.
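The segment search of steps 510-515 can be sketched as below, operating on margins t_b(n) produced by a prior buffer simulation. The function name, the default arguments, and the list of I-frame indices are illustrative assumptions, not details from the patent.

```python
def find_underflow_segment(margins, i_frames=()):
    """Locate the first underflow segment in a list of t_b(n) margins.

    The segment starts at the nearest local maximum preceding the first
    negative margin and ends where t_b(n) stops becoming more negative
    (the lowest point of the dip).  Optionally snap the start back to
    the nearest preceding I frame, as some embodiments do.
    """
    try:
        first_neg = next(n for n, t in enumerate(margins) if t < 0)
    except StopIteration:
        return None                      # no underflow condition
    # Walk back to the local maximum preceding the zero crossing.
    start = first_neg
    while start > 0 and margins[start - 1] >= margins[start]:
        start -= 1
    # Walk forward to the minimum point of the dip.
    end = first_neg
    while end + 1 < len(margins) and margins[end + 1] < margins[end]:
        end += 1
    # Align the segment start with a preceding I frame, if one is given.
    preceding = [i for i in i_frames if i <= start]
    if preceding:
        start = max(preceding)
    return start, end

seg = find_underflow_segment([0.4, 0.6, 0.2, -0.1, -0.3, -0.2], i_frames=[0])
```

With the toy margins above, the dip begins after the local maximum at index 1, bottoms out at index 4, and the I-frame adjustment pulls the start back to index 0.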
B. Single-Segment Underflow Elimination
With reference to Fig. 4(a), if the curve of t_b(n) versus n crosses the n axis only once with a descending slope, there is only one underflow segment in the entire sequence. This underflow segment starts at the nearest local maximum preceding the zero-crossing point and ends at the next global minimum between the zero-crossing point and the end of the sequence. If the buffer recovers from the underflow, the end point of the segment may be followed by another zero crossing of the curve with an ascending slope.
Fig. 6 illustrates a process 600 that some embodiments use (at steps 315, 320 and 325) to eliminate an underflow condition within a single segment of images. At step 605, process 600 estimates the total number of bits (ΔB) to be removed from the underflow segment, based on the input bit rate into the buffer and the longest delay (e.g., the minimum value of t_b(n), found at the end of the segment).
Next, at step 610, process 600 uses the average masked frame QP (AMQP) and the total number of bits in the current segment from the previous encoding pass (or passes) to estimate the desired AMQP for achieving the desired bit count for this segment, B_t = B - ΔB_p, where p is the current iteration number of process 600 for this segment. If this iteration is the first iteration of process 600 for this particular segment, the AMQP and total bit count are those obtained from the initial encoding solution identified at step 302. On the other hand, when this iteration is not the first iteration of process 600, these parameters are derived from the encoding scheme or solution obtained in the previous pass or passes of process 600.
Next, at step 615, process 600 uses the desired AMQP to modify the masked frame QP of each image, MQP(n), based on its masking strength φ_F(n), so that images that can withstand more masking receive larger bit reductions. The process then re-encodes (at step 620) the video segment based on the adjusted parameters. The process then checks (at step 625) the segment to determine whether the underflow condition has been eliminated. Fig. 4(b) illustrates the elimination of the underflow condition of Fig. 4(a) after process 600 has been applied to the underflow segment and the segment has been re-encoded. When the underflow condition is eliminated, the process exits. Otherwise, the process transitions back to step 605 to adjust the encoding parameters further to reduce the total bit size.
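One hedged reading of steps 610-615 is sketched below: raise the per-image masked QPs toward a new target AMQP, giving each image a share of the increase proportional to its masking strength. The linear scaling rule and the assumption that the required AMQP grows with the fraction of bits to remove are illustrative simplifications, not the patent's exact formulas.

```python
def adjust_masked_qps(mqp, phi, bits_used, delta_b):
    """Raise per-image masked QPs to shed roughly delta_b bits.

    mqp:  masked frame QP per image from the previous pass, MQP(n).
    phi:  masking strength per image, phi_F(n); stronger masking means
          the image can hide more quantization error, so it absorbs a
          larger share of the QP increase.
    Assumption: the needed AMQP increase scales with the bit reduction,
    targeting B_t = bits_used - delta_b.
    """
    amqp = sum(mqp) / len(mqp)
    target_amqp = amqp * bits_used / (bits_used - delta_b)
    total_phi = sum(phi) or 1.0
    boost = (target_amqp - amqp) * len(mqp)   # total QP increase to spread
    return [q + boost * (w / total_phi) for q, w in zip(mqp, phi)]

new_qps = adjust_masked_qps([20.0, 22.0, 24.0], [1.0, 2.0, 1.0],
                            bits_used=120_000, delta_b=20_000)
```

Here the middle image, having twice the masking strength, absorbs twice the QP increase of its neighbors, while the average of the new QPs lands on the target AMQP.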
C. Multi-Segment Underflow Elimination
When there are multiple underflow segments in the sequence, re-encoding one segment changes the buffer fullness times t_b(n) of all subsequent frames. To account for the modified buffer condition, the encoder searches for one underflow segment at a time, starting from the first zero-crossing point with a descending slope (i.e., at point n).
The underflow segment starts at the nearest local maximum preceding this zero-crossing point and ends at the next global minimum between this zero-crossing point and the next one (or the end point of the sequence, if there are no more zero crossings). After finding a segment, the encoder removes the underflow in that segment, ideally by setting t_b(n) to 0 at the end of the segment, and re-runs the buffer simulation over all frames of the sequence to estimate the updated buffer fullness.
The encoder then continues to search for the next segment using the modified buffer fullness. Once all underflow segments have been identified as above, the encoder derives an AMQP and, as in the single-segment case, modifies the masked frame QP of each segment independently of the other segments.
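The multi-segment procedure of this section can be sketched as a loop that alternates segment search, re-encoding, and re-simulation. The callables `simulate`, `find_segment`, and `reencode_segment` stand in for the buffer simulation, process 500, and process 600 described above; they are assumed interfaces, and the toy demo's "re-encoding" (halving the flagged frames' bits) is purely illustrative.

```python
def eliminate_underflows(frame_bits, simulate, reencode_segment,
                         find_segment, max_passes=10):
    """Repeatedly find and repair underflow segments, one at a time.

    simulate(bits)               -> list of margins t_b(n)
    find_segment(margins)        -> (start, end) of next segment, or None
    reencode_segment(bits, s, e) -> new per-image bit counts for [s, e]
    After each repair, the whole sequence is re-simulated so that later
    segment boundaries reflect the updated buffer fullness.
    """
    bits = list(frame_bits)
    for _ in range(max_passes):
        margins = simulate(bits)
        seg = find_segment(margins)
        if seg is None:
            return bits                  # no remaining underflow
        s, e = seg
        bits[s:e + 1] = reencode_segment(bits, s, e)
    return bits

# Toy demo: constant 300 Kb/s channel, 30 fps, 0.5 s initial delay;
# "re-encoding" simply halves the bits of the flagged frames.
demo = eliminate_underflows(
    [90000, 90000, 30000],
    simulate=lambda b: [0.5 - sum(b[:n + 1]) / 300_000 + n / 30
                        for n in range(len(b))],
    reencode_segment=lambda b, s, e: [x // 2 for x in b[s:e + 1]],
    find_segment=lambda m: next(((n, n) for n, t in enumerate(m)
                                 if t < 0), None))
```

In the demo, frame 1 arrives late on the first pass; after its bits are halved and the sequence is re-simulated, every margin is non-negative and the loop stops.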
Those of ordinary skill will appreciate that other embodiments can be realized differently. For example, some embodiments might not identify every segment that causes the decoder's input buffer to underflow. Instead, these embodiments can perform the buffer simulation described above to identify only the first segment that causes an underflow. After identifying such a segment, they modify the segment to correct the underflow condition within it, and then continue the encoding beyond the corrected portion. After encoding the remainder of the sequence, these embodiments repeat this process for the next underflow segment.
D. Applications of Buffer Underflow Management
The decoder buffer underflow techniques described above are applicable to numerous encoding and decoding systems. Several examples of such systems are described below.
Fig. 7 illustrates a video streaming server 710 connected through a network 705 to several client decoders 715-725. The clients connect to network 705 through links of different bandwidths, such as 300 Kb/s and 3 Mb/s. Video streaming server 710 controls the stream of encoded video images from an encoder 730 to the client decoders 715-725.
The streaming server can decide to stream the encoded video images using the lowest bandwidth in the network (i.e., 300 Kb/s) and the smallest client buffer size. In that case, streaming server 710 needs only one group of encoded images, optimized for a target bit rate of 300 Kb/s. Alternatively, the server can generate and store different encodings optimized for different bandwidths and different client buffer conditions.
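The server-side choice described above, selecting from stored encodings one whose target bit rate and buffer requirement fit a given client, might look like the sketch below. The catalog structure, field names, and numbers are illustrative assumptions; the patent does not specify this interface.

```python
def pick_encoding(encodings, link_bps, client_buffer_bits):
    """Choose the highest-rate stored encoding that a client can
    receive without risking input-buffer underflow.

    encodings: list of dicts with 'bit_rate' and 'min_buffer' fields,
    each assumed to have been validated against its target rate by the
    multi-pass underflow management described above.
    """
    usable = [e for e in encodings
              if e["bit_rate"] <= link_bps
              and e["min_buffer"] <= client_buffer_bits]
    return max(usable, key=lambda e: e["bit_rate"]) if usable else None

catalog = [{"bit_rate": 300_000, "min_buffer": 200_000},
           {"bit_rate": 3_000_000, "min_buffer": 1_500_000}]
choice = pick_encoding(catalog, link_bps=3_000_000,
                       client_buffer_bits=2_000_000)
```

A client on the 3 Mb/s link with a large buffer gets the high-rate encoding, while the 300 Kb/s client falls back to the low-rate one.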
Fig. 8 illustrates another application example of decoder underflow management. In this example, an HD-DVD player 805 receives encoded video images from an HD-DVD 840 that stores video data encoded by a video encoder 810. HD-DVD player 805 has an input buffer 815, a set of decoder modules shown for simplicity as component 820, and an output buffer 825.
The output of player 805 is sent to a display device, such as a television 830 or a computer display terminal 835. The HD-DVD player can have a very high bandwidth, such as 29.4 Mb/s. To maintain high-quality images on the display device, the encoder ensures that the video images are encoded in such a way that no segment of the image sequence is so large that it cannot be delivered to the decoder input buffer in time.
VI. Computer System
Fig. 9 illustrates a computer system with which one embodiment of the present invention is realized. Computer system 900 includes a bus 905, a processor 910, a system memory 915, a read-only memory 920, a permanent storage device 925, input devices 930, and output devices 935. Bus 905 collectively represents all system, peripheral, and chipset buses that communicatively connect the numerous internal devices of computer system 900. For example, bus 905 communicatively connects processor 910 with read-only memory 920, system memory 915, and permanent storage device 925.
To perform the various processes of the present invention, processor 910 retrieves instructions to execute and data to process from these various memory units. Read-only memory (ROM) 920 stores static data and instructions that are needed by processor 910 and other modules of the computer system.
Permanent storage device 925, on the other hand, is a read-and-write memory device. This device is a non-volatile memory unit that stores instructions and data even when computer system 900 is off. Some embodiments of the present invention use a mass storage device (such as a magnetic or optical disk and its corresponding disk drive) as permanent storage device 925.
Other embodiments use a removable storage device (such as a floppy disk or a compact disc, and its corresponding disk drive) as the permanent storage device. Like permanent storage device 925, system memory 915 is a read-and-write memory device. However, unlike storage device 925, the system memory is a volatile read-and-write memory, such as random access memory. The system memory stores some of the instructions and data that the processor needs at runtime. In some embodiments, the various processes of the present invention are stored in system memory 915, permanent storage device 925, and/or read-only memory 920.
Bus 905 also connects to input and output devices 930 and 935. The input devices enable the user to communicate information and select commands to the computer system. Input devices 930 include alphanumeric keyboards and cursor controllers. Output devices 935 display images generated by the computer system. The output devices include printers and display devices, such as cathode ray tubes (CRT) or liquid crystal displays (LCD).
Finally, as shown in Fig. 9, bus 905 also couples computer 900 to a network 965 through a network adapter (not shown). In this manner, the computer can be part of a network of computers (such as a local area network ("LAN"), a wide area network ("WAN"), or an intranet) or a network of networks (such as the Internet). Any or all components of computer system 900 may be used in conjunction with the present invention. However, one of ordinary skill in the art will appreciate that any other system configuration may also be used in conjunction with the present invention.
While the invention has been described with reference to numerous specific details, one of ordinary skill in the art will recognize that the invention can be embodied in other specific forms without departing from the spirit of the invention. For example, instead of the H.264 method of simulating the decoder input buffer, other simulation methods can be used that account for the buffer size, the arrival and removal times of images in the buffer, and the decoding and display times of images.
Several of the embodiments described above calculate a mean-removed SAD to obtain an indication of the image variation in a macroblock. However, other embodiments can identify image variation differently. For example, some embodiments can predict the expected image value of the pixels of a macroblock. These embodiments then generate the macroblock SAD by subtracting this predicted value from the luminance value of each pixel of the macroblock and summing the absolute values of these differences. In some embodiments, this predicted value is based not only on the pixel values within the macroblock, but also on the pixel values in one or more adjacent macroblocks.
Similarly, the embodiments described above directly use the derived spatial and temporal masking values. Other embodiments filter these values before using them, in order to capture the general trend of successive spatial masking values and/or successive temporal masking values across the video images and thereby smooth their application. Thus, one of ordinary skill in the art will appreciate that the invention is not limited to the details set forth above.

Claims (12)

1. A method of encoding a sequence of video images, the method comprising:
encoding said sequence of video images by using a plurality of encoding parameters to produce a current encoding scheme;
from the sequence of video images encoded by the current encoding scheme, identifying a plurality of video images that cause an underflow condition in an input buffer of a reference decoder, by continuously depleting the input buffer until the underflow condition in the input buffer improves, wherein the input buffer of the reference decoder is used during encoding to simulate a decoder input buffer;
adjusting a set of encoding parameters for the identified plurality of video images;
encoding the identified plurality of video images using the adjusted set of encoding parameters to produce a new encoding scheme that is designated as the current encoding scheme; and
repeating said identifying, said adjusting, and said encoding to produce said new encoding scheme, until the encoding of the sequence of video images does not cause the input buffer of the reference decoder to underflow.
2. The method of claim 1, further comprising: when the new encoding scheme does not cause the input buffer to underflow, encoding the remaining images of the sequence of video images, starting from the first image after the plurality of images that caused the underflow.
3. The method of claim 1, wherein said encoding the sequence of video images comprises:
using the input buffer of the reference decoder to simulate a decoder input buffer of a decoder; and
utilizing the simulation to select a number of bits for encoding the sequence of video images while preventing underflow of the input buffer of the reference decoder.
4. The method of claim 3, wherein simulating the decoder input buffer further comprises accounting for the conditions of a network through which the decoder receives encoded data.
5. The method of claim 3, wherein simulating the decoder input buffer further comprises accounting for the size of the decoder input buffer.
6. The method of claim 3, wherein simulating the decoder input buffer further comprises accounting for an initial removal delay from the decoder input buffer.
7. The method of claim 1, further comprising: meeting a target bit rate in each encoding of the plurality of video images, wherein meeting the target bit rate is one of a plurality of quality criteria, and wherein each produced encoding scheme satisfies all of the quality criteria.
8. A method of encoding a sequence of images, the method comprising:
encoding said sequence of images using a set of parameters;
simulating an input buffer of a decoder;
when a plurality of images in the sequence of images causes an underflow condition in the input buffer of the decoder, adjusting the set of parameters by continuously depleting the input buffer until the underflow condition in the input buffer improves; and
encoding the plurality of images that caused the underflow using the adjusted set of parameters, wherein said adjusting the set of parameters and said encoding the plurality of images are performed iteratively until the underflow of the input buffer caused by the plurality of images is eliminated.
9. The method of claim 8, wherein simulating the input buffer comprises accounting for one of a connection speed and a buffer size.
10. The method of claim 8, wherein simulating the input buffer comprises accounting for the speed of the decoding process of the decoder.
11. The method of claim 8, further comprising: after the underflow is eliminated, encoding the next group of images in the sequence of images, after the plurality of images that caused the underflow.
12. An encoding apparatus, comprising means for performing the steps of the method of any one of claims 1-11.
CN201210271659.1A 2004-06-27 2005-06-24 Multi-pass video encoding Active CN102833539B (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US58341804P 2004-06-27 2004-06-27
US60/583,418 2004-06-27
US64391805P 2005-01-09 2005-01-09
US60/643,918 2005-01-09
US11/118,616 US8406293B2 (en) 2004-06-27 2005-04-28 Multi-pass video encoding based on different quantization parameters
US11/118,604 US8005139B2 (en) 2004-06-27 2005-04-28 Encoding with visual masking
US11/118,616 2005-04-28
US11/118,604 2005-04-28

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN2005800063635A Division CN1926863B (en) 2004-06-27 2005-06-24 Multi-pass video encoding method

Publications (2)

Publication Number Publication Date
CN102833539A CN102833539A (en) 2012-12-19
CN102833539B true CN102833539B (en) 2015-03-25

Family

ID=35783274

Family Applications (3)

Application Number Title Priority Date Filing Date
CN201210271592.1A Active CN102833538B (en) 2004-06-27 2005-06-24 Multi-pass video encoding
CN201210271659.1A Active CN102833539B (en) 2004-06-27 2005-06-24 Multi-pass video encoding
CN2005800063635A Active CN1926863B (en) 2004-06-27 2005-06-24 Multi-pass video encoding method

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN201210271592.1A Active CN102833538B (en) 2004-06-27 2005-06-24 Multi-pass video encoding

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN2005800063635A Active CN1926863B (en) 2004-06-27 2005-06-24 Multi-pass video encoding method

Country Status (6)

Country Link
EP (1) EP1762093A4 (en)
JP (2) JP4988567B2 (en)
KR (3) KR100909541B1 (en)
CN (3) CN102833538B (en)
HK (1) HK1101052A1 (en)
WO (1) WO2006004605A2 (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7042943B2 (en) 2002-11-08 2006-05-09 Apple Computer, Inc. Method and apparatus for control of rate-distortion tradeoff by mode selection in video encoders
US8406293B2 (en) 2004-06-27 2013-03-26 Apple Inc. Multi-pass video encoding based on different quantization parameters
US8005139B2 (en) 2004-06-27 2011-08-23 Apple Inc. Encoding with visual masking
US8208536B2 (en) 2005-04-28 2012-06-26 Apple Inc. Method and apparatus for encoding using single pass rate controller
KR100918499B1 (en) * 2007-09-21 2009-09-24 주식회사 케이티 Apparatus and method for multi-pass encoding
CN101855910B (en) * 2007-09-28 2014-10-29 杜比实验室特许公司 Video compression and transmission techniques
EP2101503A1 (en) * 2008-03-11 2009-09-16 British Telecommunications Public Limited Company Video coding
WO2011084918A1 (en) * 2010-01-06 2011-07-14 Dolby Laboratories Licensing Corporation High performance rate control for multi-layered video coding applications
BR112012027960A2 (en) 2010-05-06 2018-05-08 Nippon Telegraph And Telephone Corporation "video encoding control method and apparatus"
EP2568705B1 (en) * 2010-05-07 2018-09-26 Nippon Telegraph And Telephone Corporation Moving image encoding control method, moving image encoding apparatus and moving image encoding program
CN102870415B (en) 2010-05-12 2015-08-26 日本电信电话株式会社 Moving picture control method, moving picture encoder and moving picture program
KR101702562B1 (en) * 2010-06-18 2017-02-03 삼성전자 주식회사 Storage file format for multimedia streaming file, storage method and client apparatus using the same
WO2013095627A1 (en) * 2011-12-23 2013-06-27 Intel Corporation Content adaptive high precision macroblock rate control
US9402082B2 (en) * 2012-04-13 2016-07-26 Sharp Kabushiki Kaisha Electronic devices for sending a message and buffering a bitstream
CN102946542B (en) * 2012-12-07 2015-12-23 杭州士兰微电子股份有限公司 Mirror image video interval code stream recompile and seamless access method and system are write
WO2014120369A1 (en) * 2013-01-30 2014-08-07 Intel Corporation Content adaptive partitioning for prediction and coding for next generation video
US20150071343A1 (en) * 2013-09-12 2015-03-12 Magnum Semiconductor, Inc. Methods and apparatuses including an encoding system with temporally adaptive quantization
US10313675B1 (en) 2015-01-30 2019-06-04 Google Llc Adaptive multi-pass video encoder control
US10917644B2 (en) 2017-02-23 2021-02-09 Netflix, Inc. Iterative techniques for encoding video content
US10742708B2 (en) 2017-02-23 2020-08-11 Netflix, Inc. Iterative techniques for generating multiple encoded versions of a media title
US11153585B2 (en) 2017-02-23 2021-10-19 Netflix, Inc. Optimizing encoding operations when generating encoded versions of a media title
US11166034B2 (en) 2017-02-23 2021-11-02 Netflix, Inc. Comparing video encoders/decoders using shot-based encoding and a perceptual visual quality metric
US10666992B2 (en) 2017-07-18 2020-05-26 Netflix, Inc. Encoding techniques for optimizing distortion and bitrate
CN109756733B (en) * 2017-11-06 2022-04-12 华为技术有限公司 Video data decoding method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1176562A (en) * 1996-09-06 1998-03-18 汤姆森多媒体公司 Quantization process for video encoding
CN1251004A (en) * 1998-10-13 2000-04-19 松下电器产业株式会社 Regulation method and module for compressed bit stream calculation and storage reqirement in video frequency decoder

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05167998A (en) * 1991-12-16 1993-07-02 Nippon Telegr & Teleph Corp <Ntt> Image-encoding controlling method
JP3627279B2 (en) * 1995-03-31 2005-03-09 ソニー株式会社 Quantization apparatus and quantization method
US5956674A (en) * 1995-12-01 1999-09-21 Digital Theater Systems, Inc. Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels
JPH10304311A (en) * 1997-04-23 1998-11-13 Matsushita Electric Ind Co Ltd Video coder and video decoder
DE69830979T2 (en) * 1997-07-29 2006-05-24 Koninklijke Philips Electronics N.V. METHOD AND DEVICE FOR VIDEO CODING WITH VARIABLE BITRATE
US6192075B1 (en) * 1997-08-21 2001-02-20 Stream Machine Company Single-pass variable bit-rate control for digital video coding
KR20010012071A (en) * 1998-02-20 2001-02-15 요트.게.아. 롤페즈 Method and device for coding a sequence of pictures
US6278735B1 (en) * 1998-03-19 2001-08-21 International Business Machines Corporation Real-time single pass variable bit rate control strategy and encoder
US6289129B1 (en) * 1998-06-19 2001-09-11 Motorola, Inc. Video rate buffer for use with push dataflow
US20020057739A1 (en) * 2000-10-19 2002-05-16 Takumi Hasebe Method and apparatus for encoding video
US6594316B2 (en) * 2000-12-12 2003-07-15 Scientific-Atlanta, Inc. Method and apparatus for adaptive bit rate control in an asynchronized encoding system
US6831947B2 (en) * 2001-03-23 2004-12-14 Sharp Laboratories Of America, Inc. Adaptive quantization based on bit rate prediction and prediction error energy
US7062429B2 (en) * 2001-09-07 2006-06-13 Agere Systems Inc. Distortion-based method and apparatus for buffer control in a communication system
JP3753371B2 (en) * 2001-11-13 2006-03-08 Kddi株式会社 Video compression coding rate control device
US7027982B2 (en) * 2001-12-14 2006-04-11 Microsoft Corporation Quality and rate control strategy for digital audio
KR100468726B1 (en) * 2002-04-18 2005-01-29 삼성전자주식회사 Apparatus and method for performing variable bit rate control in real time
JP2004166128A (en) * 2002-11-15 2004-06-10 Pioneer Electronic Corp Method, device and program for coding image information
WO2005011255A2 (en) * 2003-06-26 2005-02-03 Thomson Licensing S.A. Multipass video rate control to match sliding window channel constraints

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1176562A (en) * 1996-09-06 1998-03-18 汤姆森多媒体公司 Quantization process for video encoding
CN1251004A (en) * 1998-10-13 2000-04-19 松下电器产业株式会社 Regulation method and module for compressed bit stream calculation and storage reqirement in video frequency decoder

Also Published As

Publication number Publication date
JP5318134B2 (en) 2013-10-16
CN102833539A (en) 2012-12-19
JP4988567B2 (en) 2012-08-01
JP2008504750A (en) 2008-02-14
HK1101052A1 (en) 2007-10-05
KR20090034992A (en) 2009-04-08
CN1926863A (en) 2007-03-07
KR20070011294A (en) 2007-01-24
KR100909541B1 (en) 2009-07-27
KR100988402B1 (en) 2010-10-18
WO2006004605A3 (en) 2006-05-04
CN1926863B (en) 2012-09-19
WO2006004605A2 (en) 2006-01-12
WO2006004605B1 (en) 2006-07-13
EP1762093A2 (en) 2007-03-14
KR20090037475A (en) 2009-04-15
EP1762093A4 (en) 2011-06-29
JP2011151838A (en) 2011-08-04
CN102833538B (en) 2015-04-22
CN102833538A (en) 2012-12-19
KR100997298B1 (en) 2010-11-29

Similar Documents

Publication Publication Date Title
CN102833539B (en) Multi-pass video encoding
TWI325697B (en) Multi-pass video encoding
US8594190B2 (en) Encoding with visual masking
JP3866538B2 (en) Video coding method and apparatus
JP2008504750A5 (en)
JP4358215B2 (en) Video encoding apparatus and method
US20090323807A1 (en) Enabling selective use of fractional and bidirectional video motion estimation
EP0905981B1 (en) Video signal coding method and coding device
CN111083473B (en) Content self-adaptive video coding method based on machine learning
EP4333433A1 (en) Video coding method and apparatus, and electronic device
CN103517080A (en) Real-time video stream encoder and real-time video stream encoding method
CN107820084B (en) Video perception coding method and device
US6577680B2 (en) Video signal coding method and coding device adapted to control code amounts according to the characteristics of pictures
Grecos et al. On-line improvements of the rate-distortion performance in MPEG-2 rate control
CN110636330B (en) Film source scene complexity analysis method and system based on compressed information
CN111770335A (en) Data coding method and device
CN114513664B (en) Video frame encoding method and device, intelligent terminal and computer readable storage medium
JPH01307382A (en) Encoding control system
KR20060120787A (en) Method of ultra-fine bit count control of video frame

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant