This application is a divisional application of the invention patent application filed on July 14, 2003, with application number 03817031.0 and entitled "Adaptive Weighting of Reference Pictures in Video Decoding".
This application claims priority to U.S. Provisional Patent Application Serial No. 60/395,843 (Attorney Docket No. PU020340), entitled "Adaptive Weighting of Reference Pictures in Video Codec" and filed on July 15, 2002, which is incorporated herein by reference in its entirety. In addition, this application also claims priority to U.S. Provisional Patent Application Serial No. 60/395,874 (Attorney Docket No. PU020339), entitled "Motion Estimation with Weighted Prediction" and also filed on July 15, 2002, which is likewise incorporated herein by reference in its entirety.
Detailed Description of the Embodiments
The invention provides an apparatus and method for motion vector estimation and adaptive reference picture weighting factor assignment. In some video sequences, particularly those with fading, the current picture or image block to be coded is more strongly correlated with a reference picture scaled by a weighting factor than with the reference picture itself. Video codecs that do not apply weighting factors to reference pictures encode fading sequences very inefficiently. When weighting factors are used in encoding, a video encoder needs to determine both the weighting factors and the motion vectors, but the best choice for each depends on the other, and motion estimation is typically the most computationally intensive part of a digital video compression encoder.
In the proposed Joint Video Team ("JVT") video compression standard, each P picture can use multiple reference pictures to form the prediction of the picture, but each individual motion block or 8x8 region of a macroblock uses only a single reference picture for prediction. In addition to coding and transmitting the motion vectors, a reference picture index is transmitted for each motion block or 8x8 region to indicate which reference picture is used. A limited set of possible reference pictures is stored at both the encoder and the decoder, and the number of allowable reference pictures is transmitted.
In the JVT standard, for bi-predictive pictures (also called "B" pictures), two predictors are formed for each motion block or 8x8 region, each of which can be formed from a separate reference picture, and the two predictors are averaged together to form a single averaged predictor. For bi-predictively coded motion blocks, the reference pictures can both be from the forward direction, both be from the backward direction, or one each from the forward and backward directions. Two lists are maintained of the available reference pictures that may be used for prediction. The two reference pictures are referred to as the list 0 and list 1 predictors. An index for each reference picture is coded and transmitted, ref_idx_l0 and ref_idx_l1, for the list 0 and list 1 reference pictures, respectively. Joint Video Team ("JVT") bi-predictive or "B" pictures allow adaptive weighting between the two predictions, that is,
Pred=[(P0)(Pred0)]+[(P1)(Pred1)]+D,
where P0 and P1 are weighting factors, Pred0 and Pred1 are the reference picture predictions for list 0 and list 1, respectively, and D is an offset.
Two methods have been proposed for indicating the weighting factors. In the first method, the weighting factors are determined by the directions used for the reference pictures. In this method, if the ref_idx_l0 index is less than or equal to ref_idx_l1, weighting factors of (1/2, 1/2) are used; otherwise, factors of (2, -1) are used.
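The first method and the weighting formula above can be sketched as follows. This is only an illustration under stated assumptions: the function names `implicit_weights` and `weighted_bipred`, the parameter names, and the use of plain Python lists for samples are hypothetical and are not taken from the standard or from this application.

```python
def implicit_weights(idx_list0, idx_list1):
    """Pick (P0, P1) from the relation between the two reference picture indices."""
    if idx_list0 <= idx_list1:
        return (0.5, 0.5)
    return (2.0, -1.0)


def weighted_bipred(pred0, pred1, p0, p1, d=0.0):
    """Pred = P0*Pred0 + P1*Pred1 + D, applied sample by sample."""
    return [p0 * a + p1 * b + d for a, b in zip(pred0, pred1)]


# Hypothetical two-sample predictions from list 0 and list 1.
p0, p1 = implicit_weights(idx_list0=0, idx_list1=1)
block = weighted_bipred([100, 120], [110, 140], p0, p1)
```

With the (2, -1) pair the second prediction is subtracted, which extrapolates from the two references instead of averaging them.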
In the second proposed method, any number of weighting factors is transmitted for each slice. A weighting factor index is then transmitted for each motion block or 8x8 region of a macroblock that uses bi-directional prediction. The decoder uses the received weighting factor index to choose the appropriate weighting factor from the transmitted set for use when decoding the motion block or 8x8 region. For example, if three weighting factors are sent at the slice layer, they correspond to weighting factor indices 0, 1, and 2, respectively.
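A minimal sketch of the second method follows. The layout of the transmitted set as a list of (P0, P1, D) triples, the sample values, and the name `select_weights` are assumptions made for illustration only.

```python
# Hypothetical per-slice set of (P0, P1, D) triples, in transmission order.
slice_weights = [(0.5, 0.5, 0), (2.0, -1.0, 0), (0.75, 0.25, 4)]


def select_weights(weight_set, weight_index):
    """Return the weighting factors addressed by the per-block index."""
    if not 0 <= weight_index < len(weight_set):
        raise ValueError("weighting factor index outside the transmitted set")
    return weight_set[weight_index]


# A block that signals weighting factor index 2 uses the third entry.
chosen = select_weights(slice_weights, 2)
```

Note that in this scheme every bi-predicted block must carry its own weighting factor index in addition to its reference picture indices.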
The following description merely illustrates the principles of the invention. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and conditional language recited herein are intended principally for teaching purposes only, to aid the reader in understanding the principles of the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents and equivalents developed in the future, that is, any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes that may be substantially represented in computer-readable media and so executed by a computer or processor, whether or not such a computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer according to the circumstances.
In the claims, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function, including, for example, a) a combination of circuit elements that performs that function, or b) software in any form, including, therefore, firmware, microcode, or the like, combined with appropriate circuitry for executing that software to perform the function. The invention as defined by such claims resides in the fact that the functionalities provided by the various recited means are combined and brought together in the manner that the claims call for. The applicant thus regards any means that can provide those functionalities as equivalent to those shown herein.
As shown in Figure 1, a standard video decoder is indicated generally by the reference numeral 100. The video decoder 100 includes a variable length decoder ("VLD") 110 connected in signal communication with an inverse quantizer 120. The inverse quantizer 120 is connected in signal communication with an inverse transformer 130. The inverse transformer 130 is connected in signal communication with a first input of an adder or summing junction 140, where the output of the summing junction 140 provides the output of the video decoder 100. The output of the summing junction 140 is connected in signal communication with a reference picture store 150. The reference picture store 150 is connected in signal communication with a motion compensator 160, which is connected in signal communication with a second input of the summing junction 140.
Turning to Figure 2, a video decoder with adaptive bi-prediction is indicated generally by the reference numeral 200. The video decoder 200 includes a VLD 210 connected in signal communication with an inverse quantizer 220. The inverse quantizer 220 is connected in signal communication with an inverse transformer 230. The inverse transformer 230 is connected in signal communication with a first input of a summing junction 240, where the output of the summing junction 240 provides the output of the video decoder 200. The output of the summing junction 240 is connected in signal communication with a reference picture store 250. The reference picture store 250 is connected in signal communication with a motion compensator 260, which is connected in signal communication with a first input of a multiplier 270.
The VLD 210 is further connected in signal communication with a reference picture weighting factor lookup table 280 to provide an adaptive bi-prediction ("ABP") coefficient index to the lookup table 280. A first output of the lookup table 280 provides a weighting factor and is connected in signal communication with a second input of the multiplier 270. The output of the multiplier 270 is connected in signal communication with a first input of a summing junction 290. A second output of the lookup table 280 provides an offset and is connected in signal communication with a second input of the summing junction 290. The output of the summing junction 290 is connected in signal communication with a second input of the summing junction 240.
Turning now to Figure 3, a video decoder with reference picture weighting is indicated generally by the reference numeral 300. The video decoder 300 includes a VLD 310 connected in signal communication with an inverse quantizer 320. The inverse quantizer 320 is connected in signal communication with an inverse transformer 330. The inverse transformer 330 is connected in signal communication with a first input of a summing junction 340, where the output of the summing junction 340 provides the output of the video decoder 300. The output of the summing junction 340 is connected in signal communication with a reference picture store 350. The reference picture store 350 is connected in signal communication with a motion compensator 360, which is connected in signal communication with a first input of a multiplier 370.
In addition, the VLD 310 is further connected in signal communication with a reference picture weighting factor lookup table 380 to provide a reference picture index to the lookup table 380. A first output of the lookup table 380 provides a weighting factor and is connected in signal communication with a second input of the multiplier 370. The output of the multiplier 370 is connected in signal communication with a first input of a summing junction 390. A second output of the lookup table 380 provides an offset and is connected in signal communication with a second input of the summing junction 390. The output of the summing junction 390 is connected in signal communication with a second input of the summing junction 340.
As shown in Figure 4, a standard video encoder is indicated generally by the reference numeral 400. An input to the encoder 400 is connected in signal communication with a non-inverting input of a summing junction 410. The output of the summing junction 410 is connected in signal communication with a block transformer 420. The transformer 420 is connected in signal communication with a quantizer 430. The output of the quantizer 430 is connected in signal communication with a variable length coder ("VLC") 440, where the output of the VLC 440 is an externally available output of the encoder 400.
The output of the quantizer 430 is further connected in signal communication with an inverse quantizer 450. The inverse quantizer 450 is connected in signal communication with an inverse block transformer 460, which, in turn, is connected in signal communication with a reference picture store 470. A first output of the reference picture store 470 is connected in signal communication with a first input of a motion estimator 480. The input to the encoder 400 is further connected in signal communication with a second input of the motion estimator 480. The output of the motion estimator 480 is connected in signal communication with a first input of a motion compensator 490. A second output of the reference picture store 470 is connected in signal communication with a second input of the motion compensator 490. The output of the motion compensator 490 is connected in signal communication with an inverting input of the summing junction 410.
Turning to Figure 5, a video encoder with reference picture weighting is indicated generally by the reference numeral 500. An input to the encoder 500 is connected in signal communication with a non-inverting input of a summing junction 510. The output of the summing junction 510 is connected in signal communication with a block transformer 520. The transformer 520 is connected in signal communication with a quantizer 530. The output of the quantizer 530 is connected in signal communication with a VLC 540, where the output of the VLC 540 is an externally available output of the encoder 500.
The output of the quantizer 530 is further connected in signal communication with an inverse quantizer 550. The inverse quantizer 550 is connected in signal communication with an inverse block transformer 560, which, in turn, is connected in signal communication with a reference picture store 570. A first output of the reference picture store 570 is connected in signal communication with a first input of a reference picture weighting factor assignor 572. The input to the encoder 500 is further connected in signal communication with a second input of the reference picture weighting factor assignor 572. The output of the reference picture weighting factor assignor 572, which is indicative of a weighting factor, is connected in signal communication with a first input of a motion estimator 580. A second output of the reference picture store 570 is connected in signal communication with a second input of the motion estimator 580.
The input to the encoder 500 is further connected in signal communication with a third input of the motion estimator 580. The output of the motion estimator 580, which is indicative of motion vectors, is connected in signal communication with a first input of a motion compensator 590. A third output of the reference picture store 570 is connected in signal communication with a second input of the motion compensator 590. The output of the motion compensator 590, which is indicative of a motion compensated reference picture, is connected in signal communication with a first input of a multiplier 592. The output of the reference picture weighting factor assignor 572, which is indicative of a weighting factor, is connected in signal communication with a second input of the multiplier 592. The output of the multiplier 592 is connected in signal communication with an inverting input of the summing junction 510.
Turning now to Figure 6, an exemplary process for decoding video signal data for an image block is indicated generally by the reference numeral 600. The process includes a start block 610 that passes control to an input block 612. The input block 612 receives the compressed image block data and passes control to an input block 614. The input block 614 receives at least one reference picture index with the data for the image block, each reference picture index corresponding to a particular reference picture. The input block 614 passes control to a function block 616, which determines a weighting factor corresponding to each of the received reference picture indices and passes control to an optional function block 617. The optional function block 617 determines an offset corresponding to each of the received reference picture indices and passes control to a function block 618. The function block 618 retrieves a reference picture corresponding to each of the received reference picture indices and passes control to a function block 620. The function block 620, in turn, motion compensates the retrieved reference picture and passes control to a function block 622. The function block 622 multiplies the motion compensated reference picture by the corresponding weighting factor and passes control to an optional function block 623. The optional function block 623 adds the corresponding offset to the motion compensated reference picture and passes control to a function block 624. The function block 624, in turn, forms a weighted motion compensated reference picture and passes control to an end block 626.
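The decoding steps of Figure 6 can be sketched as follows. This is a toy illustration, not the decoder of the embodiment: pictures are one-dimensional lists of samples, motion compensation is a simple shift that repeats the edge sample, and the names `motion_compensate`, `decode_prediction`, `ref_store`, `weights`, and `offsets` are hypothetical. The block numbers in the comments refer to Figure 6.

```python
def motion_compensate(reference, motion_vector):
    """Toy 1-D motion compensation: shift samples, repeating the edge sample."""
    n = len(reference)
    return [reference[min(max(i + motion_vector, 0), n - 1)] for i in range(n)]


def decode_prediction(ref_idx, motion_vector, ref_store, weights, offsets=None):
    """Form the weighted motion compensated reference for one block."""
    weight = weights[ref_idx]                                   # block 616
    offset = offsets[ref_idx] if offsets else 0                 # optional block 617
    reference = ref_store[ref_idx]                              # block 618
    compensated = motion_compensate(reference, motion_vector)   # block 620
    weighted = [weight * s for s in compensated]                # block 622
    return [s + offset for s in weighted]                       # blocks 623 and 624


# Hypothetical stored reference pictures and per-index weights and offsets.
ref_store = {0: [10, 20, 30, 40], 1: [50, 60, 70, 80]}
weights = {0: 1.0, 1: 0.5}
offsets = {0: 0, 1: 2}

prediction = decode_prediction(1, 1, ref_store, weights, offsets)
```

The only index the block needs is the reference picture index; the weight and offset follow from it by lookup.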
Turning now to Figure 7, an exemplary process for encoding video signal data for an image block is indicated generally by the reference numeral 700. The process includes a start block 710 that passes control to an input block 712. The input block 712 receives substantially uncompressed image block data and passes control to a function block 714. The function block 714 assigns a weighting factor for the image block corresponding to a particular reference picture having a corresponding index. The function block 714 passes control to an optional function block 715. The optional function block 715 assigns an offset for the image block corresponding to the particular reference picture having the corresponding index. The optional function block 715 passes control to a function block 716, which computes motion vectors corresponding to the difference between the image block and the particular reference picture and passes control to a function block 718. The function block 718 motion compensates the particular reference picture in correspondence with the motion vectors and passes control to a function block 720. The function block 720, in turn, multiplies the motion compensated reference picture by the assigned weighting factor to form a weighted motion compensated reference picture and passes control to an optional function block 721. The optional function block 721, in turn, adds the assigned offset to the motion compensated reference picture to form the weighted motion compensated reference picture and passes control to a function block 722. The function block 722 subtracts the weighted motion compensated reference picture from the substantially uncompressed image block and passes control to a function block 724. The function block 724, in turn, encodes a signal using the difference between the substantially uncompressed image block and the weighted motion compensated reference picture, together with the corresponding index of the particular reference picture, and passes control to an end block 726.
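The encoding steps of Figure 7 can be sketched in the same toy one-dimensional setting. Several parts are assumptions and are not described in this text: the weight is estimated as a ratio of mean sample values, the motion search is an exhaustive search over a small range, and the search compares the block against the already weighted reference, which is one possible way to handle the dependence between weights and motion vectors. All function names are hypothetical.

```python
def estimate_weight(block, reference):
    """Assumed heuristic: ratio of mean sample values (not from the source)."""
    ref_mean = sum(reference) / len(reference)
    return (sum(block) / len(block)) / ref_mean if ref_mean else 1.0


def shift(reference, mv):
    """Toy 1-D motion compensation that repeats the edge sample."""
    n = len(reference)
    return [reference[min(max(i + mv, 0), n - 1)] for i in range(n)]


def predict(reference, mv, weight, offset):
    """Motion compensate, then apply weight and offset (blocks 718 to 721)."""
    return [weight * s + offset for s in shift(reference, mv)]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def encode_block(block, reference, ref_idx, offset=0, search_range=2):
    weight = estimate_weight(block, reference)                   # block 714
    # Block 716: exhaustive search, here against the weighted reference
    # (an assumption of this sketch).
    mv = min(range(-search_range, search_range + 1),
             key=lambda v: sad(block, predict(reference, v, weight, offset)))
    prediction = predict(reference, mv, weight, offset)          # blocks 718-721
    residual = [x - p for x, p in zip(block, prediction)]        # block 722
    return {"ref_idx": ref_idx, "mv": mv, "residual": residual}  # block 724


# A block that is a half-brightness copy of the reference, as in a fade.
result = encode_block([10, 20, 30, 40], [20, 40, 60, 80], ref_idx=0)
```

In this example the estimated weight is 0.5, the best motion vector is 0, and the residual is all zeros, whereas an unweighted prediction would leave a large residual to encode.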
In the present exemplary embodiment, for each coded picture or slice, a weighting factor is associated with each allowable reference picture with respect to which blocks of the current picture can be encoded. When each individual block in the current picture is encoded or decoded, the weighting factor(s) and offset(s) that correspond to its reference picture indices are applied to the reference prediction to form a weighted predictor. All blocks in the slice that are coded with respect to the same reference picture apply the same weighting factor to the reference picture prediction.
Whether adaptive weighting is used when coding a picture can be indicated in the picture parameter set or the sequence parameter set, or in the slice or picture header. For each slice or picture that uses adaptive weighting, a weighting factor may be transmitted for each of the allowable reference pictures that may be used for encoding that slice or picture. The number of allowable reference pictures is transmitted in the slice header. For example, if three reference pictures can be used to encode the current slice, up to three weighting factors are transmitted, and each is associated with the reference picture having the same index.
If no weighting factors are transmitted, default weights are used. In one embodiment of the invention, default weights of (1/2, 1/2) are used when no weighting factors are transmitted. The weighting factors may be transmitted using either fixed or variable length codes.
Unlike typical systems, each weighting factor that is transmitted with each slice, block, or picture corresponds to a particular reference picture index. Previously, any set of weighting factors transmitted with each slice or picture was not associated with any particular reference picture. Instead, an adaptive bi-prediction weighting index was transmitted for each motion block or 8x8 region to select which of the weighting factors from the transmitted set was to be applied to that particular motion block or 8x8 region.
In the present embodiment, the weighting factor index for each motion block or 8x8 region is not explicitly transmitted. Instead, the weighting factor that is associated with the transmitted reference picture index is used. This greatly reduces the amount of overhead in the transmitted bitstream that is needed to allow adaptive weighting of reference pictures.
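A minimal sketch of this association follows. The table layout, the sample values, and the names `slice_table` and `weights_for` are hypothetical; the point is only that a block carries no weighting factor index of its own.

```python
# Hypothetical slice-level table: entry i belongs to reference picture index i.
slice_table = [
    {"weight": 1.0, "offset": 0},
    {"weight": 0.5, "offset": 3},
    {"weight": 0.25, "offset": 6},
]

# Each block carries only the reference picture index it needs anyway.
blocks = [{"ref_idx": 1}, {"ref_idx": 0}, {"ref_idx": 1}]


def weights_for(block, table):
    """Look up the weight and offset implied by the block's reference index."""
    entry = table[block["ref_idx"]]
    return entry["weight"], entry["offset"]


per_block = [weights_for(b, slice_table) for b in blocks]
```

Blocks that share a reference picture index necessarily receive the same weight and offset.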
This system and technique may be applied either to predictive "P" pictures, which are encoded with a single predictor, or to bi-predictive "B" pictures, which are encoded with two predictors. The decoding processes, which are present in both the encoder and the decoder, are described below for the P and B picture cases. Alternatively, this technique may also be applied to coding systems that use concepts similar to I, B, and P pictures.
The same weighting factors can be used for single directional prediction in B pictures and for bi-directional prediction in B pictures. When a single predictor is used for a macroblock, in P pictures or for single directional prediction in B pictures, a single reference picture index is transmitted for the block. After the decoding process step of motion compensation produces a predictor, the weighting factor is applied to the predictor. The weighted predictor is then added to the coded residual, and the sum is clipped to form the decoded picture. For blocks in P pictures, or for blocks in B pictures that use only list 0 prediction, the weighted predictor is formed as:
Pred=W0*Pred0+D0 (1)
where W0 is the weighting factor associated with the list 0 reference picture, D0 is the offset associated with the list 0 reference picture, and Pred0 is the motion compensated prediction block from the list 0 reference picture.
For blocks in B pictures that use only list 1 prediction, the weighted predictor is formed as:
Pred=W1*Pred1+D1 (2)
where W1 is the weighting factor associated with the list 1 reference picture, D1 is the offset associated with the list 1 reference picture, and Pred1 is the motion compensated prediction block from the list 1 reference picture.
The weighted predictors may be clipped to guarantee that the resulting values are within the allowable range of pixel values, typically 0 to 255. The precision of the multiplication in the weighting formulas may be limited to any predetermined number of bits of resolution.
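Equations (1) and (2), together with clipping and a limited-precision multiplication, can be sketched as follows. The rounding to integer samples, the 6-bit fractional precision, the rounding term, and the function names are assumptions chosen for illustration; the text only states that the precision may be limited to a predetermined number of bits.

```python
def clip(value, low=0, high=255):
    """Keep a sample inside the allowable pixel range."""
    return max(low, min(high, value))


def weighted_pred_single(pred, weight, offset):
    """Equations (1) and (2): Pred = W * Pred + D, clipped per sample."""
    return [clip(round(weight * s + offset)) for s in pred]


def weighted_pred_fixed(pred, weight_int, offset, shift_bits=6):
    """Same formula with the weight held as an integer scaled by 2**shift_bits.

    The rounding term and the default of 6 fractional bits are assumptions.
    """
    half = 1 << (shift_bits - 1)
    return [clip(((weight_int * s + half) >> shift_bits) + offset) for s in pred]


# A weight of 1.5 is represented as 96 when scaled by 2**6.
floating = weighted_pred_single([100, 200], 1.5, -10)
fixed = weighted_pred_fixed([100, 200], 96, -10)
```

In both variants the second sample exceeds 255 before clipping and is limited to 255.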
In the bi-predictive case, a reference picture index is transmitted for each of the two predictors. Motion compensation is performed to form the two predictors. Each predictor uses the weighting factor associated with its reference picture index to form two weighted predictors. The two weighted predictors are then averaged together to form an averaged predictor, which is then added to the coded residual.
For blocks in B pictures that use both list 0 and list 1 predictions, the weighted predictor is formed as:
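Pred=(P0*Pred0+D0+P1*Pred1+D1)/2 (3)
where P0 and P1 are the weighting factors associated with the list 0 and list 1 reference pictures (written W0 and W1 in equations (1) and (2)), and D0 and D1 are the corresponding offsets.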
Clipping may be applied to the weighted predictor, or to any of the intermediate values in its calculation, to guarantee that the resulting values are within the allowable range of pixel values, typically 0 to 255.
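Equation (3) and the subsequent addition of the coded residual can be sketched as follows. The function names, the sample values, and the choice to clip once after averaging and once after adding the residual are assumptions; the text allows clipping at any intermediate stage.

```python
def clip(value, low=0, high=255):
    """Keep a sample inside the allowable pixel range."""
    return max(low, min(high, value))


def weighted_bipred_avg(pred0, pred1, w0, d0, w1, d1):
    """Equation (3): average of the two weighted, offset predictors."""
    return [clip((w0 * a + d0 + w1 * b + d1) / 2) for a, b in zip(pred0, pred1)]


def reconstruct(prediction, residual):
    """Add the coded residual to the averaged predictor and clip the sum."""
    return [clip(p + r) for p, r in zip(prediction, residual)]


# Hypothetical list 0 and list 1 predictions with weights 1.0 and 1.5.
averaged = weighted_bipred_avg([100, 250], [120, 250], 1.0, 0, 1.5, 0)
decoded = reconstruct(averaged, [-5, 10])
```

Here the second sample averages to 312.5 before clipping, so it is limited to 255 both in the predictor and in the reconstructed block.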
Thus, a weighting factor is applied to the reference picture prediction of a video compression encoder and decoder that use multiple reference pictures. The weighting factor adapts for individual motion blocks within a picture, based on the reference picture index that is used for that motion block. Because the reference picture index is already transmitted in the compressed video bitstream, the additional overhead needed to adapt the weighting factor on a motion block basis is dramatically reduced. All motion blocks that are coded with respect to the same reference picture apply the same weighting factor to the reference picture prediction.
These and other features and advantages of the present invention may be readily ascertained by one of ordinary skill in the art based on the teachings herein. It is to be understood that the teachings of the present invention may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof.
Most preferably, the teachings of the present invention are implemented as a combination of hardware and software. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units ("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be part of the microinstruction code, part of the application program, or any combination thereof, and may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform, such as an additional data storage unit and a printing unit.
It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending on the manner in which the present invention is programmed. Given the teachings herein, one of ordinary skill in the art will be able to contemplate these and similar implementations or configurations of the present invention.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present invention is not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the art without departing from the scope or spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as set forth in the appended claims.