Detailed Description of Embodiments
The principle and spirit of the present invention are described below with reference to several illustrative embodiments and the accompanying drawings. It should be understood that these embodiments are provided only to enable those skilled in the art to better understand and implement the present invention, and do not limit the scope of the invention in any way.
In this disclosure, the term "encoding and decoding" refers to the encoding performed at an encoder and the decoding performed at a decoder. Similarly, the term "codec" refers to an encoder, a decoder, or a combined codec. The terms codec, encoder, and decoder all refer to particular machines designed to encode and/or decode image or video data in a manner consistent with this disclosure.
It should be understood that, although the various embodiments of the present invention are described below mainly in terms of the encoding performed at an encoder, the inverse of the processes described can also be applied to the decoding performed at a decoder.
Modern video coding techniques generally divide the current frame to be encoded into non-overlapping coding units. Such a coding unit may be a macroblock (MB) in AVC/H.264 or AVS1, or a coding unit (CU) in HEVC/H.265 or AVS2. To adapt to the specific video content, a coding unit can be partitioned further; each resulting partition is referred to herein as a coding sub-unit.
Modern video coding techniques also employ intra prediction, exploiting the spatial and temporal redundancy of the video signal to compress information: when the current coding unit or sub-unit is compressed, the already-encoded pixel information surrounding it is used to predict the pixel values within it. The prediction process generally applies one of several predefined prediction algorithms to the neighboring pixels to generate a predicted pixel block for the current coding unit/sub-unit. The encoder selects the most efficient prediction mode based on an evaluation of rate-distortion performance.
AVC/H.264 is an established video compression standard that uses block-based transform coding. In AVC/H.264, an image is divided into macroblocks (MBs) of 16 × 16 pixels. Each MB is often further divided into smaller blocks. Blocks equal to or smaller than an MB are predicted using intra-picture or inter-picture prediction, and a spatial transform together with quantization is applied to the prediction residual. The quantized residual transform coefficients are usually encoded using an entropy coding method (for example, variable-length coding or arithmetic coding).
The international video coding standard HEVC/H.265, developed to succeed AVC/H.264, extends the transform block size up to 64 × 64 pixels to benefit high-definition video coding, and also divides images and video frames into coding units and prediction units.
As noted above, both the AVC/H.264 and HEVC/H.265 standards use intra prediction. The related encoder-side operations include: 1) encoding the index of the prediction mode used; 2) subtracting the predicted pixel block from the current coding unit/sub-unit to obtain a residual block; and 3) transforming, quantizing, and entropy-coding the residual block. The corresponding decoder-side operations include: 1) decoding the index of the prediction mode, obtaining the prediction mode from the index, and computing the corresponding intra prediction block; 2) entropy-decoding the transform coefficients, then applying inverse quantization and inverse transformation to obtain the residual block; and 3) adding the prediction block and the residual block to obtain the reconstructed pixel block.
The neighboring pixels used to predict the current coding unit/sub-unit are obtained by decoding and reconstructing previously coded units/sub-units. Because information is lost during compression, a predicted pixel block generated from these lossy neighboring pixels also carries that loss. To reduce its impact, some coding algorithms (including HEVC/H.265) improve compression performance by filtering the pixels that are used to predict the current coding unit/sub-unit.
The prediction and filtering process proposed by the present invention is described below in connection with the HEVC/H.265 international standard. Unless otherwise specified, terms used herein such as coding tree unit (CTU), largest coding unit (LCU), coding unit (CU), prediction unit (PU), and transform unit (TU) retain their definitions and descriptions from the HEVC/H.265 standard. It should be understood, however, that although the coding and decoding methods described in the embodiments can be regarded as an improvement to one detail of the HEVC/H.265 international standard, they can also be applied to other high-efficiency video coding implementations independent of HEVC/H.265, for example to improve intra pixel prediction in the AVC/H.264 international video coding standard.
For a description of the HEVC/H.265 international standard, reference may be made to the H.265 document published by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), entitled "High efficiency video coding", available at http://www.itu.int/rec/T-REC-H.265-201304-S. For completeness of this disclosure, the entire content of that document is incorporated herein by reference. No attempt is made here to describe the details of the HEVC/H.265 international standard; those skilled in the art will know how to find further details in the published standard.
HEVC/H.265 is a block-based hybrid spatial and temporal predictive coding method. In HEVC/H.265, an input image is first divided into square largest coding units (LCUs), also called coding tree units (CTUs), as shown in Fig. 1. Unlike the H.264 video coding standard, in which the basic coding unit is a macroblock of 16 × 16 pixels, a CTU in HEVC can be as large as 64 × 64 pixels. An LCU can be divided into four square coding units (CUs), each one quarter the size of the LCU; alternatively, an LCU may be left undivided and treated directly as a single CU, depending on the content of the input image in the region covered by the LCU. Each CU can be further divided into four smaller CUs, each one quarter the size of the original CU. The splitting process can be repeated until a certain criterion is met. Fig. 2 shows an example of an LCU divided into CUs. In general, for HEVC/H.265, the smallest CU used (for example, a leaf node as described in more detail below) is regarded as a basic coding CU.
How an LCU is divided into CUs can be represented by a quadtree. At each node of the quadtree, a split flag (SF) is set to 1 if the node is further divided into child nodes; otherwise, the flag is set to 0. For example, the LCU partitioning of Fig. 2 can be represented by the quadtree of Fig. 3.
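The split-flag representation can be illustrated with a short sketch. The function name `split_flags`, the depth-first flag order, and the `should_split` callback are assumptions made for illustration; the sketch also emits a 0 flag for blocks at the minimum size, which a real bitstream syntax may leave implicit.

```python
def split_flags(size, min_size, should_split, x=0, y=0):
    """Return quadtree split flags (SF) in depth-first order for a square block.

    should_split(x, y, size) is a hypothetical decision callback; a real
    encoder would decide based on the image content of the block.
    """
    if size > min_size and should_split(x, y, size):
        flags = [1]  # SF = 1: this node is divided into four children
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                flags += split_flags(half, min_size, should_split, x + dx, y + dy)
        return flags
    return [0]  # SF = 0: leaf node, i.e. a CU that is not divided further


# Example: split the 64x64 LCU, then split only its top-left 32x32 CU.
rule = lambda x, y, size: size == 64 or (size == 32 and x == 0 and y == 0)
flags = split_flags(64, 8, rule)
print(flags)  # [1, 1, 0, 0, 0, 0, 0, 0, 0]
```

The first flag describes the LCU itself, the next five describe the top-left 32 × 32 CU and its four 16 × 16 children, and the last three describe the remaining undivided 32 × 32 CUs.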
A node that is not split (for example, a terminal or leaf node of a given quadtree) may include one or more prediction units (PUs). Typically, a PU represents all or part of the corresponding CU and includes data for retrieving a reference sample for the PU for the purpose of performing prediction for the CU. Thus, at each leaf node of the quadtree, a 2N × 2N CU (for example, the CU shown in the upper-left corner of Fig. 2) can take one of four possible partition modes (N × N, N × 2N, 2N × N, 2N × 2N), as shown in Fig. 4. Although illustrated for a 2N × 2N CU, other PUs with different sizes and corresponding modes (for example, square or rectangular) can be used, as shown in Fig. 5.
Reference is now made to Fig. 5, which shows different coding structures for PUs. For intra coding, PUs of size 2N × 2N and N × N can be used. For inter coding, PUs of size 2N × 2N, 2N × N, N × 2N, and N × N can be used. If a PU is coded in intra mode, each PU can have its own spatial prediction direction. If a PU is coded in inter mode, each PU can have its own motion vector and associated reference picture.
A CU can be intra-prediction coded spatially. If a CU is coded in intra mode, each PU of the CU can have its own spatial prediction direction. Typically, in intra-prediction coding, there is a high degree of spatial correlation between adjacent blocks within a frame. A current pixel block can therefore be predicted from nearby encoded or reconstructed pixel blocks, yielding an intra prediction. In some embodiments, the prediction can be formed as a weighted average of previously encoded samples located above or to the left of the current block. The encoder can select the mode that minimizes the difference between the original and the prediction as well as the cost, and signal this selection in the control data.
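A minimal sketch of this kind of mode selection follows. It assumes three simplified modes (DC, vertical, horizontal) and uses the sum of absolute differences (SAD) as the only cost; both choices are illustrative assumptions, and a real encoder would also account for the bits needed to signal the mode.

```python
def predict(mode, top, left):
    """Build an n x n prediction block from reconstructed neighbours.

    top  : n samples from the row directly above the block
    left : n samples from the column directly to its left
    """
    n = len(top)
    if mode == "vertical":    # copy the row above downwards
        return [list(top) for _ in range(n)]
    if mode == "horizontal":  # copy the left column rightwards
        return [[left[r]] * n for r in range(n)]
    if mode == "dc":          # rounded mean of all neighbours
        dc = (sum(top) + sum(left) + n) // (2 * n)
        return [[dc] * n for _ in range(n)]
    raise ValueError(f"unknown mode: {mode}")


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_mode(block, top, left, modes=("dc", "vertical", "horizontal")):
    """Pick the mode whose prediction is closest to the original block."""
    return min(modes, key=lambda m: sad(block, predict(m, top, left)))


top = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]  # columns continue the row above
print(best_mode(block, top, left))  # vertical
```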
After intra-prediction or inter-prediction coding has produced prediction data and residual data, and after any transform (such as the 4 × 4 or 8 × 8 integer transform used in H.264/AVC, or a discrete cosine transform (DCT)) has produced transform coefficients, the transform coefficients can be quantized. Quantization generally refers to a process in which transform coefficients are quantized to possibly reduce the amount of data used to represent the coefficients, for example by converting high-precision transform coefficients into a limited number of possible values.
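The many-to-one nature of quantization can be shown with a uniform scalar quantizer. This is a simplified sketch: the function names and the fixed step size are assumptions, and the quantizers in H.264/AVC and HEVC/H.265 use standard-specific scaling rather than a plain division.

```python
def quantize(coeffs, step):
    """Map each coefficient to an integer level (round half away from zero)."""
    levels = []
    for c in coeffs:
        magnitude = (abs(c) + step // 2) // step
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def dequantize(levels, step):
    """Map levels back to representative coefficient values."""
    return [level * step for level in levels]


coeffs = [103, -47, 12, 3, -2, 0]
levels = quantize(coeffs, step=10)
print(levels)                  # [10, -5, 1, 0, 0, 0]
print(dequantize(levels, 10))  # [100, -50, 10, 0, 0, 0]
```

The small coefficients collapse to zero and the large ones are only approximated, which is where both the data reduction and the loss come from.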
Each CU can also be divided into transform units (TUs). In some embodiments, a block transform is performed on one or more TUs to decorrelate the pixels in the block and compact the block energy into the low-order coefficients of the transform block. Modern video codecs generally define transforms of multiple sizes. The number and sizes of the transforms a video codec supports are influenced by the practical application as well as by considerations such as implementation cost. A TU can support transforms from 4 × 4 to 32 × 32, and the maximum TU size (that is, the maximum transform size) can be preset in the coding/decoding system.
In some embodiments, an 8 × 8 or 4 × 4 transform can be applied. In other embodiments, a set of block transforms of different sizes can be applied to a CU, as shown in Fig. 6, where the block on the left is a CU divided into PUs and the block on the right is the associated set of TUs. The size and position of each block transform within the CU are described by a separate quadtree. Fig. 7 shows the quadtree representation of the TUs for the CU in the example of Fig. 6. As will be understood, CU, PU, and TU sizes can be N × N or M × N, where N ≠ M and N and M are powers of 2, such as 4, 8, 16, 32, or 64.
The TUs and PUs of any given CU can serve different purposes. TUs are typically used for transform, quantization, and coding operations, whereas PUs are typically used for spatial and temporal prediction. For a given CU, there need not be a direct relationship between the number of PUs and the number of TUs.
In an image or video, a pixel block may comprise a block of pixel data in the pixel domain or a block of transform coefficients in the transform domain, for example after a transform such as a DCT, an integer transform, a wavelet transform, or a conceptually similar transform has been applied to the residual data for a given pixel data block, where the residual data represents the pixel differences between the pixel data of the block and the prediction data generated for the block. In some cases, a pixel block may comprise a block of quantized transform coefficients in the transform domain, where, after a transform is applied to the residual data for a given video data block, the resulting transform coefficients are also quantized. In video coding, quantization is the step that introduces loss, so that a trade-off between bit rate and reconstruction quality can be established.
Partitioning of pixel blocks serves an important purpose in block-based video coding. Using smaller blocks to code video data can yield better prediction for regions of a video frame that contain a high level of detail, and can therefore reduce the resulting error (that is, the deviation of the prediction data from the source video data) represented by the residual data. Typically, prediction exploits the spatial or temporal redundancy in a video sequence by modeling the correlation between sample blocks of various sizes, so that only the small difference between the actual and the predicted signal needs to be encoded. The prediction for the current block is created from samples that have already been encoded. Although it can reduce the residual data, such partitioning may require additional syntax information to indicate how the smaller blocks are partitioned relative to the video frame, and may thereby increase the coded video bit rate. Accordingly, in some techniques, block partitioning may depend on balancing the resulting increase in the bit rate of the coded video data caused by the additional syntax information against the desired reduction in residual data.
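One common way to express this balance is a Lagrangian cost J = D + λR, where D is the distortion, R is the number of bits, and λ weights rate against distortion. The text above does not prescribe this formula; the sketch below assumes it for illustration, and the distortion and bit figures are made-up example values.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits


def choose_partition(candidates, lam):
    """candidates maps a partition name to a (distortion, bits) pair."""
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))


# Hypothetical figures: splitting lowers distortion but costs extra syntax bits.
candidates = {"no_split": (400, 20), "split": (150, 60)}
print(choose_partition(candidates, lam=2))   # split     (270 vs 440)
print(choose_partition(candidates, lam=10))  # no_split  (600 vs 750)
```

With a small λ the reduction in residual error dominates and the block is split; with a large λ the extra syntax bits outweigh the gain and the block is left whole.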
Fig. 8 is a diagram of encoding and/or decoding a prediction unit according to the HEVC/H.265 standard. As shown in Fig. 8, given a current PU, denoted x, a predicted PU, denoted x', can first be obtained by intra (or inter) prediction. The predicted PU x' can then be subtracted from the current PU x to produce a PU residual, denoted e. A CU residual, generated by grouping the PU residuals e associated with a CU, can then be transformed, one TU at a time, to produce the residual in the transform domain, denoted E. The transform can use, for example, a square or non-square block transform.
Then PU residual error, E can device module 118 quantize by quantifying, thus convert high accuracy conversion coefficient to a limited number of probable value.As will be appreciated, quantification damages operation, and quantification loss cannot recover usually.
The quantized coefficients can then be entropy-coded by entropy coding module 120 to produce the final compressed bits. It should be noted that, depending on the coding standard being implemented, the prediction, transform, and quantization described above can be performed on any block of video data, for example on a PU or TU of a CU, or on a macroblock.
To facilitate temporal and spatial prediction, the quantized transform coefficients E can also be inverse-quantized by inverse quantization module 122 to produce inverse-quantized transform coefficients E'. The inverse-quantized transform coefficients are then inverse-transformed by inverse transform module 124 to produce the reconstructed PU residual, denoted e'. The reconstructed PU residual e' is then added to the corresponding temporally or spatially predicted PU x' to form the reconstructed PU, denoted x''.
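The chain from x to x'' can be traced in a few lines. The sketch below is a simplification under stated assumptions: it works on a single row of samples, uses a floating-point orthonormal 1-D DCT in place of the 2-D integer transforms of the standard, and uses a plain uniform quantizer; the function names are illustrative only.

```python
import math


def dct(x):
    """Orthonormal 1-D DCT-II."""
    n = len(x)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i in range(n))
            for k in range(n)]


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * coeffs[k]
                * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for k in range(n))
            for i in range(n)]


def code_and_reconstruct(x, x_pred, step):
    e = [a - b for a, b in zip(x, x_pred)]         # residual e = x - x'
    E = dct(e)                                     # transform-domain residual E
    levels = [round(c / step) for c in E]          # quantization (lossy)
    E_rec = [level * step for level in levels]     # inverse quantization -> E'
    e_rec = idct(E_rec)                            # inverse transform -> e'
    x_rec = [p + r for p, r in zip(x_pred, e_rec)] # reconstruction x'' = x' + e'
    return levels, x_rec


x = [52, 55, 61, 66]
x_pred = [50, 50, 60, 60]
levels, x_rec = code_and_reconstruct(x, x_pred, step=4)
print(levels)
print([round(v, 2) for v in x_rec])
```

Because the transform is orthonormal, the reconstruction error in this sketch is bounded by the quantization step; a larger step gives fewer distinct levels to code but a reconstruction that is further from x.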
A deblocking filter ("DBF") operation can first be performed on the reconstructed PU x'' to reduce blocking artifacts. After the deblocking filter operation has been completed for the decoded image, a sample adaptive offset ("SAO") process can be performed conditionally; it compensates for the pixel-value offset between the reconstructed pixels and the original pixels. In some embodiments, both the DBF operation and the SAO process are implemented by an adaptive loop filter function, which can be performed conditionally on the reconstructed PU by loop filter module 126. In some embodiments, the adaptive loop filter function minimizes the coding distortion between the input and output images. In some embodiments, loop filter module 126 operates during the inter-picture prediction loop. If the reconstructed images are reference pictures, they can be stored in reference buffer 128 for future temporal prediction.
HEVC specifies two loop filters that are applied in order: the DBF first, followed by the SAO filter. The DBF is similar to the one used by MPEG-4 AVC/H.264, but has a simpler design and better support for parallel processing. In HEVC/H.265, the DBF applies only to an 8 × 8 sample grid, whereas in MPEG-4 AVC/H.264 the DBF applies to a 4 × 4 sample grid. The DBF uses an 8 × 8 sample grid because this causes no noticeable degradation and significantly improves parallel processing, since the DBF no longer causes cascading interactions with other operations. Another change is that HEVC allows only three DBF strengths, 0 to 2. HEVC also requires that the DBF first apply horizontal filtering for vertical edges to the image, and only afterwards apply vertical filtering for horizontal edges. This allows multiple parallel threads to be used for the DBF.
SAO filtering is applied after the DBF and is intended to allow better reconstruction of the original signal amplitudes, for example by using a look-up table containing parameters that are based on histogram analysis performed by the encoder. The SAO filter has two basic types: the edge offset ("EO") type and the band offset ("BO") type. One of the SAO types can be applied per CTB. The EO type has four subtypes corresponding to processing along four possible directions (for example, horizontal, vertical, 135 degrees, and 45 degrees). For a given EO subtype, the EO process operates by comparing the value of a pixel with two of its neighbors using one of four different gradient patterns. An offset is applied to the pixels in each of the four gradient patterns. For pixel values that are not in one of the gradient patterns, no offset is applied. The BO process is based directly on the sample amplitude, which is divided into 32 bands. An offset is applied to the pixels in 16 of the 32 bands, where a group of 16 bands corresponds to a BO subtype. The SAO filter process is designed to reduce distortion relative to the original signal by adding offsets to the sample values. It can increase edge sharpness and reduce ringing and impulse artifacts.
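The band offset idea can be sketched as follows. The sketch assumes 8-bit samples, so each of the 32 bands spans 8 amplitude values, and it takes the offsets as a plain dictionary from band index to offset; the function names and the example offsets are hypothetical, and the signalling of which bands carry offsets is not modeled.

```python
def band_index(sample, bit_depth=8):
    """Return which of the 32 equal-width amplitude bands a sample falls in."""
    return sample >> (bit_depth - 5)


def apply_band_offset(samples, offsets, bit_depth=8):
    """Add the offset of each sample's band, clipping to the valid range.

    offsets maps a band index (0..31) to an integer offset; bands that are
    absent from the mapping receive no offset.
    """
    max_value = (1 << bit_depth) - 1
    out = []
    for s in samples:
        shifted = s + offsets.get(band_index(s, bit_depth), 0)
        out.append(min(max_value, max(0, shifted)))
    return out


samples = [0, 7, 8, 100, 255]
offsets = {0: 2, 12: -3, 31: 4}  # hypothetical encoder-chosen offsets
print(apply_band_offset(samples, offsets))  # [2, 9, 8, 97, 255]
```

Samples 0 and 7 share band 0 and both move by +2, sample 8 lies in band 1 and is untouched, and sample 255 is clipped after its offset is added.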
Those skilled in the art will understand that the inverse of the process described above can be performed at the decoder.
The inventors have recognized that, in the processing of the prediction unit PU described above, the reconstructed prediction unit x'' (which is not the originally coded block) is obtained by interpolation and similar calculations on the reconstructed values of neighboring pixels. The inventors, however, wish to reduce the residual data further.
According to embodiments of the present invention, intra pixel prediction can be performed on a coding block of an image or video frame from its (reconstructed) neighboring pixels without any prior processing of those pixels, and, after the intra-predicted pixel block has been generated, the pixels in the prediction block are first filtered. Because the pixel block produced by intra prediction is filtered directly, the residual between the filtered prediction block and the current coding block can be made smaller.
For example, in the process shown in Fig. 8, a step of first applying filtering to the predicted PU x' can be added before the PU residual e is produced.
It should be noted that embodiments of the present invention are proposed for intra prediction, in which the neighboring pixels used are decoded and reconstructed values that have not undergone any processing, that is, they have not been filtered. Embodiments of the present invention are therefore not intended to apply to frame filtering, because the neighboring pixels used in frame filtering have already been filtered at least once.
Accordingly, an embodiment of the present invention provides an image and video coding method 90. Fig. 9 shows a flowchart of this image and video coding method. The method 90 comprises the steps of:
S910: dividing an original image or video frame into coding units;
S920: performing intra pixel prediction on a coding unit obtained by the division, to generate a predicted coding unit;
S930: filtering the predicted coding unit; and
S940: subtracting the filtered predicted coding unit from the coding unit obtained by the division, to generate a coding unit residual.
Thereafter, optionally, the filtered predicted coding unit and the residual can be encoded into the coded bitstream through operations such as transform, quantization, and entropy coding.
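Steps S910 to S940 can be walked through on a toy example. Everything specific in the sketch is an assumption made for illustration: the vertical prediction mode, the 3-tap [1, 2, 1]/4 row smoothing used as the prediction filter, and the sample values. The method itself does not fix any particular predictor or filter, and the smaller residual seen here holds for this contrived input rather than in general.

```python
def partition(frame, size):
    """S910: split a 2-D frame into non-overlapping size x size units."""
    h, w = len(frame), len(frame[0])
    return [[row[x:x + size] for row in frame[y:y + size]]
            for y in range(0, h, size) for x in range(0, w, size)]


def predict_vertical(top):
    """S920: intra prediction that copies the unfiltered row above the unit."""
    return [list(top) for _ in top]


def smooth_rows(block):
    """S930: filter the prediction with [1, 2, 1] / 4, replicating the edges."""
    out = []
    for row in block:
        p = [row[0]] + list(row) + [row[-1]]
        out.append([(p[i] + 2 * p[i + 1] + p[i + 2] + 2) >> 2
                    for i in range(len(row))])
    return out


def subtract(a, b):
    """S940: residual = original unit minus (filtered) prediction."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


frame = [[12, 20, 31, 38] for _ in range(4)]
unit = partition(frame, 4)[0]
top = [10, 20, 30, 40]  # reconstructed neighbours above the unit

pred = predict_vertical(top)
filtered = smooth_rows(pred)
residual = subtract(unit, filtered)
print(filtered[0])               # [13, 20, 30, 38]
print(residual[0])               # [-1, 0, 1, 0]
print(subtract(unit, pred)[0])   # [2, 0, 1, -2] without the filter
```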
In one embodiment, multiple filtering modes can be defined according to the statistical characteristics of the predicted pixels in the prediction block. For example, if the statistics of the predicted pixels in the current coding unit/sub-unit indicate a consistent, smooth block, a strict filtering criterion can be used, removing all high-frequency components of the prediction result as noise; conversely, if the statistics indicate a less smooth block, a relatively loose filtering criterion can be used, retaining high-frequency components to some degree. As another example, the noise variance can be estimated from a noise model, and a correspondingly tailored threshold can be designed for the filtering.
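A minimal sketch of statistics-driven mode selection is shown below. It assumes the sample variance as the statistic, a fixed threshold of 25, and two 3-tap low-pass kernels standing in for the "strict" and "loose" criteria; none of these specifics come from the text above, which leaves the statistic, the threshold, and the filters open.

```python
# Hypothetical kernels: the strong one smooths heavily, the weak one lightly.
TAPS = {"strong": (1, 2, 1), "weak": (1, 6, 1)}


def variance(block):
    """Population variance of all samples in a 2-D block."""
    values = [v for row in block for v in row]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def select_filter_mode(pred_block, threshold=25.0):
    """Smooth prediction -> strict filtering; textured -> loose filtering."""
    return "strong" if variance(pred_block) < threshold else "weak"


def filter_prediction(pred_block, threshold=25.0):
    """Filter each row with the kernel of the selected mode (edges replicated)."""
    mode = select_filter_mode(pred_block, threshold)
    a, b, c = TAPS[mode]
    norm = a + b + c
    out = []
    for row in pred_block:
        p = [row[0]] + list(row) + [row[-1]]
        out.append([(a * p[i] + b * p[i + 1] + c * p[i + 2] + norm // 2) // norm
                    for i in range(len(row))])
    return mode, out


print(filter_prediction([[50, 51], [50, 51]]))  # ('strong', [[50, 51], [50, 51]])
print(filter_prediction([[10, 90], [90, 10]]))  # ('weak', [[20, 80], [80, 20]])
```

Since the decision depends only on the prediction block, a decoder can repeat it without extra signalling, provided it uses the same statistic and threshold.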
In one embodiment, corresponding filtering modes can be defined according to the directional characteristics of the predicted pixels in the prediction block. The spatial prediction directions used for the current coding unit/sub-unit can include horizontal, vertical, 45-degree diagonal, 135-degree diagonal, planar, and so on. For example, spatial prediction can be implemented differently for luma PUs and chroma PUs.
In one embodiment, the filtering applied in connection with the unprocessed neighboring pixels can include linear or non-linear operations in a broad sense, for example sample adaptive offset (SAO). The filter can be designed to be complex or simple; the present invention places no limitation on this.
An embodiment of the present invention also provides an image and video coding apparatus 100. Fig. 10 shows a block diagram of this apparatus, which comprises:
a partitioning unit 1010, configured to divide an original image or video frame into coding units;
an intra prediction unit 1020, configured to perform intra pixel prediction on a coding unit obtained by the division, to generate a predicted coding unit;
a filtering unit 1030, configured to filter the predicted coding unit; and
a residual generation unit 1040, configured to subtract the filtered predicted coding unit from the coding unit obtained by the division, to generate a coding unit residual.
An embodiment of the present invention also provides an image and video decoding method 110, which is the inverse process corresponding to the image and video coding method 90. Fig. 11 shows a flowchart of this image and video decoding method. The method 110 comprises the steps of:
S1110: identifying, from a coded bitstream of image and video data, the coding units obtained by division and the coding unit residuals;
S1120: performing intra pixel prediction on a coding unit obtained by the division, to generate a predicted coding unit;
S1130: filtering the predicted coding unit; and
S1140: adding the filtered predicted coding unit and the coding unit residual, to obtain a reconstructed coding unit.
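The decoder-side steps S1120 to S1140 mirror the encoder. The sketch below assumes that the residual has already been parsed from the bitstream (S1110) and that the decoder uses the same prediction and the same prediction filter as the encoder; the vertical predictor, the [1, 2, 1]/4 smoothing filter, and the sample values are illustrative assumptions only.

```python
def predict_vertical(top):
    """S1120: intra prediction from the unfiltered reconstructed row above."""
    return [list(top) for _ in top]


def smooth_rows(block):
    """S1130: the same [1, 2, 1] / 4 prediction filter the encoder is assumed to use."""
    out = []
    for row in block:
        p = [row[0]] + list(row) + [row[-1]]
        out.append([(p[i] + 2 * p[i + 1] + p[i + 2] + 2) >> 2
                    for i in range(len(row))])
    return out


def decode_unit(top, residual, bit_depth=8):
    """S1140: add the residual to the filtered prediction and clip to range."""
    max_value = (1 << bit_depth) - 1
    filtered = smooth_rows(predict_vertical(top))
    return [[min(max_value, max(0, p + r)) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(filtered, residual)]


top = [10, 20, 30, 40]
residual = [[-1, 0, 1, 0] for _ in range(4)]  # as parsed from the bitstream
print(decode_unit(top, residual)[0])  # [12, 20, 31, 38]
```

If the decoder skipped the filter or used a different one, its prediction would differ from the encoder's and the added residual would no longer reproduce the intended samples.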
An embodiment of the present invention also provides an image and video decoding apparatus 120. Fig. 12 shows a block diagram of this apparatus, which comprises:
a decoding unit 1210, configured to identify, from a coded bitstream of image and video data, the coding units obtained by division and the coding unit residuals;
an intra prediction unit 1220, configured to perform intra pixel prediction on a coding unit obtained by the division, to generate a predicted coding unit;
a filtering unit 1230, configured to filter the predicted coding unit; and
a reconstruction unit 1240, configured to add the filtered predicted coding unit and the coding unit residual, to obtain a reconstructed coding unit.
It should be understood that a coding unit in the embodiments of the present invention can be an entire image or video frame, a macroblock (MB) under the AVC/H.264 standard, or a coding unit (CU), sub-CU, prediction unit (PU), or sub-PU under the HEVC/H.265 standard. Each coding unit can serve as an independently coded and decoded unit of an image or video frame.
It should be understood that the elements shown as functional blocks in the drawings can be implemented as hardware, software, or a combination thereof. Each block of the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
In addition, embodiments of the present invention can be employed in systems such as personal computers, smartphones, or tablet computers.
The description of the present invention has been presented for purposes of illustration and description, and is not intended to be exhaustive or to limit the invention to the form disclosed. Many modifications and variations will occur to those skilled in the art. All modifications and substitutions made without departing from the spirit of the present invention fall within the scope defined by the appended claims.