CN110971905B - Method, apparatus and storage medium for encoding and decoding video content - Google Patents


Info

Publication number
CN110971905B
CN110971905B (application CN201911127826.3A)
Authority
CN
China
Prior art keywords
color space
coding
residual
space conversion
blocks
Prior art date
Legal status
Active
Application number
CN201911127826.3A
Other languages
Chinese (zh)
Other versions
CN110971905A (en)
Inventor
修晓宇
贺玉文
C-M·蔡
叶琰
Current Assignee
InterDigital VC Holdings Inc
Original Assignee
Vid Scale Inc
Priority date
Filing date
Publication date
Application filed by Vid Scale Inc
Priority to CN201911127826.3A
Publication of CN110971905A
Application granted
Publication of CN110971905B
Status: Active
Anticipated expiration

Classifications

    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/44 Decoders specially adapted therefor, e.g. video decoders which are asymmetric with respect to the encoder
    • H04N 19/186 Adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
    • H04N 19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N 19/174 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a slice, e.g. a line of blocks or a group of blocks
    • H04N 19/176 Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N 19/46 Embedding additional information in the video signal during the compression process
    • H04N 21/8451 Structuring of content, e.g. decomposing content into time segments using Advanced Video Coding [AVC]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Discrete Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A system, method and apparatus for performing adaptive residual color space conversion are disclosed. A video bitstream may be received and a first flag may be determined based on the video bitstream. A residual may also be generated based on the video bitstream. The residual may be converted from the first color space to the second color space in response to the first flag.

Description

Method, apparatus and storage medium for encoding and decoding video content
This application is a divisional application of the Chinese patent application filed on March 14, 2015, with application number 201580014202.4 and entitled "System and method for enhancing RGB video coding".
Background
Screen content sharing applications have become increasingly popular as device and network capabilities have increased. Examples of popular screen content sharing applications include remote desktop applications, video conferencing applications, and mobile media presentation applications. Screen content may include numerous video and/or image elements having one or more dominant colors and/or sharp edges, including relatively sharp curves and/or text within such elements. While a variety of video compression devices and methods may be used to encode screen content and/or transmit such content to a receiver, such methods and devices may not fully capture one or more characteristics of the screen content. Such shortcomings can lead to degraded compression performance in reconstructed images or video content. In such implementations, the reconstructed image or video content may suffer from image or video quality problems. For example, curves and/or text may be blurred, distorted, or otherwise difficult to discern in the screen content.
Disclosure of Invention
A system, method and apparatus for encoding and decoding video content are disclosed. In an embodiment, the system and method may be implemented to perform adaptive residual color space conversion. A video bitstream may be received and a first flag may be determined based on the video bitstream. A residual may also be generated based on the video bitstream. The residual may be converted from a first color space to a second color space in response to a first flag.
In an embodiment, determining the first flag may include receiving the first flag at the coding unit level. The first flag may be received only if a second flag at the coding unit level indicates that there is at least one residual with a non-zero value in the coding unit. The conversion of the residual from the first color space to the second color space may be performed by applying a color space conversion matrix. The color space conversion matrix may correspond to an irreversible YCgCo-to-RGB conversion matrix that may be applied to lossy encoding. In another embodiment, the color space conversion matrix may correspond to a reversible YCgCo-to-RGB conversion matrix that may be applied to lossless coding. Converting the residual from the first color space to the second color space may include applying a matrix of scaling factors, and where the color space conversion matrix is not normalized, each row of the matrix of scaling factors may include a scaling factor corresponding to the norm of the corresponding row of the non-normalized color space conversion matrix. The color space conversion matrix may include at least one fixed-point precision coefficient. A second flag based on the video bitstream may be signaled at the sequence level, picture level, or slice level, and may indicate whether the conversion process of the residual from the first color space to the second color space is enabled for the sequence, picture, or slice, respectively.
In an embodiment, the residuals of a coding unit may be encoded in the first color space. The best mode for encoding such residuals may be determined based on the cost of encoding them in each of the available color spaces. A flag may be determined based on the determined best mode and may be included in the output bitstream. These and other aspects of the disclosed subject matter are described below.
Drawings
FIG. 1 is a block diagram schematically illustrating an exemplary screen content sharing system according to one embodiment;
FIG. 2 is a block diagram schematically illustrating an exemplary video coding system according to one embodiment;
FIG. 3 is a block diagram schematically illustrating an exemplary video decoding system according to one embodiment;
FIG. 4 is a schematic illustration of an exemplary prediction unit mode according to an embodiment;
FIG. 5 is a schematic illustration of an exemplary color image according to an embodiment;
FIG. 6 is a schematic illustration of an exemplary method for implementing an embodiment of the disclosed subject matter;
FIG. 7 is a schematic illustration of another exemplary method for implementing an embodiment of the disclosed subject matter;
FIG. 8 is a block diagram schematically illustrating an exemplary video coding system according to one embodiment;
FIG. 9 is a block diagram schematically illustrating an exemplary video decoding system according to one embodiment;
FIG. 10 is a block diagram schematically illustrating an exemplary subdivision of a prediction unit into transform units, according to an embodiment;
FIG. 11A is a system diagram of an example communication system in which the disclosed subject matter may be implemented;
FIG. 11B is a system diagram of an example wireless transmit/receive unit (WTRU) that may be used in the communication system illustrated in FIG. 11A;
FIG. 11C is a system diagram of an example radio access network and an example core network that may be used in the communication system illustrated in FIG. 11A;
FIG. 11D is a system diagram of another example radio access network and an example core network that may be used in the communication system illustrated in FIG. 11A;
FIG. 11E is a system diagram of another example radio access network and an example core network that may be used in the communication system illustrated in FIG. 11A.
Detailed Description
A detailed description of exemplary embodiments will now be provided with reference to the accompanying drawings. While this description provides detailed examples of possible implementations, it should be noted that these details are intended to be exemplary and in no way limit the scope of this application.
As more people share device content using, for example, media presentation and remote desktop applications, screen content compression methods have become important. In some embodiments, the display capabilities of mobile devices have been enhanced to high-definition or ultra-high-definition resolutions. Video coding tools and transforms, such as block coding modes, may not be optimized for higher-definition screen content coding. Such tools may increase the bandwidth used to transmit screen content in content sharing applications.
Fig. 1 shows a block diagram of an exemplary screen content sharing system 191. The system 191 may include a receiver 192, a decoder 194, and a display 198 (which may also be referred to as a "renderer"). Receiver 192 may provide an input bitstream 193 to decoder 194, and decoder 194 may decode the bitstream to generate decoded pictures 195 that may be provided to one or more display picture buffers 196. Display picture buffer 196 may provide decoded pictures 197 to display 198 for display on one or more displays of the device.
Fig. 2 schematically illustrates a block diagram of a block-based single-layer video encoder 200 that may be implemented, for example, to provide a bitstream to the receiver 192 of the system 191 of fig. 1. As shown in fig. 2, the encoder 200 may use techniques such as spatial prediction (which may also be referred to as "intra prediction") or temporal prediction (which may also be referred to as "inter prediction") to predict the input video signal 201, thereby attempting to improve compression efficiency. The encoder 200 may include mode decision and/or other encoder control logic 240 capable of determining a form of prediction. Such determination may be based at least in part on criteria such as rate-based criteria, distortion-based criteria, and/or combinations thereof. The encoder 200 may provide one or more prediction blocks 206 to an element 204, which may generate and provide a prediction residual 205 (which may be a difference signal between the input signal and the prediction signal) to a transform element 210. The encoder 200 may transform the prediction residual 205 at the transform element 210 and quantize it at a quantization element 215. The quantized residual, along with mode information (e.g., intra or inter prediction) and prediction information (motion vectors, reference picture indices, intra prediction modes, etc.), may be provided as a residual coefficient block 222 to the entropy encoding element 230. The entropy encoding element 230 may compress the quantized residual and provide it as an output video bitstream 235. The entropy encoding element 230 may also, or instead, use the coding mode, prediction mode, and/or motion information 208 in generating the output video bitstream 235.
In an embodiment, the encoder 200 may also, or alternatively, generate a reconstructed video signal by applying inverse quantization to the residual coefficient block 222 at an inverse quantization element 225 and an inverse transform at an inverse transform element 220 to generate a reconstructed residual that can be added back to the prediction signal 206 at element 209. In some embodiments, the resulting reconstructed video signal may be processed using a loop filtering process implemented at a loop filter element 250 (e.g., by using one or more of deblocking filtering, sample adaptive offset, and/or adaptive loop filtering). In some embodiments, the resulting reconstructed video signal, in the form of a reconstructed block 255, may be stored at a reference picture store 270, where it may be used to predict future video signals, for example, by a motion prediction (estimation and compensation) element 280 and/or a spatial prediction element 260. Note that in some embodiments, the resulting reconstructed video signal generated by element 209 may be provided to the spatial prediction element 260 without processing by an element such as the loop filter element 250.
Fig. 3 shows a block diagram of a block-based single-layer decoder 300 that may receive a video bitstream 335, where the video bitstream 335 may be a bitstream such as the bitstream 235 generated by the encoder 200 of fig. 2. The decoder 300 may reconstruct the bitstream 335 for display on a device. The decoder 300 may parse the bitstream 335 at an entropy decoder element 330 to generate residual coefficients 326. The residual coefficients 326 may be inverse quantized at a dequantization element 325 and/or inverse transformed at an inverse transform element 320 to obtain a reconstructed residual that may be provided to element 309. The prediction signal may be obtained using the coding mode, prediction mode, and/or motion information 327, in some embodiments using one or both of the spatial prediction information provided by a spatial prediction element 360 and/or the temporal prediction information provided by a temporal prediction element 390. Such a prediction signal may be provided as a prediction block 329. The prediction signal and the reconstructed residual may be added together at element 309 to generate a reconstructed video signal, which may be provided to a loop filter element 350 for loop filtering and may be stored in a reference picture store 370 for displaying pictures and/or decoding video signals. Note that the prediction mode 328 may be provided to element 309 by the entropy decoding element 330 for use in generating the reconstructed video signal that may be provided to the loop filter element 350 for loop filtering.
Video coding standards, such as High Efficiency Video Coding (HEVC), may reduce transmission bandwidth and/or storage. In some embodiments, an HEVC implementation may operate as a block-based hybrid video codec, with the implemented encoder and decoder operating generally as described herein with reference to figs. 2 and 3. HEVC may allow the use of larger video blocks and may use quadtree partitioning to signal block coding information. In such an embodiment, a picture or a slice of a picture may be partitioned into coding tree blocks (CTBs), each having the same size (e.g., 64×64). Each CTB may be partitioned into coding units (CUs) using quadtree partitioning, and each CU may be further partitioned into prediction units (PUs) and transform units (TUs), each of which may also be partitioned using the quadtree partitioning method.
In an embodiment, for each inter-coded CU 400, the associated PUs may be partitioned using one of eight exemplary partitioning modes (examples of which are schematically shown in fig. 4 as modes 410, 420, 430, 440, 460, 470, 480, and 490). In some embodiments, temporal prediction may be applied to reconstruct an inter-coded PU. Linear filters may be applied to obtain pixel values at fractional positions. The interpolation filters used in some such embodiments may have seven or eight taps for luma and/or four taps for chroma. A deblocking filter, which may be content-based, may be used such that different deblocking filtering operations may be applied at each boundary of a TU and PU depending on a number of factors, which may include one or more of coding mode differences, motion differences, reference picture differences, pixel value differences, and the like. In entropy coding embodiments, context-adaptive binary arithmetic coding (CABAC) may be used for one or more block-level syntax elements. In some embodiments, CABAC may not be used for high-level parameters. The bins used in CABAC coding may include context-coded regular bins and bypass bins coded without the use of context.
Screen content video may be captured in a Red Green Blue (RGB) format. An RGB signal may include redundancy among the three color components. Although such redundancy reduces efficiency in embodiments implementing video compression, encoding in the RGB color space may nonetheless be selected for applications where high fidelity is required for the decoded screen content video, because color space conversion (e.g., from RGB encoding to YCbCr encoding) may introduce loss into the original video signal due to the rounding and clipping operations used to convert color components between the different spaces. In some embodiments, video compression efficiency may be improved by exploiting the correlation among the three color components of a color space. For example, a coding tool for cross-component prediction may use the residual of the G component to predict the residuals of the B and/or R components. The residual of the Y component in YCbCr embodiments may be used to predict the residuals of the Cb and/or Cr components.
In one embodiment, motion compensated prediction techniques may be used to exploit redundancy between temporally neighboring pictures. In such embodiments, motion vectors may be supported with an accuracy of up to one-quarter pixel for the Y component and one-eighth pixel for the Cb and/or Cr components. In an embodiment, fractional sample interpolation may be used, which may include separable 8-tap filters for half-pixel positions and 7-tap filters for quarter-pixel positions. Table 1 below shows exemplary filter coefficients for Y component fractional interpolation. Fractional interpolation of the Cb and/or Cr components may be implemented using similar filter coefficients, except that in some embodiments separable 4-tap filters may be used, and the motion vectors may be as accurate as one-eighth pixel in a 4:2:0 video format implementation. In a 4:2:0 video format implementation, the Cb and Cr components may contain less information than the Y component, and the 4-tap interpolation filters may reduce the complexity of the fractional interpolation filtering without sacrificing the efficiency obtained in motion compensated prediction of the Cb and Cr components, compared to an 8-tap interpolation filter implementation. Table 2 below shows exemplary filter coefficients that may be used for fractional interpolation of the Cb and Cr components.
Fractional location Filter coefficients
0 {0,0,0,64,0,0,0,0}
1/4 {-1,4,-10,58,17,-5,1,0}
2/4 {-1,4,-11,40,40,-11,4,-1}
3/4 {0,1,-5,17,58,-10,4,-1}
Table 1 exemplary filter coefficients for Y component fractional interpolation
Fractional location Filter coefficients
0 {0,64,0,0}
1/8 {-2,58,10,-2}
2/8 {-4,54,16,-2}
3/8 {-6,46,28,-4}
4/8 {-4,36,36,-4}
5/8 {-4,28,46,-6}
6/8 {-2,16,54,-4}
7/8 {-2,10,58,-2}
Table 2 exemplary filter coefficients for Cb and Cr component fractional interpolation
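To make the use of these coefficients concrete, the following sketch applies the 4-tap filters of Table 2 to interpolate a chroma sample at a fractional position. This is an illustrative sketch only: the function and variable names are invented for this example, and a real codec would additionally handle two-dimensional filtering, picture-boundary padding rules, and intermediate bit-depth handling.

```python
# Illustrative sketch (names invented for this example): applying the 4-tap
# fractional interpolation filters of Table 2 to a 1-D row of chroma samples.
CHROMA_FILTERS = {  # fractional position (in 1/8-pel units) -> coefficients
    0: (0, 64, 0, 0),
    1: (-2, 58, 10, -2),
    2: (-4, 54, 16, -2),
    3: (-6, 46, 28, -4),
    4: (-4, 36, 36, -4),
    5: (-4, 28, 46, -6),
    6: (-2, 16, 54, -4),
    7: (-2, 10, 58, -2),
}

def interp_chroma(samples, x_int, x_frac, bit_depth=8):
    """Interpolate one sample at position x_int + x_frac/8, clamping
    out-of-range neighbour indices to the row edges."""
    coeffs = CHROMA_FILTERS[x_frac]
    n = len(samples)
    # The 4-tap filter spans integer positions x_int-1 .. x_int+2.
    taps = [samples[min(max(x_int - 1 + k, 0), n - 1)] for k in range(4)]
    acc = sum(c * t for c, t in zip(coeffs, taps))
    # The coefficients sum to 64, so normalise with rounding, then clip.
    val = (acc + 32) >> 6
    return min(max(val, 0), (1 << bit_depth) - 1)

row = [100, 104, 120, 160, 200, 210]
print(interp_chroma(row, x_int=2, x_frac=4))  # half-pel between 120 and 160 -> 139
```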
In one embodiment, a video signal initially captured in the RGB color format may be encoded in the RGB domain, for example if high fidelity is desired for the decoded video signal. Cross-component prediction tools may improve the efficiency of encoding RGB signals. In some embodiments, the redundancy that may exist among the three color components may not be fully exploited, because in some embodiments the G component may be used to predict the B and/or R components while the correlation between the B and R components is not used. Further decorrelation of the color components may improve the coding performance of RGB video coding.
A fractional interpolation filter may be used to encode RGB video signals. Interpolation filter designs directed at encoding YCbCr video signals in a 4:2:0 color format may not be preferable for encoding RGB video signals. For example, the B and R components of RGB video may represent richer color information and may possess higher-frequency characteristics than the chrominance components of a converted color space, such as the Cb and Cr components in the YCbCr color space. The 4-tap fractional filters usable for the Cb and/or Cr components may not be sufficiently accurate for motion compensated prediction of the B and R components when encoding RGB video. In lossless coding embodiments, reference pictures used for motion compensated prediction are mathematically identical to the original pictures with which such reference pictures are associated. In such embodiments, such reference pictures may contain more edges (i.e., high-frequency signals) than in lossy coding embodiments using the same original pictures, where the high-frequency information in the reference pictures is reduced and/or distorted by the quantization process. In such an embodiment, interpolation filters with fewer taps, which are able to retain the higher-frequency information in the original picture, may be used for the B and R components.
In one embodiment, a residual color conversion method may be used to adaptively select the RGB or YCgCo color space for encoding the residual information associated with RGB video. Such a residual color space conversion method may be applied to lossless or lossy coding, or both, without incurring excessive computational complexity overhead during the encoding and/or decoding process. In another embodiment, interpolation filters may be adaptively selected for motion compensated prediction of different color components. Such methods may allow different fractional interpolation filters to be freely used at the sequence, picture, and/or CU level, and may improve the efficiency of motion-compensation-based predictive coding.
In an embodiment, residual coding may be performed in a color space different from the original color space in order to remove redundancy of the original color space. Video encoding of natural content (e.g., camera-captured video content) may be performed in the YCbCr color space instead of the RGB color space, because encoding in the YCbCr color space may provide a more compact representation of the original video signal than encoding in the RGB color space (e.g., the cross-component correlation in the YCbCr color space may be lower than that in the RGB color space), and the coding efficiency of YCbCr may be higher than that of RGB. In many cases, however, source video may be captured in RGB format, and high-fidelity reconstructed video may be desired.
Color space conversion is not always lossless, for example when the output color space has the same dynamic range as the input color space. For example, if RGB video is converted to the ITU-R BT.709 YCbCr color space with the same bit depth, there may be some loss due to the rounding and truncation operations performed during such color space conversion. YCgCo is a color space with characteristics similar to the YCbCr color space, but the conversion processes between RGB and YCgCo (i.e., from RGB to YCgCo and from YCgCo back to RGB) are computationally simpler than those between RGB and YCbCr, since only shift and add operations are needed during such conversion. By increasing the bit depth of the intermediate operations by one bit, YCgCo can also fully support reversible conversion (i.e., the color values derived after the inverse conversion can be numerically identical to the original color values). This aspect may be desirable because it makes the conversion applicable to both lossy and lossless embodiments.
In an embodiment, owing to the coding efficiency and the ability to perform reversible conversion provided by the YCgCo color space, the residual may be converted from RGB to YCgCo before residual encoding. The determination as to whether to apply the RGB-to-YCgCo conversion process may be made adaptively at the sequence and/or slice and/or block level (e.g., CU level). For example, the determination may be made based on whether applying the conversion provides an improvement in a rate-distortion (RD) metric (e.g., a weighted combination of rate and distortion). Fig. 5 shows an exemplary image 510 that may be an RGB picture. The image 510 may be decomposed into the three color components of YCgCo. In such embodiments, the reversible and irreversible versions of the conversion matrices may be used for lossless encoding and lossy encoding, respectively. When the residual is encoded in the RGB domain, the encoder may treat the G component as the Y component, and the B and R components as the Cb and Cr components, respectively. In this disclosure, the G, B, R order is used to represent RGB video instead of the R, G, B order. Note that although the embodiments described herein may be described using examples in which conversion from RGB to YCgCo is performed, it will be apparent to those skilled in the art that the disclosed embodiments may also be used to effect conversion between RGB and other color spaces (e.g., YCbCr). All such embodiments are contemplated as within the scope of this disclosure.
Reversible conversion from the GBR color space to the YCgCo color space may be performed using equations (1) and (2) shown below. These equations can be used for both lossy and lossless coding. Equation (1) shows the manner in which the forward reversible conversion from the GBR color space to YCgCo may be achieved according to an embodiment, using only shift and add operations (i.e., no multiplication or division is needed):

Co = R - B
t = B + (Co >> 1)
Cg = G - t
Y = t + (Cg >> 1)        (1)
In such an embodiment, the inverse conversion from YCgCo back to GBR may be performed using equation (2), which likewise can be performed using only shift and add operations:

t = Y - (Cg >> 1)
G = Cg + t
B = t - (Co >> 1)
R = Co + B        (2)
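The following sketch transcribes the shift-and-add steps of equations (1) and (2) into illustrative Python and verifies that the round trip is exact, which is what makes this conversion usable for lossless coding; note that the Cg and Co values may need one extra bit of dynamic range, as discussed above.

```python
# Sketch of the reversible (lifting-based) GBR <-> YCgCo conversion given in
# equations (1) and (2). Python's >> floors for negative numbers, which is
# the arithmetic-shift behaviour the lifting steps rely on.
def gbr_to_ycgco_reversible(g, b, r):
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co

def ycgco_to_gbr_reversible(y, cg, co):
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = co + b
    return g, b, r

# Round-trip check: the inverse reproduces every input exactly.
for triple in [(0, 0, 0), (255, 0, 128), (17, 251, 3)]:
    assert ycgco_to_gbr_reversible(*gbr_to_ycgco_reversible(*triple)) == triple
```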
In an embodiment, the irreversible conversion may be performed using equations (3) and (4) shown below. In some embodiments, such irreversible conversion may be used for lossy encoding, but not for lossless encoding. Equation (3) shows the manner in which the irreversible conversion from the GBR color space to YCgCo may be achieved according to an embodiment:

Y  =  (1/2)G + (1/4)B + (1/4)R
Cg =  (1/2)G - (1/4)B - (1/4)R
Co =        - (1/2)B + (1/2)R        (3)

According to an embodiment, the inverse conversion from YCgCo back to GBR may be performed using equation (4):

G = Y + Cg
B = Y - Cg - Co
R = Y - Cg + Co        (4)
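A minimal floating-point sketch of equations (3) and (4) follows; the function names are illustrative. With exact arithmetic the two matrices invert each other perfectly, so the "irreversible" loss arises only once the converted values are rounded and clipped to integer sample precision, which is why this pair is suited to lossy rather than lossless coding.

```python
# Sketch of the irreversible (lossy) GBR <-> YCgCo conversion of equations
# (3) and (4). Floating point is used for clarity; in an integer codec the
# loss comes from the rounding and clipping of the converted values.
def gbr_to_ycgco_lossy(g, b, r):
    y = 0.5 * g + 0.25 * b + 0.25 * r
    cg = 0.5 * g - 0.25 * b - 0.25 * r
    co = -0.5 * b + 0.5 * r
    return y, cg, co

def ycgco_to_gbr_lossy(y, cg, co):
    return y + cg, y - cg - co, y - cg + co

y, cg, co = gbr_to_ycgco_lossy(90.0, 60.0, 30.0)
print(ycgco_to_gbr_lossy(y, cg, co))  # (90.0, 60.0, 30.0)
```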
As shown in equation (3), the forward color space conversion matrix usable for lossy encoding may not be normalized. The magnitude and/or energy of the residual signal in the YCgCo domain is therefore reduced compared to the magnitude and/or energy of the original residual in the RGB domain. This reduction of the residual signal in the YCgCo domain can compromise the lossy coding performance in the YCgCo domain, because the YCgCo residual coefficients may be over-quantized if the same quantization parameter (QP) used in the RGB domain is applied. In an embodiment, a QP adjustment method may be used in which a delta QP is added to the original QP value when the color space conversion is applied, in order to compensate for the magnitude change of the YCgCo residual signal. The same delta QP may be applied to the Y component and the Cg and/or Co components. In some embodiments implementing equation (3), different rows of the forward conversion matrix may not have the same norm. The same QP adjustment therefore may not ensure that the Y component and the Cg and/or Co components all have amplitude levels similar to those of the G component and the B and/or R components.
In an embodiment, to ensure that the YCgCo residual signal converted from the RGB residual signal has an amplitude similar to that of the RGB residual signal, a pair of scaled forward and inverse conversion matrices may be used to convert the residual signal between the RGB domain and the YCgCo domain. More specifically, the forward conversion from the RGB domain to the YCgCo domain may be defined by equation (5):

[Y', Cg', Co']^T = ( [a a a; b b b; c c c] ∘ [1/2 1/4 1/4; 1/2 -1/4 -1/4; 0 -1/2 1/2] ) · [G, B, R]^T        (5)

where ∘ indicates element-wise matrix multiplication of the two entries at the same position of the two matrices, and a, b, and c are scaling factors used to compensate for the norms of the different rows of the original (non-normalized) forward color space conversion matrix of equation (3). They may be derived using equations (6) and (7):

a = b = 1 / sqrt( (1/2)^2 + (1/4)^2 + (1/4)^2 ) = 4/sqrt(6) ≈ 1.633        (6)
c = 1 / sqrt( (1/2)^2 + (1/2)^2 ) = sqrt(2) ≈ 1.414        (7)

In such an embodiment, the inverse conversion from the YCgCo domain to the RGB domain may be implemented using equation (8), in which each column of the inverse conversion matrix of equation (4) is scaled by the reciprocal of the corresponding forward scaling factor:

[G, B, R]^T = ( [1 1 1; 1 -1 -1; 1 -1 1] ∘ [1/a 1/b 1/c; 1/a 1/b 1/c; 1/a 1/b 1/c] ) · [Y', Cg', Co']^T        (8)
in equations (5) and (8), the scaling factor may be a real number, which may require floating point multiplication in transforming the color space between RGB and YCgCo. To reduce implementation complexity, in an embodiment, multiplication of the scaling factor may be approximated by a computationally more efficient multiplication with an integer M following an N-bit right shift.
The disclosed color space conversion methods and systems may be enabled and/or disabled at the sequence, picture, or block (e.g., CU, TU) level. For example, in an embodiment, the color space conversion of the prediction residual may be adaptively enabled and/or disabled at the coding unit level. The encoder may select an optimal color space between GBR and YCgCo for each CU.
FIG. 6 illustrates an exemplary method 600 for an RD optimization process at an encoder described herein using adaptive residual color conversion. At block 605, residuals of the CU may be encoded using the implementation's "best mode" encoding (e.g., an intra prediction mode for intra coding, or a motion vector and reference picture index for inter coding), which may be a preconfigured encoding mode previously determined to be the best available encoding mode, or another predetermined encoding mode that has been determined to have the lowest, or a relatively low, RD cost at least at the point of performing the functions of block 605. At block 610, a flag (labeled "CU_YCgCo_residual_flag" in this example, but which may be labeled using any term or combination of terms) may be set to "false" (or to any other indicator indicating false, zero, etc.), indicating that the YCgCo color space will not be used to encode the residual of the coding unit. In response to the flag being evaluated as false, or an equivalent, at block 610, the encoder may perform residual encoding in the GBR color space at block 615 and calculate the RD cost of such encoding (labeled "RDCost_GBR" in FIG. 6, though again any label or term may be used herein to refer to such cost).
At block 620, a determination is made as to whether the RD cost of the GBR color space encoding is lower than the RD cost of the best mode encoding (RDCost_BestMode). If the RD cost of the GBR color space encoding is lower than the RD cost of the best mode encoding, then at block 625 the CU_YCgCo_residual_flag of the best mode may be set to false or its equivalent (or left set to false or its equivalent), and the RD cost of the best mode may be set to the RD cost of residual encoding in the GBR color space. The method 600 may proceed to block 630, where the CU_YCgCo_residual_flag may be set to true or an equivalent indicator.
At block 620, if the RD cost of the GBR color space encoding is determined to be greater than or equal to the RD cost of the best mode encoding, the RD cost of the best mode encoding may be retained as the value to which it was set prior to the evaluation of block 620, and block 625 may be bypassed. The method 600 may proceed to block 630, where the CU_YCgCo_residual_flag may be set to true or an equivalent indicator. Setting the CU_YCgCo_residual_flag to true at block 630 may facilitate encoding the residual of the coding unit using the YCgCo color space; accordingly, the estimation of the RD cost of encoding using the YCgCo color space, compared to the RD cost of best mode encoding, is described below.
At block 635, the residuals of the coding unit may be encoded using the YCgCo color space, and the RD cost of such encoding may be determined (such cost is labeled "RDCost_YCgCo" in FIG. 6, though again any label or term may be used herein to refer to such cost).
At block 640, a determination is made as to whether the RD cost of the YCgCo color space encoding is lower than the RD cost of the best mode encoding. If the RD cost of the YCgCo color space encoding is lower than the RD cost of the best mode encoding, then at block 645 the CU_YCgCo_residual_flag of the best mode may be set to true or its equivalent (or left set to true or its equivalent), and the RD cost of the best mode may be set to the RD cost of residual encoding in the YCgCo color space. The method 600 may end at block 650.
At block 640, if the RD cost of the YCgCo color space encoding is determined to be greater than or equal to the RD cost of the best mode encoding, then the RD cost of the best mode encoding may be retained as the value to which it was set prior to the evaluation of block 640, and block 645 may be bypassed. The method 600 may end at block 650.
Those skilled in the art will appreciate that the disclosed embodiments, including method 600 and any subset thereof, may allow the GBR and YCgCo color space encodings and their respective RD costs to be compared, so that the color space encoding with the lower RD cost may be selected.
FIG. 7 illustrates another exemplary method 700 for an RD optimization process at an encoder described herein using adaptive residual color conversion. In an embodiment, the encoder may attempt residual encoding using the YCgCo color space only when at least one reconstructed GBR residual in the current coding unit is non-zero. If all reconstructed residuals are zero, this may indicate that prediction in the GBR color space is sufficient and that conversion to the YCgCo color space may not further improve the efficiency of residual coding. In such an embodiment, the number of cases examined in the RD optimization may be reduced and the encoding process may be performed more efficiently. Such an embodiment may be particularly beneficial in systems using large quantization parameters, i.e., large quantization step sizes.
At block 705, residuals of the CU may be encoded using the implementation's "best mode" encoding (e.g., an intra prediction mode for intra coding, or a motion vector and reference picture index for inter coding), which may be a preconfigured encoding mode previously determined to be the best available encoding mode, or another predetermined encoding mode that has been determined to have the lowest, or a relatively low, RD cost at least at the point of performing the functions of block 705. At block 710, a flag (labeled "CU_YCgCo_residual_flag" in this example) may be set to "false" (or to any other indicator indicating false, zero, etc.), indicating that the residual of the coding unit will not be encoded using the YCgCo color space. It is again noted here that any term or combination of terms may be used to label the flag. In response to the flag being evaluated as false, or an equivalent, at block 710, the encoder may, at block 715, perform residual encoding in the GBR color space and calculate the RD cost of such encoding (labeled "RDCost_GBR" in FIG. 7, though again any label or term may be used herein to refer to such cost).
At block 720, a determination is made as to whether the RD cost of the GBR color space encoding is lower than the RD cost of the best mode encoding. If the RD cost of the GBR color space encoding is lower than the RD cost of the best mode encoding, then at block 725 the CU_YCgCo_residual_flag of the best mode may be set to false or its equivalent (or left set to false or its equivalent), and the RD cost of the best mode may be set to the RD cost of residual encoding in the GBR color space.
At block 720, if the RD cost of the GBR color space encoding is determined to be greater than or equal to the RD cost of the best mode encoding, the RD cost of the best mode encoding may be retained as the value to which it was set prior to the evaluation of block 720, and block 725 may be bypassed.
At block 730, a determination is made as to whether at least one of the reconstructed GBR coefficients is non-zero (i.e., whether not all reconstructed GBR coefficients are equal to zero). If there is at least one non-zero reconstructed GBR coefficient, then at block 735 the CU_YCgCo_residual_flag may be set to true or an equivalent indicator. Setting the CU_YCgCo_residual_flag to true (or its equivalent indicator) at block 735 may facilitate encoding the residual of the coding unit using the YCgCo color space; accordingly, the estimation of the RD cost of encoding using the YCgCo color space, compared to the RD cost of best mode encoding, is described below.
In the event that at least one reconstructed GBR coefficient is non-zero, at block 740 the residuals of the coding unit may be encoded using the YCgCo color space and the RD cost of such encoding may be determined (such cost is labeled "RDCost_YCgCo" in FIG. 7, though again any label or term may be used herein to refer to such cost).
At block 745, a determination is made as to whether the RD cost of the YCgCo color space encoding is lower than the RD cost of the best mode encoding. If the RD cost of the YCgCo color space encoding is lower than the RD cost of the best mode encoding, then at block 750 the CU_YCgCo_residual_flag of the best mode may be set to true or its equivalent (or left set to true or its equivalent), and the RD cost of the best mode may be set to the RD cost of residual encoding in the YCgCo color space. The method 700 may end at block 755.
If the RD cost of the YCgCo color space encoding is determined at block 745 to be greater than or equal to the RD cost of the best mode encoding, then the RD cost of the best mode encoding may be retained as the value to which it was set prior to the evaluation of block 745, and block 750 may be bypassed. The method 700 may end at block 755.
Those skilled in the art will appreciate that the disclosed embodiments, including method 700 and any subset thereof, may allow the GBR and YCgCo color space encodings and their respective RD costs to be compared, so that the color space encoding with the lower RD cost may be selected. The method 700 of FIG. 7 may provide a more efficient way of determining the appropriate setting of a flag such as the exemplary CU_YCgCo_residual_coding_flag described herein, while the method 600 of FIG. 6 may provide a more exhaustive way of determining such a setting. In either embodiment, or in any variation, subset, or implementation using any one or more aspects thereof, all of which are contemplated as within the scope of this disclosure, the values of such flags may be transmitted in encoded bitstreams, such as those described with respect to FIG. 2 and any other encoder described herein.
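The decision flows of FIG. 6 and FIG. 7 can be condensed into the following illustrative sketch; encode_residual is a placeholder standing in for a real encoder's residual coding and RD measurement path, and the skip_ycgco_if_all_zero argument enables the early termination of method 700.

```python
# Sketch condensing the color-space decision flows of FIG. 6 and FIG. 7.
# encode_residual(cu, color_space) is a placeholder assumed to return
# (rd_cost, recon_residual) for the given color space.
def choose_residual_color_space(cu, encode_residual, best_rd_cost,
                                skip_ycgco_if_all_zero=False):
    cu_ycgco_residual_flag = False
    rd_gbr, recon_gbr = encode_residual(cu, color_space="GBR")
    if rd_gbr < best_rd_cost:
        best_rd_cost = rd_gbr  # GBR residual coding becomes the best mode
    # Early termination of method 700: if every reconstructed GBR residual
    # is zero, prediction suffices and the YCgCo pass is skipped.
    if skip_ycgco_if_all_zero and not any(recon_gbr):
        return cu_ycgco_residual_flag, best_rd_cost
    rd_ycgco, _ = encode_residual(cu, color_space="YCgCo")
    if rd_ycgco < best_rd_cost:
        cu_ycgco_residual_flag = True  # YCgCo coding becomes the best mode
        best_rd_cost = rd_ycgco
    return cu_ycgco_residual_flag, best_rd_cost
```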
Fig. 8 illustrates a block diagram of a block-based single-layer video encoder 800 that may be implemented, for example, to provide a bitstream to the receiver 192 of the system 191 shown in fig. 1, in accordance with an embodiment. As shown in fig. 8, an encoder such as encoder 800 may use techniques such as spatial prediction (which may also be referred to as "intra prediction") and temporal prediction (which may also be referred to as "inter prediction" or "motion compensated prediction") to predict an input video signal 801 in an attempt to improve compression efficiency. Encoder 800 may include mode decision and/or other encoder control logic 840 that may determine a form of prediction. Such determination may be based at least in part on criteria such as rate-based criteria, distortion-based criteria, and/or combinations thereof. The encoder 800 may provide one or more prediction blocks 806 to an adder element 804, and the adder element 804 may generate and provide a prediction residual 805 (which may be a difference signal between the input signal and the prediction signal) to a transform element 810. The encoder 800 may transform the prediction residual 805 at the transform element 810 and quantize it at a quantization element 815. The quantized residual, along with mode information (e.g., intra or inter prediction) and prediction information (motion vectors, reference picture indices, intra prediction modes, etc.), may be provided as a residual coefficient block 822 to the entropy encoding element 830. The entropy encoding element 830 may compress the quantized residual and provide it as an output video bitstream 835. The entropy encoding element 830 may also, or instead, use the coding mode, prediction mode, and/or motion information 808 in generating the output video bitstream 835.
In an embodiment, the encoder 800 may also, or alternatively, generate a reconstructed video signal by applying inverse quantization to the residual coefficient block 822 at an inverse quantization element 825 and applying an inverse transform at an inverse transform element 820, in order to generate a reconstructed residual that can be added back to the prediction signal 806 at an adder element 809. In an embodiment, a residual inverse conversion of such a reconstructed residual may be generated by a residual inverse conversion element 827 and provided to the adder element 809. In such embodiments, a residual coding element 826 may provide an indication of the value of the CU_YCgCo_residual_coding_flag 891 (or CU_YCgCo_residual_flag, or any other flag or indicator used to perform the functions or provide the indications described herein with respect to the CU_YCgCo_residual_coding_flag and/or the CU_YCgCo_residual_flag) to a control switch 817 via a control signal 823. The control switch 817 may, in response to receiving the control signal 823 indicating receipt of such a flag, direct the reconstructed residual to the residual inverse conversion element 827 for generating the residual inverse conversion of the reconstructed residual. The value of the flag 891 and/or the control signal 823 may indicate a decision by the encoder as to whether to apply the residual conversion process, which may include a forward residual conversion 824 and an inverse residual conversion 827. In some embodiments, the control signal 823 may take different values as the encoder evaluates the costs and benefits of applying or not applying the residual conversion process. For example, the encoder may evaluate the rate-distortion cost of applying the residual conversion process to a portion of the video signal.
In some embodiments, the resulting reconstructed video signal generated by the adder element 809 may be processed using a loop filtering process implemented at a loop filter element 850 (e.g., by using one or more of deblocking filtering, sample adaptive offset, and/or adaptive loop filtering). In some embodiments, the resulting reconstructed video signal, in the form of a reconstructed block 855, may be stored at a reference picture store 870, where it may be used to predict future video signals, e.g., by a motion prediction (estimation and compensation) element 880 and/or a spatial prediction element 860. Note that in some embodiments, the resulting reconstructed video signal generated by the adder element 809 may be provided to the spatial prediction element 860 without processing by an element such as the loop filter element 850.
As shown in fig. 8, in an embodiment, an encoder such as encoder 800 may determine the value of the CU_YCgCo_residual_coding_flag 891 (or CU_YCgCo_residual_flag, or any other flag or indicator used to perform the functions or provide the indications described herein with respect to the CU_YCgCo_residual_coding_flag and/or the CU_YCgCo_residual_flag) at a color space decision element 826 for residual coding. The color space decision element 826 for residual coding may provide an indication of such a flag to a control switch 807 via the control signal 823. In response, upon receipt of the control signal 823 indicating receipt of such a flag, the control switch 807 may direct the prediction residual 805 to a residual conversion element 824, such that the RGB-to-YCgCo conversion process is adaptively applied to the prediction residual 805 at the residual conversion element 824. In some embodiments, this conversion process may be performed before the transform and quantization applied to the coding units processed by the transform element 810 and the quantization element 815. In some embodiments, this conversion process may also, or alternatively, be performed before the inverse transform and inverse quantization applied to the coding units processed by the inverse transform element 820 and the inverse quantization element 825. In some embodiments, the CU_YCgCo_residual_coding_flag 891 may also, or instead, be provided to the entropy encoding element 830 for inclusion in the bitstream 835.
Fig. 9 shows a block diagram of a block-based single-layer decoder 900 that may receive a video bitstream 935, where the video bitstream 935 may be a bitstream such as the bitstream 835 generated by the encoder 800 of fig. 8. The decoder 900 may reconstruct the bitstream 935 for display on a device. The decoder 900 may parse the bitstream 935 at an entropy decoder element 930 to generate residual coefficients 926. The residual coefficients 926 may be inverse quantized at a dequantization element 925 and/or inverse transformed at an inverse transform element 920 to obtain a reconstructed residual that may be provided to an adder element 909. The prediction signal may be obtained using the coding mode, prediction mode, and/or motion information 927, in some embodiments using one or both of the spatial prediction information provided by a spatial prediction element 960 and/or the temporal prediction information provided by a temporal prediction element 990. Such a prediction signal may be provided as a prediction block 929. The prediction signal and the reconstructed residual may be added together at the adder element 909 to generate a reconstructed video signal, which may be provided to a loop filter element 950 for loop filtering and may be stored in a reference picture store 970 for displaying pictures and/or decoding video signals. Note that the prediction mode 928 may be provided to the adder element 909 by the entropy decoding element 930 for use in generating the reconstructed video signal that may be provided to the loop filter element 950 for loop filtering.
In an embodiment, the decoder 900 may decode the bitstream 935 at the entropy decoding element 930 to determine the CU_YCgCo_residual_coding_flag 991 (or CU_YCgCo_residual_flag, or any other flag or indicator used to perform the functions or provide the indications described herein with respect to the CU_YCgCo_residual_coding_flag and/or the CU_YCgCo_residual_flag), which may have been encoded into the bitstream 935 by an encoder such as the encoder 800 of fig. 8. The value of the CU_YCgCo_residual_coding_flag 991 may be used to determine whether to perform the YCgCo-to-RGB inverse conversion process at a residual inverse conversion element 999 on the reconstructed residual that is generated by the inverse transform element 920 and provided to the adder element 909. In an embodiment, the flag 991, or a control signal indicating it, may be provided to a control switch 917; in response, the control switch 917 may direct the reconstructed residual to the residual inverse conversion element 999 to generate the residual inverse conversion of the reconstructed residual.
In an embodiment, by performing adaptive color space conversion on the prediction residual, rather than as part of motion compensated prediction or intra prediction, the complexity of the video coding system may be reduced, as such an embodiment may not require the encoder and/or decoder to store the prediction signals in two different color spaces.
To improve residual coding efficiency, transform coding of the prediction residual may be performed by dividing a residual block into multiple square transform units, where the possible TU sizes may be 4×4, 8×8, 16×16, and/or 32×32. Fig. 10 illustrates an exemplary PU-to-TU partitioning 1000, in which the bottom-left PU 1010 may represent an embodiment in which the TU size is equal to the PU size, and PUs 1020, 1030, and 1040 may represent embodiments in which each respective exemplary PU is split into multiple TUs.
In an embodiment, color space conversion of the prediction residual may be adaptively enabled and/or disabled at the TU level. Such embodiments may provide finer-granularity switching between different color spaces than enabling and/or disabling the adaptive color conversion at the CU level. Such embodiments may further improve the coding gain achievable by adaptive color space conversion.
Referring again to the exemplary encoder 800 of fig. 8, to select a color space for residual encoding of a CU, an encoder such as the exemplary encoder 800 may test each coding mode (e.g., intra coding mode, inter coding mode, intra block copy mode) twice: once with color space conversion and once without color space conversion. In some embodiments, to reduce such coding complexity, various "fast" or more efficient encoding logic may be used, as described herein.
In an embodiment, since YCgCo may provide a more compact representation of the original color signal than RGB, the RD cost of enabling the color space conversion may be determined and compared with the RD cost of disabling the color space conversion. In some embodiments, the RD cost of disabling the color space conversion may be calculated only if there is at least one non-zero coefficient when the color space conversion is enabled.
To reduce the number of coding modes tested, in some embodiments, the same coding mode may be used for both RGB and YCgCo color spaces. For intra mode, the selected luma and chroma intra predictions may be shared between RGB and YCgCo spaces. For inter mode, selected motion vectors, reference pictures, and motion vector predictors may be shared between RGB and YCgCo color spaces. For intra-block copy mode, the selected block vector and block vector predictor may be shared between the RGB and YCgCo color spaces. To further reduce coding complexity, in some embodiments, TU partitioning may be shared between RGB and YCgCo color spaces.
Since there may be correlation between the three color components (Y, cg and Co in the YCgCo domain, and G, B and R in the RGB domain), in some embodiments the same intra prediction direction may be selected for the three color components. The same intra prediction mode may be used for all three color components of each of the two color spaces.
Since there may be correlation between CUs in the same region, one CU may select the same color space (e.g., either RGB or YCgCo) as its parent CU for encoding its residual signal. Alternatively, a sub-CU may derive the color space from information associated with its parent CU, such as the selected color space and/or the RD cost of each color space. In an embodiment, where the residual of a sub-CU's parent CU is encoded in the YCgCo domain, encoding complexity may be reduced by not checking the RD cost of residual encoding in the RGB domain. The check of the RD cost of residual encoding in the YCgCo domain may also, or instead, be skipped if the residual of the sub-CU's parent CU is encoded in the RGB domain. In some embodiments, the RD costs of the parent CU in the two color spaces may be used for the sub-CU if both color spaces were tested in the encoding of the parent CU. The sub-CU may skip the RGB color space if its parent CU selected the YCgCo color space and the RD cost of YCgCo was lower than that of RGB, and vice versa. One possible realization of this heuristic is sketched after this paragraph.
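In the following sketch, the ParentInfo fields are hypothetical names introduced for this example and do not correspond to any particular encoder's data structures.

```python
# Sketch of the parent-CU fast color-space selection described above.
# The ParentInfo fields are hypothetical names introduced for this example.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ParentInfo:
    chosen_space: str                      # "GBR" or "YCgCo"
    rd_cost_gbr: Optional[float] = None    # set only if this space was tested
    rd_cost_ycgco: Optional[float] = None  # set only if this space was tested

def spaces_to_test(parent: ParentInfo) -> List[str]:
    # If the parent tested both spaces, skip the space that lost there.
    if parent.rd_cost_gbr is not None and parent.rd_cost_ycgco is not None:
        return ["YCgCo"] if parent.rd_cost_ycgco < parent.rd_cost_gbr else ["GBR"]
    # Otherwise, reuse only the color space the parent CU selected.
    return [parent.chosen_space]

print(spaces_to_test(ParentInfo("YCgCo", rd_cost_gbr=11.2, rd_cost_ycgco=9.8)))
```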
Some embodiments support a number of prediction modes, including a number of intra prediction modes that may include a number of intra angular prediction modes, one or more DC modes, and/or one or more planar prediction modes. Testing residual coding with the color space transform for all such intra prediction modes increases encoder complexity. In an embodiment, instead of calculating the full RD cost for every supported intra prediction mode, a subset of N intra prediction candidates is selected from the supported modes without considering the bits of residual coding. The N selected intra prediction candidates may then be tested in the converted color space by calculating RD costs after applying residual coding. The candidate with the lowest RD cost among the tested modes is selected as the intra prediction mode in the converted color space.
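The two-stage search may be sketched as follows; coarse_cost and full_rd_cost are assumed encoder-supplied callables, the first being a cheap estimate that ignores residual-coding bits and the second applying full residual coding:

def select_intra_mode(supported_modes, coarse_cost, full_rd_cost, n):
    """Two-stage intra mode search in the converted color space."""
    candidates = sorted(supported_modes, key=coarse_cost)[:n]  # stage 1: prune to N
    return min(candidates, key=full_rd_cost)                   # stage 2: full RD on N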
It is noted herein that the disclosed color space conversion systems and methods may be enabled and/or disabled at the sequence level and/or at the picture and/or slice level. In the exemplary embodiment shown in Table 3 below, syntax elements (examples of which are highlighted in bold in Table 3, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may be used in the Sequence Parameter Set (SPS) to indicate whether the residual color space conversion coding tool is enabled. In some embodiments, the disclosed adaptive color space conversion systems and methods may be enabled only for the "444" chroma format, since color space conversion is applied to video content whose luma and chroma components have the same resolution. In such embodiments, restricting color space conversion to the 444 chroma format may be enforced at a relatively high level. For example, a bitstream conformance constraint may be applied to ensure that color space conversion is disabled when a non-444 chroma format is used.
Table 3 exemplary sequence parameter set syntax
In an embodiment, an exemplary syntax element "sps_residual_csc_flag" equal to 1 indicates that the residual color space conversion coding tool may be enabled. An exemplary syntax element sps_residual_csc_flag equal to 0 indicates that residual color space conversion may be disabled, and that the CU-level flag cu_ycgco_residual_flag is inferred to be 0. In such an embodiment, when the ChromaArrayType syntax element is not equal to 3, the value of the exemplary sps_residual_csc_flag syntax element (or its equivalent substitute) may be required to be equal to 0 in order to maintain bitstream conformance.
In another embodiment, as shown in Table 4 below, the sps_residual_csc_flag exemplary syntax element may be signaled depending on the value of the ChromaArrayType syntax element (examples of which are highlighted in bold in Table 4, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples). In such an embodiment, if the input video is in the 444 color format (i.e., ChromaArrayType is equal to 3, e.g., "ChromaArrayType == 3" in Table 4), then the sps_residual_csc_flag exemplary syntax element may be signaled to indicate whether color space conversion is enabled. If the input video is not in the 444 color format (i.e., ChromaArrayType is not equal to 3), the sps_residual_csc_flag exemplary syntax element may not be signaled and may be inferred to be equal to 0.
Table 4 exemplary sequence parameter set syntax
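A minimal parsing sketch of this conditional signaling, assuming a hypothetical bitstream-reader callback read_bit:

def parse_sps_residual_csc_flag(read_bit, chroma_array_type):
    """Conditional SPS signaling in the spirit of Table 4.

    read_bit is an assumed bitstream-reader callback returning 0 or 1.
    """
    if chroma_array_type == 3:     # 4:4:4 input: flag is present
        return read_bit()
    return 0                       # flag absent: inferred as 0 (disabled)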
In an embodiment, if the residual color space conversion coding tool is enabled, another flag may be added at the CU level and/or the TU level as described herein to enable color space conversion between GBR and YCgCo color spaces.
In an embodiment, schematically shown in Table 5 below, an exemplary coding unit syntax element "cu_ycgco_residual_flag" equal to 1 (examples of which are highlighted in bold in Table 5, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) indicates that the residual of the coding unit may be encoded and/or decoded in the YCgCo color space. In such an embodiment, a cu_ycgco_residual_flag syntax element (or its equivalent) equal to 0 may indicate that the residual of the coding unit may be encoded in the GBR color space.
Table 5 exemplary coding unit syntax
In another embodiment, schematically shown in Table 6 below, an exemplary transform unit syntax element "tu_ycgco_residual_flag" equal to 1 (examples of which are highlighted in bold in Table 6, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) indicates that the residual of the transform unit may be encoded and/or decoded in the YCgCo color space. In such an embodiment, a tu_ycgco_residual_flag syntax element (or its equivalent) equal to 0 may indicate that the residual of the transform unit may be encoded in the GBR color space.
Table 6 exemplary transform unit syntax
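The transform equations are not reproduced in this portion of the disclosure; assuming the well-known reversible YCgCo(-R) lifting form, a decoder-side sketch of the residual conversion gated by these flags might look as follows:

def inverse_ycgco_r(y, cg, co):
    """Reversible YCgCo-to-GBR lifting steps (integer arithmetic, lossless).

    Applied per residual sample when cu_/tu_ycgco_residual_flag equals 1.
    """
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return g, b, r

def forward_ycgco_r(g, b, r):
    """Encoder-side counterpart; forward followed by inverse is identity."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co

Because the lifting steps use only integer additions and shifts, applying the forward transform followed by the inverse reproduces the input exactly, which is what makes this form of the conversion suitable for lossless coding.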
In some embodiments, some interpolation filters may be inefficient at interpolating fractional pixels for the motion-compensated prediction used in screen content coding. For example, in encoding RGB video, a 4-tap filter may not be accurate enough when interpolating the B and R components at fractional positions. In lossy coding embodiments, an 8-tap luma filter may not be the most efficient way to preserve the useful high-frequency texture information contained in the original luma component. In an embodiment, separate interpolation filters may be indicated for different color components.
In one such embodiment, one or more default interpolation filters (e.g., a set of 8-tap filters and a set of 4-tap filters) may be used as candidates for the fractional pixel interpolation process. In another embodiment, a set of interpolation filters other than the default interpolation filters may be explicitly signaled in the bitstream. To enable adaptive filter selection for different color components, a signaling syntax element may be used that specifies the interpolation filter selected for each color component. The disclosed filter selection systems and methods may be used at various coding levels, such as the sequence level, the picture and/or slice level, and the CU level. The operational coding level may be selected based on the coding efficiency and/or the computational and/or operational complexity of the available implementations.
In embodiments in which default interpolation filters are used, a flag may be used to indicate whether a set of 8-tap filters or a set of 4-tap filters is used for fractional pixel interpolation of a color component. One such flag may indicate the filter selection for the Y component (or the G component in an RGB color space embodiment), while another such flag may be used for the Cb and Cr components (or the B and R components in an RGB color space embodiment). The following tables provide examples of such flags, which may be signaled at the sequence level, the picture and/or slice level, and the CU level.
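The following sketch illustrates generic N-tap fractional interpolation; the coefficient values shown are HEVC-style half-pel examples offered only as plausible default candidates, not values taken from this disclosure:

def interpolate_fractional(samples, pos, taps, shift=6):
    """Interpolate one fractional-position sample with an N-tap FIR filter.

    `samples` holds integer-pel values, `pos` indexes the left-most sample
    under the filter, and `taps` are the coefficients. shift=6 assumes the
    taps sum to 64.
    """
    acc = sum(c * samples[pos + k] for k, c in enumerate(taps))
    return (acc + (1 << (shift - 1))) >> shift   # rounded normalization

LUMA_8TAP_HALF = [-1, 4, -11, 40, 40, -11, 4, -1]   # sums to 64
CHROMA_4TAP_HALF = [-4, 36, 36, -4]                 # sums to 64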
Table 7 below schematically illustrates an embodiment in which such flags are signaled to allow selection of a default interpolation filter at the sequence level. The disclosed syntax may be applied to any parameter set, including the Video Parameter Set (VPS), the Sequence Parameter Set (SPS), and the Picture Parameter Set (PPS). In the embodiment schematically illustrated in Table 7, the exemplary syntax elements are signaled in the SPS.
Table 7 exemplary signaling for selecting interpolation filters at the sequence level
In such an embodiment, an exemplary syntax element "sps_luma_use_default_filter_flag" equal to 1 (an example of which is highlighted in bold in Table 7, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may indicate that, for interpolation of fractional pixels, the luma components of all pictures associated with the current sequence parameter set use the same set of luma interpolation filters (e.g., a set of default luma filters). In such an embodiment, an exemplary syntax element sps_luma_use_default_filter_flag equal to 0 may indicate that, for interpolation of fractional pixels, the luma components of all pictures associated with the current sequence parameter set use the same set of chroma interpolation filters (e.g., a set of default chroma filters).
In such an embodiment, an exemplary syntax element "sps_chroma_use_default_filter_flag" equal to 1 (an example of which is highlighted in bold in Table 7, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may indicate that, for interpolation of fractional pixels, the chroma components of all pictures associated with the current sequence parameter set use the same set of chroma interpolation filters (e.g., a set of default chroma filters). In such an embodiment, an exemplary syntax element sps_chroma_use_default_filter_flag equal to 0 may indicate that, for interpolation of fractional pixels, the chroma components of all pictures associated with the current sequence parameter set use the same set of luma interpolation filters (e.g., a set of default luma filters).
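Under these semantics, the two SPS flags resolve to per-component filter sets as in the following sketch (the set names are placeholders):

def sps_filter_sets(sps_luma_use_default_filter_flag,
                    sps_chroma_use_default_filter_flag):
    """Map the two SPS flags to per-component interpolation filter sets.

    A flag equal to 1 keeps the component's own default set, while 0
    borrows the other component type's default set (e.g., luma using the
    shorter chroma filters for RGB screen content).
    """
    luma = "default_luma" if sps_luma_use_default_filter_flag else "default_chroma"
    chroma = "default_chroma" if sps_chroma_use_default_filter_flag else "default_luma"
    return luma, chroma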
In an embodiment, flags are signaled at the picture and/or slice level to facilitate selection of the fractional interpolation filter at the picture and/or slice level (i.e., the same interpolation filter may be used by all CUs in the picture and/or slice for a given color component). Table 8 below schematically illustrates an example of signaling using syntax elements in a slice segment header according to an embodiment.
Table 8 exemplary signaling for selecting interpolation filters at the picture and/or slice level
In such an embodiment, an exemplary syntax element "slice_luma_use_default_filter_flag" equal to 1 (an example of which is highlighted in bold in Table 8, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may indicate that, for interpolation of fractional pixels, the luma component of the current slice uses the same set of luma interpolation filters (e.g., a set of default luma filters). In such an embodiment, an exemplary syntax element slice_luma_use_default_filter_flag equal to 0 may indicate that, for interpolation of fractional pixels, the luma component of the current slice uses the same set of chroma interpolation filters (e.g., a set of default chroma filters).
In such an embodiment, an exemplary syntax element "slice_chroma_use_default_filter_flag" equal to 1 (an example of which is highlighted in bold in Table 8, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may indicate that, for interpolation of fractional pixels, the chroma component of the current slice uses the same set of chroma interpolation filters (e.g., a set of default chroma filters). In such an embodiment, an exemplary syntax element slice_chroma_use_default_filter_flag equal to 0 may indicate that, for interpolation of fractional pixels, the chroma component of the current slice uses the same set of luma interpolation filters (e.g., a set of default luma filters).
In an embodiment in which flags are signaled at the CU level to facilitate the selection of interpolation filters at the CU level, such flags may be signaled using the coding unit syntax shown in Table 9. In such embodiments, the color components of a CU may adaptively select one or more interpolation filters used to generate the prediction signal for the CU. Such a selection may represent the coding improvement achievable through adaptive interpolation filter selection.
Table 9 exemplary signaling for selecting interpolation filters at CU level
In such an embodiment, an exemplary syntax element "cu_use_default_filter_flag" equal to 1 (examples of which are highlighted in bold in Table 9, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may indicate that, for interpolation of fractional pixels, both luma and chroma use the default interpolation filters. In such an embodiment, a cu_use_default_filter_flag exemplary syntax element (or its equivalent) equal to 0 may indicate that, for interpolation of fractional pixels, the luma or chroma components of the current CU use a different set of interpolation filters.
In such an embodiment, an exemplary syntax element "cu_luma_use_default_filter_flag" equal to 1 (an example of which is highlighted in bold in Table 9, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may indicate that, for interpolation of fractional pixels, the luma component of the current CU uses the same set of luma interpolation filters (e.g., a set of default luma filters). In such an embodiment, an exemplary syntax element cu_luma_use_default_filter_flag equal to 0 may indicate that, for interpolation of fractional pixels, the luma component of the current CU uses the same set of chroma interpolation filters (e.g., a set of default chroma filters).
In such an embodiment, an exemplary syntax element "cu_chroma_use_default_filter_flag" equal to 1 (an example of which is highlighted in bold in Table 9, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may indicate that, for interpolation of fractional pixels, the chroma components of the current CU use the same set of chroma interpolation filters (e.g., a set of default chroma filters). In such an embodiment, an exemplary syntax element cu_chroma_use_default_filter_flag equal to 0 may indicate that, for interpolation of fractional pixels, the chroma components of the current CU use the same set of luma interpolation filters (e.g., a set of default luma filters).
In an embodiment, the coefficients of interpolation filter candidates may be explicitly signaled in the bitstream. Interpolation filters other than the default interpolation filters may thereby be used for the fractional pixel interpolation process of a video sequence. In such an embodiment, to facilitate delivery of filter coefficients from encoder to decoder, an exemplary syntax structure "interp_filter_coeff_set()" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may be used to carry the filter coefficients in the bitstream. Table 10 schematically shows the syntax structure for signaling such coefficients of interpolation filter candidates.
Table 10 exemplary signaling of interpolation filters
In such an embodiment, an exemplary syntax element "optional_interpolation_filter_used_flag" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) may specify whether optional (non-default) interpolation filters are present. When the optional_interpolation_filter_used_flag exemplary syntax element is set to 1, the optional interpolation filters can be used for the interpolation process.
Again, in such an embodiment, the exemplary syntax element "num_interp_filter_set" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) or an equivalent thereof may specify the number of interpolation filter sets present in the bitstream.
And again, in such embodiments, the exemplary syntax element "interp_filter_coeff_shift" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) or an equivalent thereof may specify the right-shift operand used for pixel interpolation.
And again, in such embodiments, the exemplary syntax element "num_interp_filter[i]" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) or an equivalent thereof may specify the number of interpolation filters in the i-th interpolation filter set.
Here again, in such an embodiment, the exemplary syntax element "num_interp_filter_coeff[i]" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) or an equivalent thereof may specify the number of coefficients (i.e., the tap length) used by the interpolation filters in the i-th interpolation filter set.
Here again, in such an embodiment, the exemplary syntax element "interp_filter_coeff_abs[i][j][l]" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) or an equivalent thereof may specify the absolute value of the l-th coefficient of the j-th interpolation filter in the i-th interpolation filter set.
And again here, in such embodiments, the exemplary syntax element "interp_filter_coeff_sign[i][j][l]" (an example of which is highlighted in bold in Table 10, but which may take any form, label, idiom, or combination thereof, all of which are contemplated within the scope of the disclosed examples) or an equivalent thereof may specify the sign of the l-th coefficient of the j-th interpolation filter in the i-th interpolation filter set.
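Putting these elements together, a decoder could parse the coefficient payload roughly as follows; read_bits(n) and read_flag() are assumed bitstream-reader callbacks, and the fixed bit widths are placeholders since the disclosure does not fix each element's binarization:

def parse_interp_filter_coeff_set(read_bits, read_flag):
    """Parse a Table 10 style coefficient payload into nested lists."""
    num_sets = read_bits(8)                        # num_interp_filter_set
    coeff_shift = read_bits(4)                     # interp_filter_coeff_shift
    filter_sets = []
    for i in range(num_sets):
        num_filters = read_bits(8)                 # num_interp_filter[i]
        num_coeffs = read_bits(8)                  # num_interp_filter_coeff[i]
        filters = []
        for j in range(num_filters):
            coeffs = []
            for l in range(num_coeffs):
                mag = read_bits(8)                 # interp_filter_coeff_abs[i][j][l]
                neg = read_flag()                  # interp_filter_coeff_sign[i][j][l]
                coeffs.append(-mag if neg else mag)
            filters.append(coeffs)
        filter_sets.append(filters)
    return filter_sets, coeff_shift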
The disclosed syntax elements may be indicated in any high-level parameter set, such as the VPS, SPS, PPS, and slice segment header. It is also noted that additional syntax elements may be used at the sequence level, picture level, and/or CU level to assist in the selection of interpolation filters at the corresponding coding level. It is further noted that the disclosed flags may be replaced with variables capable of indicating the selected filter set. Note that in contemplated embodiments, any number (e.g., two, three, or more) of interpolation filter sets may be signaled in the bitstream.
Using the disclosed embodiments, any combination of interpolation filters may be used to interpolate pixels at fractional positions during the motion-compensated prediction process. For example, in an embodiment in which lossy encoding of a 4:4:4 video signal (in RGB or YCbCr format) is performed, default 8-tap filters may be used to generate the fractional pixels of the three color components (i.e., the Y, Cb, and Cr components in the YCbCr color space, and the R, G, and B components in the RGB color space). In another embodiment, in which lossless encoding of the video signal is performed, default 4-tap filters may be used to generate the fractional pixels of the three color components.
Fig. 11A is a diagram of an example communication system 100 in which one or more disclosed embodiments may be implemented. The communication system 100 may be a multiple access system that provides content, such as voice, data, video, messaging, broadcast, etc., to multiple wireless users. The communication system 100 may enable multiple wireless users to access such content through the sharing of system resources, including wireless bandwidth. For example, the communication system 100 may employ one or more channel access methods, such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA), Frequency Division Multiple Access (FDMA), Orthogonal FDMA (OFDMA), Single-Carrier FDMA (SC-FDMA), and the like.
As shown in fig. 11A, the communication system 100 may include wireless transmit/receive units (WTRUs) 102a, 102b, 102c, and/or 102d (which may be collectively or commonly referred to as WTRUs 102), radio Access Networks (RANs) 103/104/105, core networks 106/107/109, public Switched Telephone Networks (PSTN) 108, the internet 110, and other networks 112, although it should be understood that the disclosed systems and methods contemplate any number of WTRUs, base stations, networks, and/or network elements. Each of the WTRUs 102a, 102b, 102c, 102d may be any type of device configured to operate and/or communicate in a wireless environment. For example, the WTRUs 102a, 102b, 102c, 102d may be configured to transmit and/or receive wireless signals and may include User Equipment (UE), mobile stations, fixed or mobile subscriber units, pagers, cellular telephones, personal Digital Assistants (PDAs), smart phones, laptops, netbooks, personal computers, wireless sensors, consumer electronics, and the like.
Communication system 100 may also include a base station 114a and a base station 114b. Each of the base stations 114a, 114b may be any type of device configured to wirelessly interface with at least one of the WTRUs 102a, 102b, 102c, 102d to facilitate access to one or more communication networks, such as the core networks 106/107/109, the internet 110, and/or the networks 112. By way of example, the base stations 114a, 114b may be Base Transceiver Stations (BTSs), Node Bs, eNode Bs, Home Node Bs, Home eNode Bs, site controllers, Access Points (APs), wireless routers, and the like. While the base stations 114a, 114b are each depicted as a single element, it should be appreciated that the base stations 114a, 114b may include any number of interconnected base stations and/or network elements.
Base station 114a may be part of the RAN 103/104/105, and the RAN 103/104/105 may also include other base stations and/or network elements (not shown), such as a Base Station Controller (BSC), a Radio Network Controller (RNC), relay nodes, etc. Base station 114a and/or base station 114b may be configured to transmit and/or receive wireless signals within a particular geographic region, which may be referred to as a cell (not shown). A cell may further be divided into cell sectors. For example, the cell associated with base station 114a may be divided into three sectors. Thus, in one embodiment, the base station 114a may include three transceivers, i.e., one for each sector of the cell. In another embodiment, the base station 114a may employ multiple-input multiple-output (MIMO) technology and may therefore utilize multiple transceivers for each sector of the cell.
The base stations 114a, 114b may communicate with one or more of the WTRUs 102a, 102b, 102c, 102d over an air interface 115/116/117, which air interface 115/116/117 may be any suitable wireless communication link (e.g., radio Frequency (RF), microwave, infrared (IR), ultraviolet (UV), visible, etc.). Any suitable Radio Access Technology (RAT) may be used to establish the air interfaces 115/116/117.
More specifically, as described above, communication system 100 may be a multiple access system and may employ one or more channel access schemes, such as CDMA, TDMA, FDMA, OFDMA, SC-FDMA, and the like. For example, the base station 114a and the WTRUs 102a, 102b, 102c in the RAN 103/104/105 may implement a radio technology such as Universal Mobile Telecommunications System (UMTS) Terrestrial Radio Access (UTRA), which may use Wideband CDMA (WCDMA) to establish the air interfaces 115/116/117. WCDMA may include communication protocols such as High Speed Packet Access (HSPA) and/or evolved HSPA (HSPA+). HSPA may include High Speed Downlink Packet Access (HSDPA) and/or High Speed Uplink Packet Access (HSUPA).
In another embodiment, the base station 114a and the WTRUs 102a, 102b, 102c may implement a radio technology such as evolved UMTS terrestrial radio access (E-UTRA), which may use Long Term Evolution (LTE) and/or LTE-advanced (LTE-a) to establish the air interface 115/116/117.
In other embodiments, the base station 114a and the WTRUs 102a, 102b, 102c may implement radio technologies such as IEEE 802.16 (e.g., Worldwide Interoperability for Microwave Access (WiMAX)), CDMA2000 1X, CDMA2000 EV-DO, Interim Standard 2000 (IS-2000), Interim Standard 95 (IS-95), Interim Standard 856 (IS-856), Global System for Mobile communications (GSM), Enhanced Data rates for GSM Evolution (EDGE), GSM EDGE (GERAN), and the like. The base station 114b in fig. 11A may be, for example, a wireless router, Home Node B, Home eNode B, or access point, and may utilize any suitable RAT to facilitate wireless connectivity in a localized area, such as a place of business, a home, a vehicle, a campus, and the like. In one embodiment, the base station 114b and the WTRUs 102c, 102d may implement a radio technology such as IEEE 802.11 to establish a Wireless Local Area Network (WLAN). In another embodiment, the base station 114b and the WTRUs 102c, 102d may implement a radio technology such as IEEE 802.15 to establish a Wireless Personal Area Network (WPAN). In another embodiment, the base station 114b and the WTRUs 102c, 102d may utilize a cellular-based RAT (e.g., WCDMA, CDMA2000, GSM, LTE, LTE-A, etc.) to establish a picocell or femtocell. As shown in fig. 11A, the base station 114b may have a direct connection to the internet 110. Thus, the base station 114b may not need to access the internet 110 via the core network 106/107/109.
The RANs 103/104/105 may communicate with a core network 106/107/109, which core network 106/107/109 may be any type of network configured to provide voice, data, applications, and/or voice over internet protocol (VoIP) services to one or more of the WTRUs 102a, 102b, 102c, 102 d. For example, the core networks 106/107/109 may provide call control, billing services, mobile location based services, prepaid calls, internet connectivity, video distribution, etc., and/or perform advanced security functions such as user authentication. Although not shown in fig. 11A, it is to be appreciated that RANs 103/104/105 and/or core networks 106/107/109 can communicate directly or indirectly with other RANs that employ the same RAT as RAN 103/104/105 or a different RAT. For example, in addition to being connected to a RAN 103/104/105 that may utilize E-UTRA radio technology, the core network 106/107/109 may also communicate with another RAN (not shown) that employs GSM radio technology.
The core network 106/107/109 may also serve as a gateway for the WTRUs 102a, 102b, 102c, 102d to access the PSTN 108, the internet 110, and/or other networks 112. The PSTN 108 may include circuit-switched telephone networks that provide Plain Old Telephone Service (POTS). The internet 110 may include a global system of interconnected computer networks and devices that use common communication protocols, such as the Transmission Control Protocol (TCP), the User Datagram Protocol (UDP), and the Internet Protocol (IP) in the TCP/IP internet protocol suite. The networks 112 may include wired or wireless communication networks owned and/or operated by other service providers. For example, the networks 112 may include another core network connected to one or more RANs, which may employ the same RAT as the RAN 103/104/105 or a different RAT.
Some or all of the WTRUs 102a, 102b, 102c, 102d in the communication system 100 may include multi-mode capabilities, i.e., the WTRUs 102a, 102b, 102c, 102d may include multiple transceivers for communicating with different wireless networks over different wireless links. For example, the WTRU 102c shown in fig. 11A may be configured to communicate with the base station 114a, which may employ a cellular-based radio technology, and with the base station 114b, which may employ an IEEE 802 radio technology.
Fig. 11B is a system diagram of an example WTRU 102. As shown in fig. 11B, the WTRU 102 may include a processor 118, a transceiver 120, a transmit/receive element 122, a speaker/microphone 124, a keypad 126, a display/touch screen 128, a non-removable memory 130, a removable memory 132, a power source 134, a Global Positioning System (GPS) chipset 136, and other peripherals 138. It should be appreciated that the WTRU 102 may include any sub-combination of the foregoing elements while remaining consistent with an embodiment. Moreover, embodiments contemplate that base stations 114a and 114b, and/or the nodes that base stations 114a and 114b may represent (such as, but not limited to, a base transceiver station (BTS), a Node B, a site controller, an Access Point (AP), a Home Node B, an evolved Home Node B (eNodeB), a Home evolved Node B (HeNB), a Home evolved Node B gateway, a proxy node, and the like), may include some or all of the elements depicted in fig. 11B and described herein.
The processor 118 may be a general purpose processor, a special purpose processor, a conventional processor, a Digital Signal Processor (DSP), a plurality of microprocessors, one or more microprocessors in association with a DSP core, a controller, a microcontroller, application Specific Integrated Circuits (ASICs), field Programmable Gate Arrays (FPGAs) circuits, any other type of Integrated Circuit (IC), a state machine, or the like. The processor 118 may perform signal coding, data processing, power control, input/output processing, and/or any other functions that enable the WTRU 102 to operate in a wireless environment. The processor 118 may be coupled to a transceiver 120, and the transceiver 120 may be coupled to a transmit/receive element 122. Although fig. 11B depicts the processor 118 and the transceiver 120 as separate elements, it should be appreciated that the processor 118 and the transceiver 120 may be integrated together in an electronic package or chip.
The transmit/receive element 122 may be configured to transmit signals to and receive signals from a base station (e.g., base station 114 a) over the air interfaces 115/116/117. For example, in one embodiment, the transmit/receive element 122 may be an antenna configured to transmit and/or receive RF signals. In another embodiment, the transmit/receive element 122 may be an emitter/detector configured to transmit and/or receive, for example, IR, UV, or visible light signals. In another embodiment, the transmit/receive element 122 may be configured to transmit and receive both RF and optical signals. It should be appreciated that the transmit/receive element 122 may be configured to transmit and/or receive any combination of wireless signals.
In addition, although the transmit/receive element 122 is depicted as a single element in fig. 11B, the WTRU 102 may include any number of transmit/receive elements 122. More specifically, the WTRU 102 may employ MIMO technology. Thus, in one embodiment, the WTRU 102 may include two or more transmit/receive elements 122 (e.g., multiple antennas) for transmitting and receiving wireless signals over the air interfaces 115/116/117.
Transceiver 120 may be configured to modulate signals to be transmitted by transmit/receive element 122 and demodulate signals received by transmit/receive element 122. As described above, the WTRU 102 may have multi-mode capabilities. Thus, for example, the transceiver 120 may include multiple transceivers for enabling the WTRU 102 to communicate via multiple RATs, such as UTRA and IEEE 802.11.
The processor 118 of the WTRU 102 may be coupled to, and may receive user input data from, the speaker/microphone 124, the keypad 126, and/or the display/touch screen 128 (e.g., a Liquid Crystal Display (LCD) display unit or an Organic Light Emitting Diode (OLED) display unit). The processor 118 may also output user data to the speaker/microphone 124, the keypad 126, and/or the display/touch screen 128. In addition, the processor 118 may access information from, and store data in, any type of suitable memory (e.g., the non-removable memory 130 and/or the removable memory 132). The non-removable memory 130 may include Random Access Memory (RAM), Read-Only Memory (ROM), a hard disk, or any other type of memory storage device. The removable memory 132 may include a Subscriber Identity Module (SIM) card, a memory stick, a Secure Digital (SD) memory card, and the like. In other embodiments, the processor 118 may access information from, and store data in, memory that is not physically located on the WTRU 102, such as on a server or a home computer (not shown).
The processor 118 may receive power from the power source 134 and may be configured to distribute and/or control the power to the other elements in the WTRU 102. The power source 134 may be any suitable device for powering the WTRU 102. For example, the power source 134 may include one or more dry cell batteries (e.g., nickel-cadmium (NiCd), nickel-zinc (NiZn), nickel metal hydride (NiMH), lithium-ion (Li-ion), etc.), solar cells, fuel cells, and the like.
The processor 118 may also be coupled to a GPS chipset 136, and the GPS chipset 136 may be configured to provide location information (e.g., longitude and latitude) regarding the current location of the WTRU 102. In addition to or in lieu of information from the GPS chipset 136, the WTRU102 may receive location information from base stations (e.g., base stations 114a, 114 b) over the air interface 115/116/117 and/or determine its location based on the timing of signals received from two or more nearby base stations. It should be appreciated that the WTRU102 may acquire location information by any suitable location determination method while remaining consistent with an embodiment.
The processor 118 may be further coupled to other peripherals 138, which may include one or more software and/or hardware modules that provide additional features, functionality, and/or wired or wireless connectivity. For example, the peripherals 138 may include an accelerometer, an electronic compass, a satellite transceiver, a digital camera (for photographs or video), a Universal Serial Bus (USB) port, a vibration device, a television transceiver, a hands-free headset, a Bluetooth® module, a Frequency Modulation (FM) radio unit, a digital music player, a media player, a video game player module, an Internet browser, and the like.
Fig. 11C is a system diagram of the RAN 103 and the core network 106 according to an embodiment. As described above, the RAN 103 may communicate with the WTRUs 102a, 102b, 102c over the air interface 115 using a UTRA radio technology. The RAN 103 may also communicate with the core network 106. As shown in fig. 11C, the RAN 103 may include Node Bs 140a, 140b, 140c, each of which may include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 115. Each of the Node Bs 140a, 140b, 140c may be associated with a particular cell (not shown) within the RAN 103. The RAN 103 may also include RNCs 142a, 142b. It should be appreciated that the RAN 103 may include any number of Node Bs and RNCs while remaining consistent with an embodiment.
As shown in fig. 11C, the Node Bs 140a, 140b may communicate with the RNC 142a. In addition, the Node B 140c may communicate with the RNC 142b. The Node Bs 140a, 140b, 140c may communicate with the respective RNCs 142a, 142b via Iub interfaces. The RNCs 142a, 142b may communicate with each other over an Iur interface. Each of the RNCs 142a, 142b may be configured to control the respective Node Bs 140a, 140b, 140c to which it is connected. In addition, each of the RNCs 142a, 142b may be configured to perform or support other functions, such as outer loop power control, load control, admission control, packet scheduling, handover control, macrodiversity, security functions, data encryption, and the like.
The core network 106 shown in fig. 11C may include a Media Gateway (MGW) 144, a Mobile Switching Center (MSC) 146, a Serving GPRS Support Node (SGSN) 148, and/or a Gateway GPRS Support Node (GGSN) 150. Although each of the foregoing components are depicted as part of the core network 106, it should be understood that any of these components may be owned and/or operated by an entity other than the core network operator.
The RNC 142a in the RAN 103 may be connected to the MSC 146 in the core network 106 via an IuCS interface. The MSC 146 may be connected to the MGW 144. The MSC 146 and MGW 144 may provide the WTRUs 102a, 102b, 102c with access to a circuit switched network, such as the PSTN 108, to facilitate communications between the WTRUs 102a, 102b, 102c and legacy landline communication devices.
The RNC 142a in the RAN 103 may also be connected to the SGSN 148 in the core network 106 via an IuPS interface. SGSN 148 may be coupled to GGSN 150. The SGSN 148 and GGSN 150 may provide the WTRU 102a, 102b, 102c with access to a packet-switched network, such as the Internet 110, to facilitate communications between the WTRU 102a, 102b, 102c and the IP enabled devices.
As described above, the core network 106 may also be connected to the network 112, and the network 112 may include wired or wireless networks owned and/or operated by other service providers.
Fig. 11D is a system diagram of a RAN 104 and a core network 107 according to another embodiment. As described above, the RAN 104 may communicate with the WTRUs 102a, 102b, and 102c over the air interface 116 using an E-UTRA radio technology. RAN 104 may also communicate with core network 107.
The RAN 104 may include eNode Bs 160a, 160b, 160c, though it should be appreciated that the RAN 104 may include any number of eNode Bs while remaining consistent with an embodiment. Each of the eNode Bs 160a, 160b, 160c may include one or more transceivers for communicating with the WTRUs 102a, 102b, 102c over the air interface 116. In one embodiment, the eNode Bs 160a, 160b, 160c may implement MIMO technology. Thus, the eNode B 160a, for example, may use multiple antennas to transmit wireless signals to, and receive wireless signals from, the WTRU 102a.
Each of the eNode Bs 160a, 160b, 160c may be associated with a particular cell (not shown) and may be configured to handle radio resource management decisions, handover decisions, scheduling of users in the uplink and/or downlink, and the like. As shown in fig. 11D, the eNode Bs 160a, 160b, 160c may communicate with each other over an X2 interface.
The core network 107 shown in fig. 11D may include a Mobility Management Entity (MME) 162, a serving gateway 164, and a Packet Data Network (PDN) gateway 166. Although each of the foregoing elements are depicted as part of the core network 107, it should be appreciated that any of these elements may be owned and/or operated by other entities outside the core network operator.
The MME 162 may be connected to each of the eNode Bs 160a, 160b, 160c in the RAN 104 via an S1 interface and may serve as a control node. For example, the MME 162 may be responsible for authenticating users of the WTRUs 102a, 102b, 102c, bearer activation/deactivation, selection of a particular serving gateway during the initial attach of the WTRUs 102a, 102b, 102c, and the like. The MME 162 may also provide a control plane function for switching between the RAN 104 and other RANs (not shown) that employ other radio technologies, such as GSM or WCDMA.
The serving gateway 164 may be connected to each of the eNode Bs 160a, 160b, 160c in the RAN 104 via the S1 interface. The serving gateway 164 may generally route and forward user data packets to/from the WTRUs 102a, 102b, 102c. The serving gateway 164 may also perform other functions, such as anchoring user planes during inter-eNode B handovers, triggering paging when downlink data is available for the WTRUs 102a, 102b, 102c, managing and storing the contexts of the WTRUs 102a, 102b, 102c, and the like.
The serving gateway 164 may also be connected to a PDN gateway 166, and the PDN gateway 166 may provide the WTRUs 102a, 102b, 102c with access to a packet-switched network (e.g., the internet 110) to facilitate communication between the WTRUs 102a, 102b, 102c and IP-enabled devices.
The core network 107 may facilitate communication with other networks. For example, the core network 107 may provide the WTRUs 102a, 102b, 102c with access to a circuit-switched network (e.g., the PSTN 108) to facilitate communications between the WTRUs 102a, 102b, 102c and conventional landline communication devices. For example, the core network 107 may include, or be in communication with, an IP gateway (e.g., an IP Multimedia Subsystem (IMS) server) that acts as an interface between the core network 107 and the PSTN 108. In addition, the core network 107 may provide the WTRUs 102a, 102b, 102c with access to the network 112, which network 112 may include other wired or wireless networks owned and/or operated by other service providers.
Fig. 11E is a system diagram of the RAN 105 and the core network 109, according to one embodiment. The RAN 105 may be an Access Service Network (ASN) that employs IEEE 802.16 radio technology to communicate with WTRUs 102a, 102b, 102c over an air interface 117. As will be discussed further below, the communication links between the different functional entities of the WTRUs 102a, 102b, 102c, the RAN 105 and the core network 109 may be defined as reference points.
As shown in fig. 11E, the RAN 105 may include base stations 180a, 180b, 180c and an ASN gateway 182, but it should be understood that the RAN 105 may include any number of base stations and ASN gateways while remaining consistent with an embodiment. The base stations 180a, 180b, 180c may each be associated with a particular cell (not shown) in the RAN 105 and may each include one or more transceivers to communicate with the WTRUs 102a, 102b, 102c over the air interface 117. In one embodiment, the base stations 180a, 180b, 180c may implement MIMO technology. Thus, for example, the base station 180a may use multiple antennas to send and receive wireless signals to and from the WTRU 102 a. The base stations 180a, 180b, 180c may also provide mobility management functions such as handoff triggering, tunnel establishment, radio resource management, traffic classification, quality of service (QoS) policy enforcement, and so on. The ASN gateway 182 may act as a traffic aggregation point and may be responsible for paging, caching user profiles, routing to the core network 109, and so forth.
The air interface 117 between the WTRUs 102a, 102b, 102c and the RAN 105 may be defined as an R1 reference point implementing the IEEE 802.16 specification. In addition, each of the WTRUs 102a, 102b, 102c may establish a logical interface (not shown) with the core network 109. The logical interface between the WTRUs 102a, 102b, 102c and the core network 109 may be defined as an R2 reference point, which may be used for authentication, authorization, IP host configuration management, and/or mobility management.
The communication link between each of the base stations 180a, 180b, 180c may be defined as an R8 reference point, which may include protocols for facilitating WTRU handover and data transfer between the base stations. The communication links between the base stations 180a, 180b, 180c and the ASN gateway 182 may be defined as R6 reference points. The R6 reference point may include a protocol for facilitating mobility management based on mobility events associated with each of the WTRUs 102a, 102b, 102 c.
As shown in fig. 11E, the RAN 105 may be connected to a core network 109. The communication link between the RAN 105 and the core network 109 may be defined as an R3 reference point, which R3 reference point includes protocols for facilitating, for example, data transfer and mobility management capabilities. The core network 109 may include a mobile IP home agent (MIP-HA) 184, an authentication, authorization, accounting (AAA) server 186, and a gateway 188. Although each of the foregoing elements are depicted as part of the core network 109, it will be appreciated that any of these elements may be owned and/or operated by an entity other than the core network operator.
The MIP-HA 184 may be responsible for IP address management and may enable the WTRUs 102a, 102b, 102c to roam between different ASNs and/or different core networks. The MIP-HA 184 may provide the WTRUs 102a, 102b, 102c with access to a packet-switched network (e.g., the internet 110) to facilitate communications between the WTRUs 102a, 102b, 102c and IP-enabled devices. The AAA server 186 may be responsible for user authentication and for supporting user services. The gateway 188 may facilitate interworking with other networks. For example, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to a circuit-switched network (e.g., the PSTN 108) to facilitate communications between the WTRUs 102a, 102b, 102c and conventional landline communication devices. Further, the gateway 188 may provide the WTRUs 102a, 102b, 102c with access to the network 112, which may include other wired or wireless networks owned and/or operated by other service providers.
Although not shown in fig. 11E, it should be appreciated that the RAN 105 may be connected to other ASNs and that the core network 109 may be connected to other core networks. The communication link between the RAN 105 and the other ASNs may be defined as an R4 reference point, which may include protocols for coordinating the mobility of the WTRUs 102a, 102b, 102c between the RAN 105 and the other ASNs. The communication link between the core network 109 and the other core networks may be defined as an R5 reference point, which may include protocols for facilitating interworking between home core networks and visited core networks.
Although the features and elements are described above in particular combinations, those skilled in the art will appreciate that each feature or element can be used alone or in any combination with other features and elements. Furthermore, the methods described herein may be implemented in a computer program, software, or firmware incorporated into a computer readable medium for execution by a computer or processor. Examples of computer readable media include electronic signals (transmitted over a wired or wireless connection) and computer readable storage media. Examples of computer readable storage media include, but are not limited to, read-only memory (ROM), random-access memory (RAM), registers, caches, semiconductor memory devices, magnetic media (such as internal hard disks and removable disks), magneto-optical media, and optical media (such as CD-ROM disks and Digital Versatile Disks (DVDs)). A processor associated with the software may be used to implement a radio frequency transceiver for use in a WTRU, UE, terminal, base station, RNC, or any host.

Claims (21)

1. A method for decoding video content, the method comprising:
obtaining an adaptive color space conversion enabling indication configured to indicate whether adaptive color space conversion is allowed for a sequence of pictures;
Determining, based on the adaptive color space conversion enabling indication, that the adaptive color space conversion is allowed for the sequence of pictures;
determining that there is at least one non-zero coefficient in residual coefficients associated with the encoded block of a plurality of encoded blocks;
based on determining that the adaptive color space conversion is allowed to be used and that there is at least one non-zero coefficient in the residual coefficients associated with the coding block of the plurality of coding blocks, obtaining coding unit adaptive color space conversion indications for coding blocks of a plurality of coding blocks of the sequence of pictures, the plurality of coding blocks having different sizes, wherein the coding unit adaptive color space conversion indications are configured to indicate whether color space conversion is applied to the coding block of the plurality of coding blocks; and
the encoded blocks of the plurality of encoded blocks are decoded based on the coding unit adaptive color space conversion indication.
2. The method of claim 1, wherein the adaptive color space conversion enabling indication is obtained in a sequence parameter set.
3. The method of claim 1, the method further comprising:
Obtaining a non-zero residual coefficient flag associated with the encoded block of the plurality of encoded blocks, the non-zero residual coefficient flag indicating whether at least one non-zero coefficient is present among the residual coefficients associated with the encoded block of the plurality of encoded blocks, wherein the presence of at least one non-zero coefficient among residual coefficients associated with the encoded block of the plurality of encoded blocks is determined based on the non-zero residual coefficient flag associated with the encoded block.
4. The method of claim 3, wherein the non-zero residual coefficient flag comprises an indication that at least one non-zero coefficient is present among luma residual coefficients.
5. The method of claim 3, wherein the non-zero residual coefficient flag comprises an indication that at least one non-zero coefficient is present among chroma residual coefficients.
6. An apparatus for video encoding, comprising:
a processor, when executing the computer program stored in the memory, is configured to perform at least:
obtaining an adaptive color space conversion enabling indication configured to indicate whether adaptive color space conversion is allowed for a sequence of pictures;
Determining, based on the adaptive color space conversion enabling indication, that the adaptive color space conversion is allowed for the sequence of pictures;
determining that there is at least one non-zero coefficient in residual coefficients associated with the encoded block of a plurality of encoded blocks;
obtaining coding unit adaptive color space conversion indications for coding blocks of a plurality of coding blocks of the sequence of pictures based on determining that the adaptive color space conversion is allowed to be used and that there is at least one non-zero coefficient in residual coefficients associated with the coding blocks of the plurality of coding blocks, the plurality of coding blocks having different sizes, wherein the coding unit adaptive color space conversion indications are configured to indicate whether color space conversion is applied to the coding blocks of the plurality of coding blocks; and
the encoded blocks of the plurality of encoded blocks are decoded based on the coding unit adaptive color space conversion indication.
7. The apparatus of claim 6, wherein the adaptive color space conversion enabling indication is obtained in a sequence parameter set.
8. The apparatus of claim 6, wherein the processor is further configured to:
Obtaining a non-zero residual coefficient flag associated with the encoded block of the plurality of encoded blocks, the non-zero residual coefficient flag indicating whether at least one non-zero coefficient is present among residual coefficients associated with the encoded block of the plurality of encoded blocks, wherein the presence of at least one non-zero coefficient among residual coefficients associated with the encoded block of the plurality of encoded blocks is determined based on the non-zero residual coefficient flag associated with the encoded block.
9. The apparatus of claim 8, wherein the non-zero residual coefficient flag comprises an indication that at least one non-zero coefficient is present among luma residual coefficients.
10. The device of claim 8, wherein the non-zero residual coefficient flag comprises an indication that at least one non-zero coefficient is present among chroma residual coefficients.
11. A method for encoding video content, the method comprising:
obtaining residuals of a coding block of a plurality of coding blocks in a sequence of pictures, the plurality of coding blocks having different sizes;
determining that there is at least one non-zero coefficient in residual coefficients associated with the coding block of the plurality of coding blocks;
Determining whether to apply color space conversion to the residual of the encoded block based on a rate-distortion cost comparison; and
upon determining that the color space conversion is applied to the encoded blocks and that at least one non-zero coefficient is present in the residual coefficients associated with the encoded blocks of the plurality of encoded blocks, including in a bitstream an encoding unit adaptive color space conversion indication for the encoded blocks of the plurality of encoded blocks, the encoding unit adaptive color space conversion indication configured to indicate whether color space conversion is applied to the encoded blocks.
12. The method of claim 11, the method further comprising:
calculating rate-distortion costs associated with performing residual coding in the GBR color space; and
calculating rate distortion costs associated with performing residual coding in a YCgCo color space, wherein the determining that the color space conversion is applied to the coding block of the plurality of coding blocks is based on: the rate-distortion cost associated with performing residual coding in the YCgCo color space is lower than the rate-distortion cost associated with performing residual coding in the GBR color space.
13. An apparatus for video encoding, comprising:
a processor, when executing the computer program stored in the memory, is configured to perform at least:
obtaining residuals of a coding block of a plurality of coding blocks in a sequence of pictures, the plurality of coding blocks having different sizes;
determining that there is at least one non-zero coefficient in residual coefficients associated with the coding block of the plurality of coding blocks;
determining whether to apply color space conversion to the residual of the encoded block based on a rate-distortion cost comparison; and
upon determining that the color space conversion is applied to the encoded blocks and that there is at least one non-zero coefficient in the residual coefficients associated with the encoded blocks of the plurality of encoded blocks, including in a bitstream an encoding unit adaptive color space conversion indication for the encoded blocks of the plurality of encoded blocks, the encoding unit adaptive color space conversion indication configured to indicate whether color space conversion is applied to the encoded blocks of the plurality of encoded blocks.
14. The apparatus of claim 13, wherein the processor is further configured to:
calculating rate-distortion costs associated with performing residual coding in the GBR color space; and
Calculating rate distortion costs associated with performing residual coding in a YCgCo color space, wherein the determining that the color space conversion is applied to the coding block of the plurality of coding blocks is based on: the rate-distortion cost associated with performing residual coding in the YCgCo color space is lower than the rate-distortion cost associated with performing residual coding in the GBR color space.
15. A computer-readable storage medium comprising instructions stored therein, wherein one or more processors when executing the instructions perform at least the following:
obtaining an adaptive color space conversion enabling indication configured to indicate whether adaptive color space conversion is allowed for a sequence of pictures;
determining, based on the adaptive color space conversion enabling indication, that the adaptive color space conversion is allowed for the sequence of pictures;
determining that there is at least one non-zero coefficient in residual coefficients associated with a coding block of a plurality of coding blocks;
based on determining that the adaptive color space conversion is allowed and that there is at least one non-zero coefficient in the residual coefficients associated with the coding block of the plurality of coding blocks, obtaining a coding unit adaptive color space conversion indication for the coding block of the plurality of coding blocks of the sequence of pictures, the plurality of coding blocks having different sizes, wherein the coding unit adaptive color space conversion indication is configured to indicate whether color space conversion is applied to the coding block of the plurality of coding blocks; and
decoding the coding block of the plurality of coding blocks based on the coding unit adaptive color space conversion indication.
16. The computer-readable storage medium of claim 15, wherein the adaptive color space conversion enabling indication is obtained in a sequence parameter set.
17. The computer-readable storage medium of claim 15, further comprising instructions for causing the one or more processors to:
obtaining a non-zero residual coefficient flag associated with the coding block of the plurality of coding blocks, the non-zero residual coefficient flag indicating whether at least one non-zero coefficient is present among the residual coefficients associated with the coding block of the plurality of coding blocks, wherein the presence of at least one non-zero coefficient among the residual coefficients associated with the coding block of the plurality of coding blocks is determined based on the non-zero residual coefficient flag associated with the coding block.
18. The computer-readable storage medium of claim 17, wherein the non-zero residual coefficient flag comprises an indication of whether at least one non-zero coefficient is present among luma residual coefficients.
19. The computer-readable storage medium of claim 17, wherein the non-zero residual coefficient flag comprises an indication of whether at least one non-zero coefficient is present among chroma residual coefficients.
20. A computer-readable storage medium having instructions stored therein which, when executed by one or more processors, cause the one or more processors to perform at least the following:
obtaining residuals of a coding block of a plurality of coding blocks in a sequence of pictures, the plurality of coding blocks having different sizes;
determining that there is at least one non-zero coefficient in residual coefficients associated with the coding block of the plurality of coding blocks;
determining, based on a rate-distortion cost comparison, whether to apply color space conversion to the residuals of the coding block; and
upon determining that the color space conversion is applied to the coding block and that there is at least one non-zero coefficient in the residual coefficients associated with the coding block of the plurality of coding blocks, including in a bitstream a coding unit adaptive color space conversion indication for the coding block of the plurality of coding blocks, the coding unit adaptive color space conversion indication being configured to indicate whether color space conversion is applied to the coding block of the plurality of coding blocks.
21. The computer-readable storage medium of claim 20, further comprising instructions for causing the one or more processors to:
calculating a rate-distortion cost associated with performing residual coding in the GBR color space; and
calculating a rate-distortion cost associated with performing residual coding in a YCgCo color space, wherein the determining that the color space conversion is applied to the coding block of the plurality of coding blocks is based on the rate-distortion cost associated with performing residual coding in the YCgCo color space being lower than the rate-distortion cost associated with performing residual coding in the GBR color space.
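
Claims 12, 14, and 21 each recite an encoder-side comparison of rate-distortion costs between residual coding in the native GBR color space and in the YCgCo color space. A minimal sketch of that comparison in C, assuming the usual Lagrangian cost J = D + lambda*R; the function names and the way distortion and bit counts are obtained are hypothetical, since the claims do not fix how D and R are measured:

    #include <stdbool.h>

    /* Lagrangian rate-distortion cost: J = D + lambda * R. */
    static double rd_cost(double distortion, double bits, double lambda)
    {
        return distortion + lambda * bits;
    }

    /* True when residual coding should switch to the YCgCo space, i.e.
     * when its cost is lower than the cost in the native GBR space. */
    static bool select_ycgco(double dist_gbr, double bits_gbr,
                             double dist_ycgco, double bits_ycgco,
                             double lambda)
    {
        return rd_cost(dist_ycgco, bits_ycgco, lambda)
             < rd_cost(dist_gbr, bits_gbr, lambda);
    }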
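
Claims 13, 15, and 20 gate the coding unit adaptive color space conversion indication on two conditions: conversion is allowed for the sequence, and the coding block carries at least one non-zero residual coefficient (claims 17 to 19 refine the second condition into luma and chroma flags). A decoder-side sketch of that gating follows; the struct fields and the parse_flag() stub are hypothetical stand-ins for the real syntax parsing, although the two indications loosely mirror the residual_adaptive_colour_transform_enabled_flag and cu_residual_act_flag syntax elements of the HEVC screen content coding draft:

    #include <stdbool.h>

    typedef struct {
        bool act_enabled;   /* sequence-level enable, e.g. from an SPS */
    } SequenceParams;

    typedef struct {
        bool cbf_luma;      /* any non-zero luma residual coefficient */
        bool cbf_cb;        /* any non-zero Cb residual coefficient */
        bool cbf_cr;        /* any non-zero Cr residual coefficient */
        bool cu_act_flag;   /* conversion applied to this coding block */
    } CodingUnit;

    /* Hypothetical stand-in for the entropy decoder's flag parser. */
    static bool parse_flag(void) { return true; }

    static void parse_cu_act_flag(const SequenceParams *sps, CodingUnit *cu)
    {
        bool any_nonzero = cu->cbf_luma || cu->cbf_cb || cu->cbf_cr;
        /* The coding-unit-level indication is parsed only when conversion
         * is allowed for the sequence and the block has at least one
         * non-zero coefficient; otherwise it is inferred to be off. */
        cu->cu_act_flag = (sps->act_enabled && any_nonzero) ? parse_flag()
                                                            : false;
    }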
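
The claims identify the target space only as YCgCo. One well-known reversible GBR-to-YCgCo mapping, often called YCgCo-R and adopted for lossless coding in the HEVC screen content coding draft, can be written with integer lifting steps; it is shown here purely as an illustration (assuming arithmetic right shifts on signed integers), not as the conversion the patent mandates:

    /* Forward lifting steps: (G, B, R) -> (Y, Cg, Co). */
    static void gbr_to_ycgco_r(int g, int b, int r,
                               int *y, int *cg, int *co)
    {
        *co = r - b;
        int t = b + (*co >> 1);
        *cg = g - t;
        *y  = t + (*cg >> 1);
    }

    /* Inverse lifting steps: (Y, Cg, Co) -> (G, B, R); exactly undoes
     * the forward pass, so the round trip is lossless. */
    static void ycgco_r_to_gbr(int y, int cg, int co,
                               int *g, int *b, int *r)
    {
        int t = y - (cg >> 1);
        *g = cg + t;
        *b = t - (co >> 1);
        *r = *b + co;
    }

A quick round trip, e.g. (G, B, R) = (100, 50, 200) -> (Y, Cg, Co) = (112, -25, 150) -> (100, 50, 200), confirms reversibility.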
CN201911127826.3A 2014-03-14 2015-03-14 Method, apparatus and storage medium for encoding and decoding video content Active CN110971905B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911127826.3A CN110971905B (en) 2014-03-14 2015-03-14 Method, apparatus and storage medium for encoding and decoding video content

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
US201461953185P 2014-03-14 2014-03-14
US61/953,185 2014-03-14
US201461994071P 2014-05-15 2014-05-15
US61/994,071 2014-05-15
US201462040317P 2014-08-21 2014-08-21
US62/040,317 2014-08-21
PCT/US2015/020628 WO2015139010A1 (en) 2014-03-14 2015-03-14 Systems and methods for rgb video coding enhancement
CN201911127826.3A CN110971905B (en) 2014-03-14 2015-03-14 Method, apparatus and storage medium for encoding and decoding video content
CN201580014202.4A CN106233726B (en) 2014-03-14 2015-03-14 System and method for rgb video coding enhancing

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201580014202.4A Division CN106233726B (en) 2014-03-14 2015-03-14 System and method for rgb video coding enhancing

Publications (2)

Publication Number Publication Date
CN110971905A CN110971905A (en) 2020-04-07
CN110971905B true CN110971905B (en) 2023-11-17

Family

ID=52781307

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201911127826.3A Active CN110971905B (en) 2014-03-14 2015-03-14 Method, apparatus and storage medium for encoding and decoding video content
CN201580014202.4A Active CN106233726B (en) 2014-03-14 2015-03-14 System and method for rgb video coding enhancing

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201580014202.4A Active CN106233726B (en) 2014-03-14 2015-03-14 System and method for rgb video coding enhancing

Country Status (9)

Country Link
US (2) US20150264374A1 (en)
EP (1) EP3117612A1 (en)
JP (5) JP6368795B2 (en)
KR (4) KR101947151B1 (en)
CN (2) CN110971905B (en)
AU (1) AU2015228999B2 (en)
MX (1) MX356497B (en)
TW (1) TWI650006B (en)
WO (1) WO2015139010A1 (en)

Families Citing this family (41)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2784761T3 (en) * 2011-01-13 2020-09-30 Canon Kk Image encoding apparatus, image encoding method and program, and image decoding apparatus, image decoding method and program
WO2016051643A1 (en) * 2014-10-03 2016-04-07 日本電気株式会社 Video coding device, video decoding device, video coding method, video decoding method and program
GB2531004A (en) * 2014-10-06 2016-04-13 Canon Kk Residual colour transform signalled at sequence level for specific coding modes
US10045023B2 (en) * 2015-10-09 2018-08-07 Telefonaktiebolaget Lm Ericsson (Publ) Cross component prediction in video coding
JP6593122B2 (en) * 2015-11-20 2019-10-23 富士通株式会社 Moving picture coding apparatus, moving picture coding method, and program
US10341659B2 (en) * 2016-10-05 2019-07-02 Qualcomm Incorporated Systems and methods of switching interpolation filters
KR20190049197A (en) * 2017-11-01 2019-05-09 한국전자통신연구원 Method of upsampling based on maximum resolution image and compositing rgb image, and an apparatus operating the same
WO2019135636A1 (en) * 2018-01-05 2019-07-11 에스케이텔레콤 주식회사 Image coding/decoding method and apparatus using correlation in ycbcr
WO2020086317A1 (en) * 2018-10-23 2020-04-30 Tencent America LLC Method and apparatus for video coding
CN111385555A (en) * 2018-12-28 2020-07-07 上海天荷电子信息有限公司 Data compression method and device for inter-component prediction of original and/or residual data
CN109714600B (en) * 2019-01-12 2020-05-26 贵州佰仕佳信息工程有限公司 Compatible big data acquisition system
CN112673637B (en) 2019-03-12 2024-07-26 苹果公司 Method for encoding/decoding image signal and apparatus therefor
KR20210145749A (en) 2019-04-16 2021-12-02 베이징 바이트댄스 네트워크 테크놀로지 컴퍼니, 리미티드 Adaptive Loop Filtering for Video Coding
WO2020228833A1 (en) * 2019-05-16 2020-11-19 Beijing Bytedance Network Technology Co., Ltd. Adaptive resolution change in video coding
KR20220093398A (en) 2019-05-16 2022-07-05 엘지전자 주식회사 Image encoding/decoding method and device for signaling filter information on basis of chroma format, and method for transmitting bitstream
CN114041287A (en) 2019-06-21 2022-02-11 北京字节跳动网络技术有限公司 Adaptive in-loop color space conversion and selective use of other video codec tools
EP4014495A4 (en) 2019-09-14 2022-11-02 ByteDance Inc. Chroma quantization parameter in video coding
KR20220138031A (en) * 2019-09-23 2022-10-12 베이징 다지아 인터넷 인포메이션 테크놀로지 컴퍼니 리미티드 Methods and apparatus of video coding in 4:4:4 chroma format
US11682144B2 (en) 2019-10-06 2023-06-20 Tencent America LLC Techniques and apparatus for inter-channel prediction and transform for point-cloud attribute coding
WO2021072177A1 (en) 2019-10-09 2021-04-15 Bytedance Inc. Cross-component adaptive loop filtering in video coding
US11412235B2 (en) * 2019-10-10 2022-08-09 Tencent America LLC Color transform for video coding
KR20230117266A (en) 2019-10-11 2023-08-07 베이징 다지아 인터넷 인포메이션 테크놀로지 컴퍼니 리미티드 Methods and apparatus of video coding in 4:4:4 chroma format
JP2022552338A (en) 2019-10-14 2022-12-15 バイトダンス インコーポレイテッド Joint coding of chroma residuals and filtering in video processing
CN115152219A (en) 2019-11-07 2022-10-04 抖音视界有限公司 Quantization characteristics of adaptive in-loop color space transforms for video coding
JP7508558B2 (en) 2019-12-09 2024-07-01 バイトダンス インコーポレイテッド Using Quantization Groups in Video Coding
CN115004707A (en) 2019-12-19 2022-09-02 抖音视界(北京)有限公司 Interaction between adaptive color transform and quantization parameters
US11496755B2 (en) 2019-12-28 2022-11-08 Tencent America LLC Method and apparatus for video coding
CN114902657A (en) 2019-12-31 2022-08-12 字节跳动有限公司 Adaptive color transform in video coding and decoding
JP7444997B2 (en) * 2020-01-01 2024-03-06 バイトダンス インコーポレイテッド Cross-component adaptive loop filtering for video coding
CN115191118A (en) 2020-01-05 2022-10-14 抖音视界有限公司 Using adaptive color transform in video coding and decoding
CN114946187A (en) * 2020-01-08 2022-08-26 抖音视界(北京)有限公司 Joint coding and decoding of chroma residual and adaptive color transform
WO2021143896A1 (en) 2020-01-18 2021-07-22 Beijing Bytedance Network Technology Co., Ltd. Adaptive colour transform in image/video coding
WO2021155740A1 (en) * 2020-02-04 2021-08-12 Huawei Technologies Co., Ltd. An encoder, a decoder and corresponding methods about signaling high level syntax
CN115443653A (en) 2020-04-07 2022-12-06 抖音视界有限公司 Signaling of inter prediction in high level syntax
CN115428457A (en) 2020-04-09 2022-12-02 抖音视界有限公司 Constraint of adaptive parameter set based on color format
WO2021204251A1 (en) 2020-04-10 2021-10-14 Beijing Bytedance Network Technology Co., Ltd. Use of header syntax elements and adaptation parameter set
CN115868159A (en) 2020-04-17 2023-03-28 抖音视界有限公司 Presence of adaptive parameter set units
CN115769578A (en) * 2020-04-20 2023-03-07 抖音视界有限公司 Adaptive color transform in video coding and decoding
CN115486081A (en) 2020-04-26 2022-12-16 字节跳动有限公司 Conditional signaling of video codec syntax elements
CN115668958B (en) 2020-05-26 2024-09-20 杜比实验室特许公司 Picture metadata for variable frame rate video
CN115022627A (en) * 2022-07-01 2022-09-06 光线云(杭州)科技有限公司 High-compression-ratio lossless compression method and device for rendered intermediate images

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3906630B2 (en) * 2000-08-08 2007-04-18 ソニー株式会社 Image encoding apparatus and method, and image decoding apparatus and method
CN1214649C (en) * 2003-09-18 2005-08-10 中国科学院计算技术研究所 Entropy encoding method for encoding video predictive residual error coefficient
US8145002B2 (en) * 2007-06-28 2012-03-27 Mitsubishi Electric Corporation Image encoding device and image encoding method
KR101213704B1 (en) * 2007-12-05 2012-12-18 삼성전자주식회사 Method and apparatus for video coding and decoding based on variable color format
KR101517768B1 (en) * 2008-07-02 2015-05-06 삼성전자주식회사 Method and apparatus for encoding video and method and apparatus for decoding video
JP2011029690A (en) * 2009-07-21 2011-02-10 Nikon Corp Electronic camera and image encoding method
KR101457894B1 (en) * 2009-10-28 2014-11-05 삼성전자주식회사 Method and apparatus for encoding image, and method and apparatus for decoding image
MX2018013536A (en) * 2011-02-10 2021-06-23 Velos Media Int Ltd Image processing device and image processing method.
TWI538474B (en) * 2011-03-15 2016-06-11 杜比實驗室特許公司 Methods and apparatus for image data transformation
JP2013131928A (en) * 2011-12-21 2013-07-04 Toshiba Corp Image encoding device and image encoding method
US9451252B2 (en) * 2012-01-14 2016-09-20 Qualcomm Incorporated Coding parameter sets and NAL unit headers for video coding
US9380289B2 (en) * 2012-07-20 2016-06-28 Qualcomm Incorporated Parameter sets in video coding
JP6111556B2 (en) * 2012-08-10 2017-04-12 富士通株式会社 Moving picture re-encoding device, method and program
AU2012232992A1 (en) * 2012-09-28 2014-04-17 Canon Kabushiki Kaisha Method, apparatus and system for encoding and decoding the transform units of a coding unit
US9883180B2 (en) * 2012-10-03 2018-01-30 Avago Technologies General Ip (Singapore) Pte. Ltd. Bounded rate near-lossless and lossless image compression
US10708588B2 (en) * 2013-06-19 2020-07-07 Apple Inc. Sample adaptive offset control
US20140376611A1 (en) * 2013-06-21 2014-12-25 Qualcomm Incorporated Adaptive color transforms for video coding
CN103347170A (en) * 2013-06-27 2013-10-09 郑永春 Image processing method used for intelligent monitoring and high-resolution camera applied in image processing method
US10271052B2 (en) * 2014-03-14 2019-04-23 Qualcomm Incorporated Universal color-space inverse transform coding
US10455231B2 (en) * 2014-09-30 2019-10-22 Hfi Innovation Inc. Method of adaptive motion vector resolution for video coding

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1829326A (en) * 2005-03-04 2006-09-06 三星电子株式会社 Color space scalable video coding and decoding method and apparatus for the same
CN101356825A (en) * 2006-01-13 2009-01-28 弗劳恩霍夫应用研究促进协会 Picture coding using adaptive colour space transformation
CN101090503A (en) * 2007-07-05 2007-12-19 北京中星微电子有限公司 Entropy code control method and circuit

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Detlev Marpe et al., "MB-adaptive residual colour transform for 4:4:4 coding", Joint Video Team (JVT) of ISO/IEC MPEG & ITU-T VCEG (ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6), 2006, pp. 1-11. *
Wang Rixia et al., "Macroblock-adaptive fast mode selection algorithm for H.264", Video Engineering (电视技术), 2009, Vol. 33, No. 2, pp. 14-17. *

Also Published As

Publication number Publication date
WO2015139010A8 (en) 2015-12-10
CN106233726A (en) 2016-12-14
TWI650006B (en) 2019-02-01
CN106233726B (en) 2019-11-26
US20210274203A1 (en) 2021-09-02
JP6368795B2 (en) 2018-08-01
KR20160132990A (en) 2016-11-21
KR20210054053A (en) 2021-05-12
TW201540053A (en) 2015-10-16
JP2024029087A (en) 2024-03-05
WO2015139010A1 (en) 2015-09-17
US20150264374A1 (en) 2015-09-17
KR20190015635A (en) 2019-02-13
CN110971905A (en) 2020-04-07
AU2015228999A1 (en) 2016-10-06
JP2022046475A (en) 2022-03-23
KR102391123B1 (en) 2022-04-27
JP6684867B2 (en) 2020-04-22
MX356497B (en) 2018-05-31
AU2015228999B2 (en) 2018-02-01
KR20200014945A (en) 2020-02-11
EP3117612A1 (en) 2017-01-18
JP2020115661A (en) 2020-07-30
JP2017513335A (en) 2017-05-25
JP2018186547A (en) 2018-11-22
JP7485645B2 (en) 2024-05-16
KR101947151B1 (en) 2019-05-10
MX2016011861A (en) 2017-04-27
KR102073930B1 (en) 2020-02-06

Similar Documents

Publication Publication Date Title
CN110971905B (en) Method, apparatus and storage medium for encoding and decoding video content
US20220329831A1 (en) Enhanced chroma coding using cross plane filtering
CN107079157B (en) Inter-component decorrelation for video coding
TWI735424B (en) Escape color coding for palette coding mode
US10484686B2 (en) Palette coding modes and palette flipping
JP6367963B2 (en) Palette coding for screen content coding
CN114666582A (en) Systems, devices, and methods for inter-frame prediction refinement with optical flow

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240812

Address after: Delaware, USA

Patentee after: InterDigital VC Holdings, Inc.

Country or region after: U.S.A.

Address before: Delaware, USA

Patentee before: VID SCALE, Inc.

Country or region before: U.S.A.