WO2014066182A1 - Band separation filtering / inverse filtering for frame packing / unpacking higher-resolution chroma sampling formats - Google Patents

Info

Publication number
WO2014066182A1
WO2014066182A1 PCT/US2013/065754 US2013065754W WO2014066182A1 WO 2014066182 A1 WO2014066182 A1 WO 2014066182A1 US 2013065754 W US2013065754 W US 2013065754W WO 2014066182 A1 WO2014066182 A1 WO 2014066182A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
frames
format
yuv
filtering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/US2013/065754
Other languages
English (en)
French (fr)
Inventor
Gary J. Sullivan
Henrique Sarmento Malvar
Yongjun Wu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Corp
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to EP13789089.3A (EP2910023A1)
Priority to KR1020157010297A (KR20150079624A)
Priority to JP2015538079A (JP2016500965A)
Priority to CN201380055253.2A (CN104982036B)
Publication of WO2014066182A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/46: Embedding additional information in the video signal during the compression process
    • H04N19/85: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • H04N19/88: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression involving rearrangement of data among different coding units, e.g. shuffling, interleaving, scrambling or permutation of pixel data or permutation of transform coefficient data among different blocks

Definitions

  • Engineers use compression (also called source coding or source encoding) to reduce the bit rate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bit rate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form.
  • a "codec" is an encoder/decoder system.
  • a video codec standard typically defines options for the syntax of an encoded video bitstream, detailing parameters in the bitstream when particular features are used in encoding and decoding. In many cases, a video codec standard also provides details about the decoding operations a decoder should perform to achieve conformant results in decoding. Aside from codec standards, various proprietary codec formats define other options for the syntax of an encoded video bitstream and corresponding decoding operations.
  • a video source such as a camera, animation output, screen capture module, etc. typically provides video that is converted to a format such as a YUV 4:4:4 chroma sampling format.
  • a YUV format includes a luma (or Y) component with sample values representing approximate brightness values as well as multiple chroma (or U and V) components with sample values representing color difference values.
  • in a YUV 4:4:4 format, chroma information is represented at the same spatial resolution as luma information.
  • a YUV 4:2:0 format is a format that sub-samples chroma information compared to a YUV 4:4:4 format, so that chroma resolution is half that of luma resolution both horizontally and vertically.
  • the decision to use a YUV 4:2:0 format for encoding/decoding is premised on the understanding that, for most use cases such as encoding/decoding of natural camera-captured video content, viewers do not ordinarily notice many visual differences between video encoded/decoded in a YUV 4:2:0 format and video encoded/decoded in a YUV 4:4:4 format.
  • the compression advantages of the YUV 4:2:0 format, which has fewer samples per frame, are therefore compelling. There are some use cases, however, for which video has richer color information and higher color fidelity may be justified.
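  • The sample-count difference between the two formats can be made concrete with a little arithmetic. The following is a minimal sketch; the function name `samples_per_frame` and the 1920x1080 frame size are illustrative assumptions, not taken from the description.

```python
def samples_per_frame(width, height, chroma_format):
    """Return the total number of samples (Y + U + V) in one frame."""
    luma = width * height
    # (horizontal divisor, vertical divisor) applied to each chroma plane
    divisors = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}
    dx, dy = divisors[chroma_format]
    chroma_plane = (width // dx) * (height // dy)
    return luma + 2 * chroma_plane


# Assumed example resolution; any even width and height behaves the same way.
full = samples_per_frame(1920, 1080, "4:4:4")
sub = samples_per_frame(1920, 1080, "4:2:0")
print(full, sub, sub / full)
```

  • Under these assumptions a 4:2:0 frame carries half as many samples as the 4:4:4 frame of the same dimensions.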
  • in such use cases, differences between YUV 4:4:4 and YUV 4:2:0 chroma sampling formats are more easily perceived by viewers.
  • for such use cases, a 4:4:4 format may be preferable to a 4:2:0 format.
  • although screen capture codecs supporting encoding and decoding in a 4:4:4 format are available, the lack of widespread support for codecs supporting 4:4:4 formats (especially with respect to hardware codec implementations) is a hindrance for these use cases.
  • the detailed description presents innovations in frame packing of video frames of a higher-resolution chroma sampling format into video frames of a lower-resolution chroma sampling format for purposes of encoding.
  • the higher-resolution chroma sampling format is a YUV 4:4:4 format
  • the lower-resolution chroma sampling format is a YUV 4:2:0 format.
  • the video frames of the lower-resolution chroma sampling format can be unpacked to reconstruct the video frames of the higher-resolution chroma sampling format.
  • available encoders and decoders operating at the lower-resolution chroma sampling format can be used, while still retaining higher resolution chroma information.
  • a computing device packs one or more frames of a higher-resolution chroma sampling format into one or more frames of a lower-resolution chroma sampling format. As part of the packing, the computing device performs wavelet decomposition (or other band separation filtering) on sample values of chroma components of the frame(s) of the higher-resolution chroma sampling format to produce sample values of multiple bands.
  • the computing device assigns the sample values of the multiple bands to parts of the frame(s) of the lower-resolution chroma sampling format.
  • a computing device unpacks one or more frames of a lower-resolution chroma sampling format into one or more frames of a higher-resolution chroma sampling format. As part of the unpacking, the computing device assigns parts of the frame(s) of the lower-resolution chroma sampling format to sample values of multiple bands. The computing device then performs wavelet reconstruction (or other inverse band separation filtering) on the sample values of the multiple bands to produce sample values of chroma components of the frame(s) of the higher-resolution chroma sampling format.
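  • The decomposition and reconstruction steps can be sketched as follows. This is a minimal illustration, assuming an integer Haar filter in lifting form (the description speaks only of wavelet decomposition or other band separation filtering in general) and a three-band split with vertical filtering followed by horizontal filtering of the vertical low band. All function names are hypothetical.

```python
def haar_split(seq):
    """Split an even-length integer sequence into low-pass and high-pass bands."""
    pairs = list(zip(seq[0::2], seq[1::2]))
    low = [(a + b) >> 1 for a, b in pairs]   # floor of the pair average
    high = [a - b for a, b in pairs]         # pair difference
    return low, high


def haar_merge(low, high):
    """Invert haar_split exactly (integer lifting, no rounding loss)."""
    out = []
    for l, h in zip(low, high):
        a = l + ((h + 1) >> 1)
        out.extend([a, a - h])
    return out


def transpose(plane):
    return [list(col) for col in zip(*plane)]


def decompose(plane):
    """Vertical filtering, then horizontal filtering of the vertical low band."""
    cols = [haar_split(col) for col in transpose(plane)]
    v_low = transpose([c[0] for c in cols])    # H/2 rows of W samples
    v_high = transpose([c[1] for c in cols])   # H/2 rows of W samples
    rows = [haar_split(row) for row in v_low]
    ll = [r[0] for r in rows]                  # H/2 rows of W/2 samples
    lh = [r[1] for r in rows]                  # H/2 rows of W/2 samples
    return ll, lh, v_high


def reconstruct(ll, lh, v_high):
    """Inverse of decompose: horizontal merge, then vertical merge."""
    v_low = [haar_merge(l, h) for l, h in zip(ll, lh)]
    cols = [haar_merge(l, h)
            for l, h in zip(transpose(v_low), transpose(v_high))]
    return transpose(cols)


# Hypothetical 4x4 chroma plane of a 4:4:4 frame.
u_plane = [[16, 18, 200, 202],
           [16, 20, 198, 202],
           [90, 90, 91, 93],
           [88, 92, 91, 95]]
ll, lh, v_high = decompose(u_plane)
restored = reconstruct(ll, lh, v_high)
```

  • In this sketch the low-low band has the dimensions of a 4:2:0 chroma plane, and the remaining two bands hold the detail that a second frame would have to carry. The lifting form is chosen so that the round trip is exact in integer arithmetic.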
  • the packing or unpacking can be implemented as part of a method, as part of a computing device adapted to perform the method or as part of tangible computer-readable media storing computer-executable instructions for causing a computing device to perform the method.
  • Figure 1 is a diagram of an example computing system in which some described embodiments can be implemented.
  • Figures 2a and 2b are diagrams of example network environments in which some described embodiments can be implemented.
  • Figure 3 is a diagram of a generalized frame packing/unpacking system in which some described embodiments can be implemented.
  • Figure 4 is a diagram of an example encoder system in conjunction with which some described embodiments can be implemented.
  • Figure 5 is a diagram of an example decoder system in conjunction with which some described embodiments can be implemented.
  • Figure 6 is a diagram illustrating an example video encoder in conjunction with which some described embodiments can be implemented.
  • Figure 7 is a diagram illustrating an example video decoder in conjunction with which some described embodiments can be implemented.
  • Figure 8 is a diagram illustrating an example approach to frame packing that uses spatial partitioning of frames.
  • Figure 9 is a diagram illustrating an example approach to frame packing in which every second row of chroma component planes of frames of a higher-resolution chroma sampling format is copied.
  • Figure 10 is a diagram illustrating example frames packed according to the approach of Figure 9.
  • Figure 11 is a diagram illustrating an example approach to frame packing in which every second column of chroma component planes of frames of a higher-resolution chroma sampling format is copied.
  • Figure 12 is a flow chart illustrating a generalized technique for frame packing for frames of a higher-resolution chroma sampling format.
  • Figure 13 is a flow chart illustrating a generalized technique for frame unpacking for frames of a higher-resolution chroma sampling format.
  • Figure 14 is a diagram illustrating an example approach to frame packing with vertical then horizontal filtering in three-band wavelet decomposition of chroma component planes of frames of a higher-resolution chroma sampling format.
  • Figure 15 is a diagram illustrating an example approach to frame packing with horizontal then vertical filtering in three-band wavelet decomposition of chroma component planes of frames of a higher-resolution chroma sampling format.
  • Figures 16a and 16b are diagrams illustrating example approaches to frame packing with vertical then horizontal then vertical filtering in four-band wavelet decomposition of chroma component planes of frames of a higher-resolution chroma sampling format.
  • Figure 17 is a flow chart illustrating a generalized technique for frame packing for frames of a higher-resolution chroma sampling format, where the frame packing includes wavelet decomposition or other band separation filtering.
  • Figure 18 is a flow chart illustrating a generalized technique for frame unpacking for frames of a higher-resolution chroma sampling format, where the frame unpacking includes wavelet reconstruction or other inverse band separation filtering.
  • a video source such as a camera, animation output, screen capture module, etc. typically provides video that is converted to a format such as a YUV 4:4:4 chroma sampling format (an example of a 4:4:4 format, more generally).
  • a YUV format includes a luma (or Y) component with sample values representing approximate brightness values as well as multiple chroma (or U and V) components with sample values representing color difference values.
  • YUV indicates any color space with a luma (or luminance) component and one or more chroma (or chrominance) components, including Y'UV, YIQ, Y'IQ and YDbDr as well as variations such as YCbCr and YCoCg.
  • the component signal measures that are used may be adjusted through the application of a non-linear transfer characteristics function (generally known as "gamma pre-compensation" and often denoted by the use of a prime symbol, although the prime symbol is often omitted for typographical convenience).
  • the component signal measures may be in a domain that has a linear relationship with light amplitude.
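  • As one concrete example of a non-linear transfer characteristic, the sketch below implements the opto-electronic transfer function of ITU-R BT.709. The choice of BT.709 is an assumption made for illustration; other standards define different curves, and the function name is hypothetical.

```python
def bt709_oetf(linear):
    """Map linear light in [0, 1] to a non-linear (primed) signal value."""
    if linear < 0.018:
        return 4.5 * linear                    # linear segment near black
    return 1.099 * linear ** 0.45 - 0.099      # power-law segment


for level in (0.0, 0.01, 0.18, 1.0):
    print(level, round(bt709_oetf(level), 4))
```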
  • the luma and chroma component signals may be well aligned with the perception of brightness and color for the human visual system, or the luma and chroma component signals may somewhat deviate from such measures (e.g., as in the YCoCg variation, in which formulas are applied that simplify the computation of the color component values).
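  • The simplification offered by the YCoCg variation can be seen in the commonly published form of its conversion formulas, sketched below. The formulas come from general knowledge of YCoCg rather than from this description, and the function names are hypothetical.

```python
def rgb_to_ycocg(r, g, b):
    """Forward transform: only halving, quartering, additions and subtractions."""
    y = r / 4 + g / 2 + b / 4
    co = r / 2 - b / 2
    cg = -r / 4 + g / 2 - b / 4
    return y, co, cg


def ycocg_to_rgb(y, co, cg):
    """Inverse transform: three additions/subtractions and one temporary."""
    tmp = y - cg
    return tmp + co, y + cg, tmp - co


pixel = (255, 128, 0)
restored_pixel = ycocg_to_rgb(*rgb_to_ycocg(*pixel))
```

  • For integer inputs the intermediate values are multiples of 0.25, so the floating-point round trip in this sketch is exact.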
  • YUV formats as described herein include those described in the international standards known as ITU-R BT.601, ITU-R BT.709, and ITU-R BT.2020.
  • Examples of chroma sample types are shown in Figure E-1 of the H.264/AVC standard.
  • a 4:4:4 format can be a YUV 4:4:4 format or format for another color space, such as RGB or GBR.
  • YUV 4:2:0 is a format that sub-samples chroma information compared to a YUV 4:4:4 format, which preserves full-resolution chroma information (that is, chroma information is represented at the same resolution as luma information).
  • the decision to use a YUV 4:2:0 format for encoding/decoding is premised on the understanding that, for most use cases such as encoding/decoding of natural camera-captured video content, viewers do not ordinarily notice many visual differences between video encoded/decoded in a YUV 4:2:0 format and video encoded/decoded in a YUV 4:4:4 format.
  • the compression advantages of the YUV 4:2:0 format, which has fewer samples per frame, are therefore compelling.
  • the detailed description presents various approaches to wavelet decomposition (or other band separation filtering) for packing frames of a higher-resolution chroma sampling format into frames of a lower-resolution chroma sampling format.
  • the frames of the lower-resolution chroma sampling format can then be encoded using an encoder designed for the lower-resolution chroma sampling format.
  • the frames at the lower-resolution chroma sampling format can be output for further processing and display.
  • the frames of the higher-resolution chroma sampling format can be recovered through frame unpacking that includes wavelet reconstruction (or other inverse band separation filtering), for output and display.
  • these approaches alleviate the shortcomings of the prior approaches by preserving chroma information from the frames in the higher-resolution chroma sampling format, while leveraging codecs adapted for the lower-resolution chroma sampling format.
  • widely available codecs with specialized, dedicated hardware can provide faster encoding/decoding with lower power consumption for YUV 4:4:4 video frames packed into YUV 4:2:0 video frames.
  • the described approaches can be used to preserve chroma information for frames of one chroma sampling format when encoding/decoding uses another chroma sampling format.
  • Some examples described herein involve frame packing/unpacking of frames of a YUV 4:4:4 format for encoding/decoding using a codec adapted for a YUV 4:2:0 format.
  • Other examples described herein involve frame packing/unpacking of frames of a YUV 4:2:2 format for encoding/decoding using a codec adapted for a YUV 4:2:0 format.
  • the described approaches can be used for other chroma sampling formats.
  • the described approaches can be used for color spaces such as RGB, GBR, etc. in sampling ratios such as 4:4:4, 4:2:2, 4:2:0, 4:1:1, 4:0:0, etc. as the chroma sampling formats.
  • specific aspects of the innovations described herein include, but are not limited to, the following: Packing a 4:4:4 frame into two 4:2:0 frames, where the packing includes wavelet decomposition (or other band separation filtering), and encoding the two 4:2:0 frames using a video encoder designed for 4:2:0 format.
  • Pre-processing and post-processing operations (such as wavelet decomposition/reconstruction or other band separation filtering/inverse filtering) that can improve the quality of the final displayed frame for a YUV format when only one 4:2:0 frame (out of the two 4:2:0 frames) is used for final display.
  • the 4:2:0 frames can have a higher bit depth for encoding/decoding, so as to avoid loss of chroma information in pre-processing and post-processing operations.
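  • Why extra bit depth can matter is easy to check numerically. The sketch below assumes a Haar-style filter pair (pair average and pair difference), which is only one possible band separation filter, and computes how many bits each resulting band needs. The function names are hypothetical.

```python
def band_ranges(bit_depth):
    """Value ranges of the average band and the difference band."""
    max_val = (1 << bit_depth) - 1
    low_range = (0, max_val)           # floor((a + b) / 2) stays in input range
    high_range = (-max_val, max_val)   # a - b spans almost twice the range
    return low_range, high_range


def bits_needed(value_range):
    """Smallest number of bits that can index every value in the range."""
    lo, hi = value_range
    return (hi - lo).bit_length()


low_range, high_range = band_ranges(8)
print(bits_needed(low_range), bits_needed(high_range))
```

  • With 8-bit input samples the difference band in this sketch spans 511 distinct values, one bit more than an 8-bit sample can hold without clipping, scaling or an offset scheme.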
  • the frame packing arrangement SEI message is extended to support representing 4:4:4 content in a nominally 4:2:0 bitstream.
  • one constituent frame (e.g., in a top-bottom packing or alternating-frame coding scheme)
  • because YUV 4:2:0 is the most widely supported format in products (especially with respect to hardware codec implementations), having an effective way of conveying YUV 4:4:4 content through such decoders can provide the substantial benefit of enabling widespread near-term deployment of YUV 4:4:4 capabilities (especially for screen content coding).
  • the samples of a 4:4:4 frame are packed into two 4:2:0 frames, and the two 4:2:0 frames are encoded as the constituent frames of a frame packing arrangement.
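  • A minimal sketch of such a packing, without any filtering, is given below. The layout is an illustrative assumption and not necessarily the arrangement shown in the figures: the main frame keeps Y plus the chroma samples at even rows and even columns, and the auxiliary frame stores the odd chroma rows in its luma plane and the remaining even-row, odd-column chroma samples in its chroma planes. Function and key names are hypothetical.

```python
def pack_444_to_two_420(y, u, v):
    """Pack one 4:4:4 frame (lists of rows) into two 4:2:0 frames."""
    main = {"Y": y,
            "U": [row[0::2] for row in u[0::2]],
            "V": [row[0::2] for row in v[0::2]]}
    aux = {"Y": u[1::2] + v[1::2],                 # odd rows of U, then of V
           "U": [row[1::2] for row in u[0::2]],
           "V": [row[1::2] for row in v[0::2]]}
    return main, aux


def unpack_two_420_to_444(main, aux):
    """Invert pack_444_to_two_420."""
    half = len(aux["Y"]) // 2

    def rebuild(even_even, even_odd, odd_rows):
        plane = []
        for ee, eo, odd in zip(even_even, even_odd, odd_rows):
            even = [s for pair in zip(ee, eo) for s in pair]  # re-interleave
            plane.extend([even, odd])
        return plane

    u = rebuild(main["U"], aux["U"], aux["Y"][:half])
    v = rebuild(main["V"], aux["V"], aux["Y"][half:])
    return main["Y"], u, v


# Hypothetical 4x4 planes with distinct values per position.
y_plane = [[r * 4 + c for c in range(4)] for r in range(4)]
u_plane = [[100 + r * 4 + c for c in range(4)] for r in range(4)]
v_plane = [[200 + r * 4 + c for c in range(4)] for r in range(4)]
main, aux = pack_444_to_two_420(y_plane, u_plane, v_plane)
```

  • Both output frames have a full-size luma plane and quarter-size chroma planes, so each can be handed to a 4:2:0 encoder, and the main frame alone is a viewable 4:2:0 version of the input.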
  • the semantics of the content interpretation type syntax element are extended to signal this usage.
  • the content interpretation type syntax element signals how to interpret the data that are packed using a packing arrangement, and the frame configuration for the packing arrangement is signaled with a different syntax element.
  • innovations described herein are illustrated with reference to syntax elements and operations specific to the HEVC standard. For example, reference is made to the draft version JCTVC-I1003 of the HEVC standard - "High efficiency video coding (HEVC) text specification draft 7," JCTVC-I1003_d5, 9th meeting, Geneva, April 2012.
  • the innovations described herein can also be implemented for other standards or formats. For example, innovations described herein can be implemented for the H.264/AVC standard using frame packing arrangement SEI messages.
  • any of the methods described herein can be altered by changing the ordering of the method acts described, by splitting, repeating, or omitting certain method acts, etc.
  • the various aspects of the disclosed technology can be used in combination or separately.
  • Different embodiments use one or more of the described innovations.
  • Figure 1 illustrates a generalized example of a suitable computing system (100) in which several of the described innovations may be implemented.
  • the computing system (100) is not intended to suggest any limitation as to scope of use or functionality, as the innovations may be implemented in diverse general-purpose or special-purpose computing systems.
  • the computing system (100) includes one or more processing units (110, 115) and memory (120, 125). In Figure 1, this most basic configuration (130) is included within a dashed line.
  • the processing units (110, 115) execute computer-executable instructions.
  • a processing unit can be a general-purpose central processing unit ("CPU"), processor in an application-specific integrated circuit or any other type of processor. In a multi-processing system, multiple processing units execute computer-executable instructions to increase processing power. For example, Figure 1 shows a central processing unit (110) as well as a graphics processing unit or coprocessing unit (115).
  • the tangible memory (120, 125) may be volatile memory (e.g., registers, cache, RAM), non-volatile memory (e.g., ROM, EEPROM, flash memory, etc.), or some combination of the two, accessible by the processing unit(s).
  • the memory (120, 125) stores software (180) implementing one or more innovations for frame packing with wavelet decomposition (or other band separation filtering) and/or unpacking with wavelet reconstruction (or other inverse band separation filtering) for higher-resolution chroma sampling formats, in the form of computer-executable instructions suitable for execution by the processing unit(s).
  • a computing system may have additional features.
  • the computing system (100) includes storage (140), one or more input devices (150), one or more output devices (160), and one or more communication connections (170).
  • An interconnection mechanism (not shown) such as a bus, controller, or network interconnects the components of the computing system (100).
  • operating system software provides an operating environment for other software executing in the computing system (100), and coordinates activities of the components of the computing system (100).
  • the tangible storage (140) may be removable or non-removable, and includes magnetic disks, magnetic tapes or cassettes, CD-ROMs, DVDs, or any other medium which can be used to store information in a non-transitory way and which can be accessed within the computing system (100).
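  • the storage (140) stores instructions for the software (180) implementing one or more innovations for frame packing with wavelet decomposition (or other band separation filtering) and/or unpacking with wavelet reconstruction (or other inverse band separation filtering) for higher-resolution chroma sampling formats.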
  • the input device(s) (150) may be a touch input device such as a keyboard, mouse, pen, or trackball, a voice input device, a scanning device, or another device that provides input to the computing system (100).
  • the input device(s) (150) may be a camera, video card, TV tuner card, or similar device that accepts video input in analog or digital form, or a CD-ROM or CD-RW that reads video samples into the computing system (100).
  • the output device(s) (160) may be a display, printer, speaker, CD-writer, or another device that provides output from the computing system (100).
  • the communication connection(s) (170) enable communication over a communication medium to another computing entity.
  • the communication medium conveys information such as computer-executable instructions, audio or video input or output, or other data in a modulated data signal.
  • a modulated data signal is a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media can use an electrical, optical, RF, or other carrier.
  • Computer-readable media are any available tangible media that can be accessed within a computing environment.
  • Computer-readable media include memory (120, 125), storage (140), and combinations of any of the above.
  • the innovations can be described in the general context of computer-executable instructions, such as those included in program modules, being executed in a computing system on a target real or virtual processor.
  • program modules include routines, programs, libraries, objects, classes, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the functionality of the program modules may be combined or split between program modules as desired in various embodiments.
  • Computer-executable instructions for program modules may be executed within a local or distributed computing system.
  • the terms "system" and "device" are used interchangeably herein. Unless the context clearly indicates otherwise, neither term implies any limitation on a type of computing system or computing device. In general, a computing system or computing device can be local or distributed, and can include any combination of special-purpose hardware and/or general-purpose hardware with software implementing the functionality described herein.
  • the disclosed methods can also be implemented using specialized computing hardware configured to perform any of the disclosed methods.
  • the disclosed methods can be implemented by an integrated circuit (e.g., an application specific integrated circuit ("ASIC") such as an ASIC digital signal process unit ("DSP"), a graphics processing unit ("GPU"), or a programmable logic device ("PLD") such as a field programmable gate array ("FPGA")) specially designed or configured to implement any of the disclosed methods.
  • Figures 2a and 2b show example network environments (201, 202) that include video encoders (220) and video decoders (270).
  • the encoders (220) and decoders (270) are connected over a network (250) using an appropriate communication protocol.
  • the network (250) can include the Internet or another computer network.
  • SMPTE 421M standard
  • ISO/IEC 14496-10 standard
  • the bidirectional communication can be part of a video conference, video telephone call, or other two-party communication scenario.
  • although the network environment (201) in Figure 2a includes two real-time communication tools (210), the network environment (201) can instead include three or more real-time communication tools (210) that participate in multi-party communication.
  • a real-time communication tool (210) manages encoding by an encoder (220).
  • Figure 4 shows an example encoder system (400) that can be included in the real-time communication tool (210).
  • alternatively, the real-time communication tool (210) uses another encoder system.
  • a real-time communication tool (210) also manages decoding by a decoder (270).
  • Figure 5 shows an example decoder system (500), which can be included in the real-time communication tool (210).
  • alternatively, the real-time communication tool (210) uses another decoder system.
  • an encoding tool (212) includes an encoder (220) that encodes video for delivery to multiple playback tools (214), which include decoders (270).
  • the unidirectional communication can be provided for a video surveillance system, web camera monitoring system, remote desktop conferencing presentation or other scenario in which video is encoded and sent from one location to one or more other locations.
  • although the network environment (202) in Figure 2b includes two playback tools (214), the network environment (202) can include more or fewer playback tools (214).
  • a playback tool (214) communicates with the encoding tool (212) to determine a stream of video for the playback tool (214) to receive.
  • the playback tool (214) receives the stream, buffers the received encoded data for an appropriate period, and begins decoding and playback.
  • Figure 4 shows an example encoder system (400) that can be included in the encoding tool (212).
  • alternatively, the encoding tool (212) uses another encoder system.
  • the encoding tool (212) can also include server-side controller logic for managing connections with one or more playback tools (214).
  • Figure 5 shows an example decoder system (500), which can be included in the playback tool (214).
  • alternatively, the playback tool (214) uses another decoder system.
  • a playback tool (214) can also include client-side controller logic for managing connections with the encoding tool (212).
  • Figure 3 is a block diagram of a generalized frame packing/unpacking system (300) in conjunction with which some described embodiments may be implemented.
  • the system (300) includes a video source (310), which produces source frames (311) of a higher-resolution chroma sampling format such as a 4:4:4 format.
  • the video source (310) can be a camera, tuner card, storage media, or other digital video source.
  • the frame packer (315) rearranges the frames (311) of the higher-resolution chroma sampling format to produce source frames (316) of a lower-resolution chroma sampling format such as a 4:2:0 format.
  • Example approaches to frame packing are described below.
  • the frame packer (315) can signal metadata (317) that indicates whether and how frame packing was performed, for use by the frame unpacker (385) after decoding.
  • Example approaches to signaling frame packing arrangement metadata are described below.
  • the encoder (340) encodes the frames (316) of the lower-resolution chroma sampling format. Example encoders are described below with reference to Figures 4 and 6.
  • the encoder (340) outputs coded data (341) over a channel (350), which represents storage, a communications connection, or another channel for the output.
  • the decoder (360) receives the encoded data (341) and decodes the frames (316) of the lower-resolution chroma sampling format. Example decoders are described below with reference to Figures 5 and 7. The decoder outputs reconstructed frames (381) of the lower-resolution chroma sampling format.
  • the frame unpacker (385) optionally rearranges the reconstructed frames (381) of the lower-resolution chroma sampling format to reconstruct the frames (386) of the higher-resolution chroma sampling format.
  • Example approaches to frame unpacking are described below.
  • the frame unpacker (385) can receive the metadata (317) that indicates whether and how frame packing was performed, and use such metadata (317) to guide unpacking operations.
  • the frame unpacker (385) outputs the reconstructed frames of the higher-resolution chroma sampling format to an output destination (390).
  • Figure 4 is a block diagram of an example encoder system (400) in conjunction with which some described embodiments may be implemented.
  • the encoder system (400) can be a general-purpose encoding tool capable of operating in any of multiple encoding modes such as a low-latency encoding mode for real-time communication, transcoding mode, and regular encoding mode for media playback from a file or stream, or it can be a special-purpose encoding tool adapted for one such encoding mode.
  • the encoder system (400) can be implemented as an operating system module, as part of an application library or as a standalone application. Overall, the encoder system (400) receives a sequence of source video frames (411) (of a higher-resolution chroma sampling format such as a 4:4:4 format) from a video source (410), performs frame packing (including wavelet decomposition or other band separation filtering) to a lower-resolution chroma sampling format such as a 4:2:0 format, encodes frames of the lower-resolution chroma sampling format, and produces encoded data as output to a channel (490).
  • the video source (410) can be a camera, tuner card, storage media, or other digital video source.
  • the video source (410) produces a sequence of video frames at a frame rate of, for example, 30 frames per second.
  • the term "frame" generally refers to source, coded or reconstructed image data.
  • a frame is a progressive-scan video frame.
  • an interlaced video frame is de-interlaced prior to encoding.
  • alternatively, two complementary interlaced video fields are encoded as an interlaced video frame or as separate fields.
  • the term "frame" can indicate a single non-paired video field, a complementary pair of video fields, a video object plane that represents a video object at a given time, or a region of interest in a larger image.
  • the video object plane or region can be part of a larger image that includes multiple objects or regions of a scene.
  • the source frames (411) are in a higher-resolution chroma sampling format such as a 4:4:4 format.
  • the frame packer (415) rearranges the frames (411) of the higher-resolution chroma sampling format to produce source frames (416) of a lower-resolution chroma sampling format such as a 4:2:0 format.
  • Example approaches to frame packing are described below.
  • the frame packer (415) can signal metadata (not shown) that indicates whether and how frame packing was performed, for use by a frame unpacker after decoding.
  • Example approaches to signaling frame packing arrangement metadata are described below.
  • the frame packer (415) can perform pre-processing operations, for example, wavelet decomposition or other band separation filter, as described below.
  • An arriving source frame (416) is stored in a source frame temporary memory storage area (420) that includes multiple frame buffer storage areas (421, 422, ..., 42n).
  • a frame buffer (421, 422, etc.) holds one source frame in the source frame storage area (420).
  • a frame selector (430) periodically selects an individual source frame from the source frame storage area (420).
  • the order in which frames are selected by the frame selector (430) for input to the encoder (440) may differ from the order in which the frames are produced by the video source (410), e.g., a selected frame may be ahead in order, to facilitate temporally backward prediction.
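The reordering described above can be sketched as follows. This is a minimal illustration, not the frame selector (430) itself: the frame type labels "I", "P" and "B" and the function name `coding_order` are assumptions introduced here, and the rule used (hold each "B" frame until the next anchor frame has been selected) is only one simple way a selected frame can be ahead in order.

```python
def coding_order(frame_types):
    """Return source-frame indices in a possible selection order.

    Assumption for illustration: "B" frames use temporally backward
    prediction, so each one is held until the next anchor frame
    ("I" or "P") has been selected for encoding.
    """
    order, pending_b = [], []
    for index, frame_type in enumerate(frame_types):
        if frame_type == "B":
            pending_b.append(index)    # hold until the following anchor is selected
        else:
            order.append(index)        # the anchor frame goes first
            order.extend(pending_b)    # then the held frames that may reference it
            pending_b.clear()
    order.extend(pending_b)            # trailing held frames with no later anchor
    return order


print(coding_order(["I", "B", "B", "P", "B", "P"]))  # [0, 3, 1, 2, 5, 4]
```

Here source frame 3 is selected before source frames 1 and 2, so the selection order differs from the order in which the video source produced the frames.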
  • the encoder system (400) can include another pre-processor (not shown) that performs pre-processing (e.g., filtering) of the selected frame (431) before encoding.
  • the encoder (440) encodes the selected frame (431) (of the lower-resolution chroma sampling format) to produce a coded frame (441) and also produces memory management control operation ("MMCO") signals (442) or reference picture set ("RPS") information. If the current frame is not the first frame that has been encoded, when performing its encoding process, the encoder (440) may use one or more previously encoded/decoded frames (469) that have been stored in a decoded frame temporary memory storage area (460). Such stored decoded frames (469) are used as reference frames for inter-frame prediction of the content of the current source frame (431).
  • the encoder (440) includes multiple encoding modules that perform encoding tasks such as motion estimation and compensation, frequency transforms, quantization and entropy coding.
  • the exact operations performed by the encoder (440) can vary depending on compression format.
  • the format of the output encoded data can be a Windows Media Video format, VC-1 format, MPEG-x format (e.g., MPEG-1, MPEG-2, or MPEG-4), H.26x format (e.g., H.261, H.262, H.263, H.264), HEVC format or other format.
  • the encoder (440) is adapted for encoding frames of the lower-resolution chroma sampling format.
  • an inter-coded, predicted frame is represented in terms of prediction from reference frames.
  • a motion estimator estimates motion of sets of samples of a source frame (441) with respect to one or more reference frames (469).
  • a set of samples can be a macroblock, sub-macroblock or sub-macroblock partition (as in the H.264 standard), or it can be a coding tree unit or prediction unit (as in the HEVC standard).
  • the term "block" indicates a set of samples, which may be a single two-dimensional ("2D") array or multiple 2D arrays (e.g., one array for a luma component and two arrays for chroma components).
  • the multiple reference frames can be from different temporal directions or the same temporal direction.
  • the motion estimator outputs motion information such as motion vector information, which is entropy coded.
  • a motion compensator applies motion vectors to reference frames to determine motion-compensated prediction values.
  • the encoder determines the differences (if any) between a block's motion-compensated prediction values and corresponding original values. These prediction residual values (i.e., residuals, residue values) are further encoded using a frequency transform, quantization and entropy encoding.
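As a minimal sketch of the two steps just described, the following code forms a motion-compensated prediction for one block and then computes its prediction residual. The function names, the integer-sample motion vector, and the tiny 4x4 reference frame are assumptions for illustration; a real encoder would also handle sub-sample motion vectors and frame boundaries.

```python
def motion_compensated_block(reference, top, left, motion_vector, size):
    """Copy a size x size block from the reference frame, displaced by the motion vector.

    The motion vector is (dy, dx) in whole samples (an assumption of this sketch).
    """
    dy, dx = motion_vector
    rows = reference[top + dy: top + dy + size]
    return [row[left + dx: left + dx + size] for row in rows]


def prediction_residual(original_block, predicted_block):
    """Sample-wise difference between original values and prediction values."""
    return [[o - p for o, p in zip(orig_row, pred_row)]
            for orig_row, pred_row in zip(original_block, predicted_block)]


# Reference frame with sample values 0..15 in raster order.
reference = [[row * 4 + col for col in range(4)] for row in range(4)]
original_block = [[6, 7], [10, 12]]

predicted = motion_compensated_block(reference, top=0, left=0, motion_vector=(1, 2), size=2)
residual = prediction_residual(original_block, predicted)
print(predicted)  # [[6, 7], [10, 11]]
print(residual)   # [[0, 0], [0, 1]]
```

Only the residual (mostly zeros when the prediction is good) goes on to the frequency transform, quantization and entropy encoding.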
  • the encoder (440) can determine intra-prediction values for a block, determine prediction residual values, and encode the prediction residual values.
  • the entropy coder of the encoder (440) compresses quantized transform coefficient values as well as certain side information (e.g., motion vector information, QP values, mode decisions, parameter choices).
  • Typical entropy coding techniques include Exp-Golomb coding, arithmetic coding, differential coding, Huffman coding, run length coding, variable-length-to-variable-length ("V2V") coding, variable-length-to-fixed-length ("V2F") coding, LZ coding, dictionary coding, probability interval partitioning entropy coding ("PIPE"), and combinations of the above.
  • the entropy coder can use different coding techniques for different kinds of information, and can choose from among multiple code tables within a particular coding technique.
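Of the techniques listed above, Exp-Golomb coding is simple enough to show in a few lines. The sketch below implements the common order-0 code for unsigned integers, using strings of "0" and "1" characters instead of a packed bitstream to keep it readable. The function names are assumptions for illustration, and the sketch does not reproduce how any particular compression format maps syntax elements to these codes.

```python
def exp_golomb_encode(value):
    """Order-0 Exp-Golomb code of a non-negative integer, as a string of bits."""
    if value < 0:
        raise ValueError("value must be non-negative")
    bits = bin(value + 1)[2:]                 # binary representation of value + 1
    return "0" * (len(bits) - 1) + bits       # prefix of zeros, one fewer than the bit count


def exp_golomb_decode(bitstring):
    """Decode a single order-0 Exp-Golomb codeword from the start of bitstring."""
    zeros = 0
    while bitstring[zeros] == "0":            # count the leading zeros
        zeros += 1
    return int(bitstring[zeros: 2 * zeros + 1], 2) - 1


print([exp_golomb_encode(v) for v in range(5)])  # ['1', '010', '011', '00100', '00101']
print(exp_golomb_decode("00100"))                # 3
```

Small values get short codewords, which suits side information that is usually close to zero.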
  • the coded frames (441) and MMCO/RPS information (442) are processed by a decoding process emulator (450).
  • the decoding process emulator (450) implements some of the functionality of a decoder, for example, decoding tasks to reconstruct reference frames that are used by the encoder (440) in motion estimation and compensation.
  • the decoding process emulator (450) uses the MMCO/RPS information (442) to determine whether a given coded frame (441) needs to be stored for use as a reference frame in inter-frame prediction of subsequent frames to be encoded.
  • the decoding process emulator (450) models the decoding process that would be conducted by a decoder that receives the coded frame (441) and produces a corresponding decoded frame (451). In doing so, when the encoder (440) has used decoded frame(s) (469) that have been stored in the decoded frame storage area (460), the decoding process emulator (450) also uses the decoded frame(s) (469) from the storage area (460) as part of the decoding process.
  • the decoded frame temporary memory storage area (460) includes multiple frame buffer storage areas (461, 462, ..., 46n).
  • the decoding process emulator (450) uses the MMCO/RPS information (442) to manage the contents of the storage area (460) in order to identify any frame buffers (461, 462, etc.) with frames that are no longer needed by the encoder (440) for use as reference frames. After modeling the decoding process, the decoding process emulator (450) stores a newly decoded frame (451) in a frame buffer (461, 462, etc.) that has been identified in this manner.
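The buffer management described above can be pictured with a small sketch. It is a simplification under stated assumptions: the class `DecodedFrameStore` and the parameter `needed_ids` are hypothetical, and `needed_ids` merely stands in for whatever the MMCO/RPS information indicates about which stored frames are still needed as reference frames. Actual MMCO and RPS semantics are defined by the compression format and are not modeled here.

```python
class DecodedFrameStore:
    """Fixed pool of frame buffers; a frame is kept only while it is still needed."""

    def __init__(self, num_buffers):
        self.buffers = [None] * num_buffers   # each slot holds (frame_id, frame) or None

    def store(self, frame_id, frame, needed_ids):
        """Free buffers whose frames are no longer needed, then store the new frame.

        needed_ids: ids of previously stored frames still needed as reference
        frames (hypothetical stand-in for the MMCO/RPS information).
        Returns the index of the buffer that received the new frame.
        """
        for index, slot in enumerate(self.buffers):
            if slot is not None and slot[0] not in needed_ids:
                self.buffers[index] = None    # identified as reusable
        for index, slot in enumerate(self.buffers):
            if slot is None:
                self.buffers[index] = (frame_id, frame)
                return index
        raise RuntimeError("no frame buffer is available")


store = DecodedFrameStore(num_buffers=2)
print(store.store(0, "frame 0", needed_ids=set()))  # 0
print(store.store(1, "frame 1", needed_ids={0}))    # 1
print(store.store(2, "frame 2", needed_ids={1}))    # 0 (frame 0 was no longer needed)
```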
  • the coded frames (441) and MMCO/RPS information (442) are also buffered in a temporary coded data area (470).
  • the coded data that is aggregated in the coded data area (470) can also include media metadata relating to the coded video data (e.g., as one or more parameters in one or more SEI messages (such as frame packing arrangement SEI messages) or video usability information (“VUI”) messages).
  • the aggregated data (471) from the temporary coded data area (470) are processed by a channel encoder (480).
  • the channel encoder (480) can packetize the aggregated data for transmission as a media stream (e.g., according to a media container format such as ISO/IEC 14496-12), in which case the channel encoder (480) can add syntax elements as part of the syntax of the media transmission stream.
  • the channel encoder (480) can organize the aggregated data for storage as a file (e.g., according to a media container format such as ISO/IEC 14496-12), in which case the channel encoder (480) can add syntax elements as part of the syntax of the media storage file.
  • the channel encoder (480) can implement one or more media system multiplexing protocols or transport protocols, in which case the channel encoder (480) can add syntax elements as part of the syntax of the protocol(s). Such syntax elements for a media transmission stream, media storage stream, multiplexing protocols or transport protocols can include frame packing arrangement metadata.
  • the channel encoder (480) provides output to a channel (490), which represents storage, a communications connection, or another channel for the output.
  • FIG. 5 is a block diagram of an example decoder system (500) in conjunction with which some described embodiments may be implemented.
  • the decoder system (500) can be a general-purpose decoding tool capable of operating in any of multiple decoding modes such as a low-latency decoding mode for real-time communication and regular decoding mode for media playback from a file or stream, or it can be a special-purpose decoding tool adapted for one such decoding mode.
  • the decoder system (500) can be implemented as an operating system module, as part of an application library or as a standalone application.
  • the decoder system (500) receives coded data from a channel (510), decodes frames of a lower-resolution chroma sampling format such as a 4:2:0 format, optionally performs frame unpacking (including wavelet reconstruction or other inverse band separation filter) from the lower-resolution chroma sampling format to a higher-resolution chroma sampling format such as a 4:4:4 format, and produces reconstructed frames (of the higher-resolution chroma sampling format) as output for an output destination (590).
  • the decoder system (500) includes a channel (510), which can represent storage, a communications connection, or another channel for coded data as input.
  • the channel (510) produces coded data that has been channel coded.
  • a channel decoder (520) can process the coded data. For example, the channel decoder (520) de-packetizes data that has been aggregated for transmission as a media stream (e.g., according to a media container format such as ISO/IEC 14496-12), in which case the channel decoder (520) can parse syntax elements added as part of the syntax of the media transmission stream.
  • the channel decoder (520) separates coded video data that has been aggregated for storage as a file (e.g., according to a media container format such as ISO/IEC 14496-12), in which case the channel decoder (520) can parse syntax elements added as part of the syntax of the media storage file.
  • the channel decoder (520) can implement one or more media system demultiplexing protocols or transport protocols, in which case the channel decoder (520) can parse syntax elements added as part of the syntax of the protocol(s).
  • Such syntax elements for a media transmission stream, media storage stream, multiplexing protocols or transport protocols can include frame packing arrangement metadata.
  • the coded data (521) that is output from the channel decoder (520) is stored in a temporary coded data area (530) until a sufficient quantity of such data has been received.
  • the coded data (521) includes coded frames (531) (at the lower-resolution chroma sampling format) and MMCO/RPS information (532).
  • the coded data (521) in the coded data area (530) can also include media metadata relating to the encoded video data (e.g., as one or more parameters in one or more SEI messages such as frame packing arrangement SEI messages or VUI messages).
  • the coded data area (530) temporarily stores coded data (521) until such coded data (521) is used by the decoder (550).
  • coded data for a coded frame (531) and MMCO/RPS information (532) are transferred from the coded data area (530) to the decoder (550).
  • new coded data is added to the coded data area (530) and the oldest coded data remaining in the coded data area (530) is transferred to the decoder (550).
  • the decoder (550) periodically decodes a coded frame (531) to produce a corresponding decoded frame (551) of the lower-resolution chroma sampling format.
  • the decoder (550) may use one or more previously decoded frames (569) as reference frames for inter-frame prediction.
  • the decoder (550) reads such previously decoded frames (569) from a decoded frame temporary memory storage area (560).
  • the decoder (550) includes multiple decoding modules that perform decoding tasks such as entropy decoding, inverse quantization, inverse frequency transforms and motion compensation.
  • the exact operations performed by the decoder (550) can vary depending on compression format. In general, the decoder (550) is adapted for decoding frames of the lower-resolution chroma sampling format.
  • the decoder (550) receives encoded data for a compressed frame or sequence of frames and produces output including a decoded frame (551) of the lower-resolution chroma sampling format.
  • a buffer receives encoded data for a compressed frame and makes the received encoded data available to an entropy decoder.
  • the entropy decoder entropy decodes entropy-coded quantized data as well as entropy-coded side information, typically applying the inverse of entropy encoding performed in the encoder.
  • a motion compensator applies motion information to one or more reference frames to form motion-compensated predictions of blocks (e.g., macroblocks, sub-macroblocks, sub-macroblock partitions, coding tree units, prediction units, or parts thereof) of the frame being reconstructed.
  • An intra prediction module can spatially predict sample values of a current block from neighboring, previously reconstructed sample values.
  • the decoder (550) also reconstructs prediction residuals.
  • An inverse quantizer inverse quantizes entropy-decoded data.
  • An inverse frequency transformer converts the quantized, frequency domain data into spatial domain information.
  • For a predicted frame, the decoder (550) combines reconstructed prediction residuals with motion-compensated predictions to form a reconstructed frame. The decoder (550) can similarly combine prediction residuals with spatial predictions from intra prediction.
  • a motion compensation loop in the video decoder (550) includes an adaptive de-blocking filter to smooth discontinuities across block boundary rows and/or columns in the decoded frame (551).
  • the decoded frame temporary memory storage area (560) includes multiple frame buffer storage areas (561, 562, ..., 56n).
  • the decoded frame storage area (560) is an example of a DPB.
  • the decoder (550) uses the MMCO/RPS information (532) to identify a frame buffer (561, 562, etc.) in which it can store a decoded frame (551) of the lower-resolution chroma sampling format.
  • the decoder (550) stores the decoded frame (551) in that frame buffer.
  • An output sequencer (580) uses the MMCO/RPS information (532) to identify when the next frame to be produced in output order is available in the decoded frame storage area (560).
  • the next frame (581) to be produced in output order is read by the output sequencer (580) and output to either (a) the output destination (590) (e.g., display) for display of the frame of the lower-resolution chroma sampling format, or (b) the frame unpacker (585).
  • the order in which frames are output from the decoded frame storage area (560) by the output sequencer (580) may differ from the order in which the frames are decoded by the decoder (550).
  • the frame unpacker (585) rearranges the frames (581) of the lower-resolution chroma sampling format to produce output frames (586) of a higher-resolution chroma sampling format such as a 4:4:4 format.
  • Example approaches to frame unpacking are described below.
  • the frame unpacker (585) can use metadata (not shown) that indicates whether and how frame packing was performed, to guide frame unpacking operations.
  • the frame unpacker (585) can perform post-processing operations, for example, wavelet reconstruction or other inverse band separation filter, as described below.
  • Figure 6 is a block diagram of a generalized video encoder (600) in conjunction with which some described embodiments may be implemented.
  • the encoder (600) receives a sequence of video frames of a lower-resolution chroma sampling format such as a 4:2:0 format, including a current frame (605), and produces encoded data (695) as output.
  • the encoder (600) is block-based and uses a macroblock format that depends on implementation. Blocks may be further sub-divided at different stages, e.g., at the frequency transform and entropy encoding stages. For example, a frame can be divided into 16x16 macroblocks, which can in turn be divided into 8x8 blocks and smaller sub-blocks of pixel values for coding and decoding.
  • the encoder system (600) compresses predicted frames and intra-coded frames. For the sake of presentation, Figure 6 shows an "intra path" through the encoder (600) for intra-frame coding and an "inter path" for inter-frame coding. Many of the components of the encoder (600) are used for both intra-frame coding and inter-frame coding. The exact operations performed by those components can vary depending on the type of information being compressed.
  • a motion estimator (610) estimates motion of blocks (e.g., macroblocks, sub-macroblocks, sub-macroblock partitions, coding tree units, prediction units, or parts thereof) of the current frame (605) with respect to one or more reference frames.
  • the frame store (620) buffers one or more reconstructed previous frames (625) for use as reference frames.
  • the multiple reference frames can be from different temporal directions or the same temporal direction.
  • the motion estimator (610) outputs as side information motion information (615) such as differential motion vector information.
  • the motion compensator (630) applies reconstructed motion vectors to the reconstructed reference frame(s) (625) when forming a motion-compensated current frame (635).
  • the difference (if any) between a block of the motion-compensated current frame (635) and the corresponding part of the original current frame (605) is the prediction residual (645) for the block.
  • reconstructed prediction residuals are added to the motion-compensated current frame (635) to obtain a reconstructed frame that is closer to the original current frame (605).
  • the intra path can include an intra prediction module (not shown) that spatially predicts pixel values of a current block from neighboring, previously reconstructed pixel values.
  • a frequency transformer (660) converts spatial domain video information into frequency domain (i.e., spectral, transform) data.
  • the frequency transformer (660) applies a discrete cosine transform, an integer approximation thereof, or another type of forward block transform to blocks of pixel value data or prediction residual data, producing blocks of frequency transform coefficients.
  • a quantizer (670) then quantizes the transform coefficients. For example, the quantizer (670) applies non-uniform, scalar quantization to the frequency domain data with a step size that varies on a frame-by-frame basis, macroblock-by-macroblock basis or other basis.
  • an inverse quantizer (676) performs inverse quantization on the quantized frequency coefficient data.
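The effect of the step size can be seen in a small sketch of scalar quantization and inverse quantization. Note the swap: the text above mentions non-uniform scalar quantization, whereas this sketch uses plain uniform quantization with rounding to the nearest level, chosen only because it is the simplest case. The function names and sample coefficient values are assumptions for illustration.

```python
def quantize(coefficients, step):
    """Uniform scalar quantization: map each coefficient to the nearest integer level."""
    levels = []
    for c in coefficients:
        magnitude = (2 * abs(c) + step) // (2 * step)   # round |c| / step to nearest integer
        levels.append(magnitude if c >= 0 else -magnitude)
    return levels


def dequantize(levels, step):
    """Inverse quantization: scale each level back by the step size."""
    return [level * step for level in levels]


coefficients = [100, -37, 12, 3, 0]
levels = quantize(coefficients, step=8)
print(levels)                      # [13, -5, 2, 0, 0]
print(dequantize(levels, step=8))  # [104, -40, 16, 0, 0]
```

A larger step size produces smaller levels (fewer bits after entropy coding) but a larger reconstruction error, which is the trade-off the step size controls.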
  • the inverse frequency transformer (666) performs an inverse frequency transform, producing blocks of reconstructed prediction residuals or pixel values.
  • the encoder (600) combines reconstructed prediction residuals (645) with motion-compensated predictions (635) to form the reconstructed frame (605). (Although not shown in Figure 6, in the intra path, the encoder (600) can combine prediction residuals with spatial predictions from intra prediction.)
  • the frame store (620) buffers the reconstructed current frame for use in subsequent motion-compensated prediction.
  • a motion compensation loop in the encoder (600) includes an adaptive in-loop deblock filter (610) before or after the frame store (620).
  • the encoder (600) applies in-loop filtering to reconstructed frames to adaptively smooth discontinuities across boundaries in the frames.
  • the adaptive in-loop deblock filter (610) can be disabled for some types of content. For example, in a frame packing approach with main and auxiliary views, the adaptive in-loop deblock filter (610) can be disabled when encoding an auxiliary view (including remaining chroma information that is not part of a main view) so as to not introduce artifacts such as blurring.
  • the entropy coder (680) compresses the output of the quantizer (670) as well as motion information (615) and certain side information (e.g., QP values).
  • the entropy coder (680) provides encoded data (695) to the buffer (690), which multiplexes the encoded data into an output bitstream.
  • a controller receives inputs from various modules of the encoder.
  • the controller evaluates intermediate results during encoding, for example, setting QP values and performing rate-distortion analysis.
  • the controller works with other modules to set and change coding parameters during encoding.
  • the controller can vary QP values and other control parameters to control quantization of luma components and chroma components during encoding.
  • the controller can vary QP values to dedicate more bits to luma content of a given frame (which could be a main view or auxiliary view in a frame packing approach) compared to chroma content of that frame.
  • the controller can vary QP values to dedicate more bits to the main view (including luma and sub-sampled chroma components) compared to the auxiliary view (including remaining chroma information).
  • the encoder can exploit geometric correspondence among sample values of the chroma components in several ways.
  • geometric correspondence indicates a relationship between (1) chroma information at positions of a (nominally) luma component of a frame constructed from the lower-resolution chroma sampling format and (2) chroma information at corresponding scaled positions of chroma components of the frame of the lower-resolution chroma sampling format.
  • a scaling factor applies between positions of the luma and chroma components. For example, for 4:2:0, the scaling factor is two both horizontally and vertically, and for 4:2:2, the scaling factor is two horizontally and one vertically.
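The scaling factors above amount to a simple position mapping, sketched below. The table and function names are assumptions for illustration; the 4:4:4 entry (a factor of one in both directions) is not stated in the text above but follows from 4:4:4 having full-resolution chroma. Chroma sample siting (the sub-sample offset of chroma relative to luma) is ignored.

```python
# (horizontal, vertical) scaling factors between luma and chroma positions
SCALING_FACTORS = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def chroma_position(luma_x, luma_y, chroma_format):
    """Map a position in the luma component to the corresponding chroma position."""
    scale_x, scale_y = SCALING_FACTORS[chroma_format]
    return luma_x // scale_x, luma_y // scale_y


print(chroma_position(6, 4, "4:2:0"))  # (3, 2)
print(chroma_position(6, 4, "4:2:2"))  # (3, 4)
```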
  • the encoder can use the geometric correspondence to guide motion estimation, QP selection, prediction mode selection or other decision-making processes from block-to-block, by first evaluating recent results of neighboring blocks when encoding a current block of the to-be-encoded frame. Or, the encoder can use the geometric correspondence to guide such decision-making processes for high-resolution chroma information packed into chroma components of the to-be-encoded frame, using results from encoding of high-resolution chroma information packed into a "luma" component of the to-be-encoded frame.
  • the encoder can use the geometric correspondence to improve compression performance, where motion vectors, prediction modes, or other decisions for high-resolution chroma information packed into a "luma" component of the to-be-encoded frame are also used for high-resolution chroma information packed into chroma components of the to-be-encoded frame.
  • in approach 2, when chroma information is packed into an auxiliary frame of the lower-resolution chroma sampling format, spatial correspondence and motion vector displacement relationships between the nominally luma component of the auxiliary frame and nominally chroma components of the auxiliary frame are preserved.
  • Sample values at corresponding spatial positions in Y, U and V components of the auxiliary frame tend to be consistent, which is useful for such purposes as spatial block size segmentation and joint coding of coded block pattern information or other information that indicates presence/absence of non-zero coefficient values.
  • Motion vectors for corresponding parts of Y, U and V components of the auxiliary frame tend to be consistent (e.g., a vertical or horizontal displacement of two samples in Y corresponds to a displacement of 1 sample in U and V), which also helps coding efficiency.
  • modules of the encoder can be added, omitted, split into multiple modules, combined with other modules, and/or replaced with like modules.
  • encoders with different modules and/or other configurations of modules perform one or more of the described techniques.
  • Specific embodiments of encoders typically use a variation or supplemented version of the encoder (600).
  • the relationships shown between modules within the encoder (600) indicate general flows of information in the encoder; other relationships are not shown for the sake of simplicity.
VII. Example Video Decoders.
  • Figure 7 is a block diagram of a generalized decoder (700) in conjunction with which several described embodiments may be implemented.
  • the decoder (700) receives encoded data (795) for a compressed frame or sequence of frames and produces output including a reconstructed frame (705) of a lower-resolution chroma sampling format such as a 4:2:0 format.
  • Figure 7 shows an "intra path" through the decoder (700) for intra-frame decoding and an "inter path" for inter-frame decoding.
  • Many of the components of the decoder (700) are used for both intra-frame decoding and inter-frame decoding. The exact operations performed by those components can vary depending on the type of information being decompressed.
  • a buffer (790) receives encoded data (795) for a compressed frame and makes the received encoded data available to the parser/entropy decoder (780).
  • parser/entropy decoder (780) entropy decodes entropy-coded quantized data as well as entropy-coded side information, typically applying the inverse of entropy encoding performed in the encoder.
  • a motion compensator (730) applies motion information (715) to one or more reference frames (725) to form motion-compensated predictions (735) of blocks (e.g., macroblocks, sub-macroblocks, sub-macroblock partitions, coding tree units, prediction units, or parts thereof) of the frame (705) being reconstructed.
  • the frame store (720) stores one or more previously reconstructed frames for use as reference frames.
  • the intra path can include an intra prediction module (not shown) that spatially predicts pixel values of a current block from neighboring, previously reconstructed pixel values.
  • the decoder (700) reconstructs prediction residuals.
  • An inverse quantizer (770) inverse quantizes entropy-decoded data.
  • An inverse frequency transformer (760) converts the quantized, frequency domain data into spatial domain information. For example, the inverse frequency transformer (760) applies an inverse block transform to frequency transform coefficients, producing pixel value data or prediction residual data.
  • the inverse frequency transform can be an inverse discrete cosine transform, an integer approximation thereof, or another type of inverse frequency transform.
  • For a predicted frame, the decoder (700) combines reconstructed prediction residuals (745) with motion-compensated predictions (735) to form the reconstructed frame (705). (Although not shown in Figure 7, in the intra path, the decoder (700) can combine prediction residuals with spatial predictions from intra prediction.)
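The combining step can be sketched as a sample-wise addition. The function name is an assumption for illustration, as are the 8-bit sample range and the clipping of results to that range, which the text above does not specify.

```python
def reconstruct_block(prediction, residual, bit_depth=8):
    """Add reconstructed residuals to predictions, clipping to the valid sample range.

    bit_depth=8 (sample values 0..255) is an assumption of this sketch.
    """
    max_value = (1 << bit_depth) - 1
    return [[min(max(p + r, 0), max_value) for p, r in zip(pred_row, res_row)]
            for pred_row, res_row in zip(prediction, residual)]


prediction = [[250, 10], [100, 3]]
residual = [[10, -4], [0, -5]]
print(reconstruct_block(prediction, residual))  # [[255, 6], [100, 0]]
```

The same addition applies whether the prediction came from motion compensation or from spatial intra prediction.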
  • a motion compensation loop in the decoder (700) includes an adaptive in-loop deblock filter (710) before or after the frame store (720).
  • the decoder (700) applies in-loop filtering to reconstructed frames to adaptively smooth discontinuities across boundaries in the frames.
  • the adaptive in-loop deblock filter (710) can be disabled for some types of content, when it was disabled during encoding. For example, in a frame packing approach with main and auxiliary views, the adaptive in-loop deblock filter (710) can be disabled when decoding an auxiliary view (including remaining chroma information that is not part of a main view).
  • the decoder (700) also includes a post-processing deblock filter (708).
  • the post-processing deblock filter (708) optionally smoothes discontinuities in reconstructed frames.
  • Other filtering such as de-ring filtering
  • reconstructed frames that are subjected to later frame unpacking bypass the post-processing deblock filter (708).
  • modules of the decoder can be added, omitted, split into multiple modules, combined with other modules, and/or replaced with like modules.
  • decoders with different modules and/or other configurations of modules perform one or more of the described techniques.
  • Specific embodiments of decoders typically use a variation or supplemented version of the decoder (700).
  • the relationships shown between modules within the decoder (700) indicate general flows of information in the decoder; other relationships are not shown for the sake of simplicity.
VIII. Frame Packing/Unpacking for Higher-Resolution Chroma Sampling Formats.
  • This section describes various approaches to packing frames of a higher-resolution chroma sampling format into frames of a lower-resolution chroma sampling format.
  • the frames of the lower-resolution chroma sampling format can then be encoded using an encoder designed for the lower-resolution chroma sampling format.
  • the frames at the lower-resolution chroma sampling format can be output for further processing and display.
  • the frames of the higher-resolution chroma sampling format can be recovered through frame unpacking for output and display.
  • Various approaches described herein can be used to preserve chroma information for frames of a 4:4:4 format when encoding/decoding uses a 4:2:0 format, as one specific example.
  • a YUV 4:4:4 frame is packed into two YUV 4:2:0 frames.
  • a typical 4:4:4 frame contains 12 sample values for every 4 pixel positions, while a 4:2:0 frame contains only 6 sample values for every 4 pixel positions. So, all the sample values contained in a 4:4:4 frame can be packed into two 4:2:0 frames.
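The sample-count argument above can be checked with a few lines of arithmetic. The function name and the example frame size are assumptions for illustration.

```python
# (horizontal, vertical) chroma subsampling factors
SUBSAMPLING = {"4:4:4": (1, 1), "4:2:0": (2, 2)}


def samples_per_frame(width, height, chroma_format):
    """Total sample values in one frame: one luma plane plus two chroma planes."""
    scale_x, scale_y = SUBSAMPLING[chroma_format]
    luma = width * height
    chroma = 2 * (width // scale_x) * (height // scale_y)
    return luma + chroma


w, h = 8, 4
print(samples_per_frame(w, h, "4:4:4"))  # 96, i.e. 12 sample values per 4 pixel positions
print(samples_per_frame(w, h, "4:2:0"))  # 48, i.e. 6 sample values per 4 pixel positions
```

Since 96 = 2 x 48, two 4:2:0 frames have exactly enough room for every sample value of one 4:4:4 frame of the same width and height.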
  • a Y444 plane, U444 plane, and V444 plane are the three component planes for the YUV 4:4:4 frame (801). Each plane has the resolution of width W and height H. For convenience in describing the examples used herein, both W and H are divisible by 4, without implying that this is a limitation of the approach.
  • the approach (800) to packing the YUV 4:4:4 frame into two YUV 4:2:0 frames splits the YUV 4:4:4 frame as shown in Figure 8.
  • the U444 plane of the YUV 4:4:4 frame (801) is partitioned into a bottom half H2-U444 and two upper quarters Q1-U444 and Q2-U444 using spatial partitioning.
  • the V444 plane of the YUV 4:4:4 frame (801) is partitioned into a bottom half H2-V444 and two upper quarters Q1-V444 and Q2-V444 using spatial partitioning.
  • the partitioned planes of the YUV 4:4:4 frame (801) are then reorganized as one or more YUV 4:2:0 frames.
  • the Y444 plane for the YUV 4:4:4 frame becomes the luma component plane of a first frame (802) of the YUV 4:2:0 format.
  • the bottom halves of the U444 plane and the V444 plane become the luma component plane of a second frame (803) of the YUV 4:2:0 format.
  • the top quarters of the U444 plane and the V444 plane become the chroma component planes of the first frame (802) and second frame (803) of the YUV 4:2:0 format as shown in Figure 8.
  • the first frame (802) and second frame (803) of the YUV 4:2:0 format can be organized as separate frames (separated by the dark line in Figure 8). Or, the first frame (802) and second frame (803) of the YUV 4:2:0 format can be organized as a single frame having a height of 2 x H (ignoring the dark line in Figure 8). Or, the first frame (802) and second frame (803) of the YUV 4:2:0 format can be organized as a single frame having a width of 2 x W.
  • first frame (802) and second frame (803) of the YUV 4:2:0 format can be organized as a single frame or multiple frames using any of the methods defined for frame_packing_arrangement_type in the H.264/AVC standard or the HEVC standard.
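The partitioning and reorganization for approach 1 can be sketched as follows, with planes held as lists of rows. Several details are assumptions made for illustration because they depend on Figure 8, which is not reproduced here: that the two upper quarters are the left and right halves of the upper half of a plane, that the Q1 quarters go to the first frame and the Q2 quarters to the second frame, and that the bottom half of U444 is placed above the bottom half of V444 in the luma plane of the second frame. The function names are hypothetical.

```python
def pack_approach1(y444, u444, v444):
    """Split a YUV 4:4:4 frame into two YUV 4:2:0 frames by spatial partitioning."""
    height, width = len(y444), len(y444[0])

    def upper_quarter(plane, right):
        start = width // 2 if right else 0
        return [row[start:start + width // 2] for row in plane[:height // 2]]

    def bottom_half(plane):
        return [row[:] for row in plane[height // 2:]]

    first = {"Y": [row[:] for row in y444],
             "U": upper_quarter(u444, right=False),   # assumed placement of Q1-U444
             "V": upper_quarter(v444, right=False)}   # assumed placement of Q1-V444
    second = {"Y": bottom_half(u444) + bottom_half(v444),
              "U": upper_quarter(u444, right=True),   # assumed placement of Q2-U444
              "V": upper_quarter(v444, right=True)}   # assumed placement of Q2-V444
    return first, second


def unpack_approach1(first, second):
    """Inverse of pack_approach1 under the same placement assumptions."""
    height = len(first["Y"])
    u_top = [left + right for left, right in zip(first["U"], second["U"])]
    v_top = [left + right for left, right in zip(first["V"], second["V"])]
    u444 = u_top + second["Y"][:height // 2]
    v444 = v_top + second["Y"][height // 2:]
    return first["Y"], u444, v444


def make_plane(base, width=4, height=4):
    return [[base + r * width + c for c in range(width)] for r in range(height)]


y, u, v = make_plane(0), make_plane(100), make_plane(200)
first, second = pack_approach1(y, u, v)
print(unpack_approach1(first, second) == (y, u, v))  # True
```

Because the partitioning only moves sample values, unpacking is lossless in this sketch; with lossy encoding of the two 4:2:0 frames in between, the recovered planes would only approximate the originals.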
  • Approach 1 can be used for color spaces such as RGB, GBR, etc. in sampling ratios such as 4:4:4, 4:2:2, 4:2:0, etc., as the chroma sampling formats. 2.
  • a YUV 4:4:4 frame is packed into two YUV 4:2:0 frames while maintaining geometric correspondence for chroma information of the YUV 4:4:4 frame.
  • YUV 4:2:0 frames with good geometric correspondence among their Y, U and V components can be compressed better because they fit the model expected by a typical encoder adapted to encode YUV 4:2:0 frames.
  • the packing can also be done such that one of the two YUV 4:2:0 frames represents the complete scene being represented by the YUV 4:4:4 frame, albeit with color components at a lower resolution.
  • This provides options in decoding.
  • a decoder that cannot perform frame unpacking, or chooses not to perform frame unpacking, can just take a reconstructed version of the YUV 4:2:0 frame that represents the scene and directly feed it to the display.
  • Figure 9 illustrates one example approach (900) to frame packing that is consistent with these design constraints.
  • a YUV 4:4:4 frame (801) is packed into two YUV 4:2:0 frames (902, 903).
  • the first frame (902) provides a "main view" in YUV 4:2:0 format - a lower chroma resolution version of the complete scene represented by the YUV 4:4:4 frame (801).
  • the second frame (903) provides an "auxiliary view" in YUV 4:2:0 format and contains remaining chroma information.
  • the areas B1 ... B9 are different areas within the respective frames (902, 903) of YUV 4:2:0 format.
  • the sample values of odd rows of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are assigned to the areas B4 and B5, and the sample values of even rows of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are distributed between the areas B2, B3 and B6...B9.
  • sample values of the Y444 plane, U444 plane, and V444 plane of the YUV 4:4:4 frame (801) map to the areas B1 ... B9 as follows.
  • $V^{\mathrm{main}}_{420}(x, y) = V_{444}(2x, 2y)$, where the range of $(x, y)$ is $[0, \tfrac{W}{2} - 1] \times [0, \tfrac{H}{2} - 1]$.
  • $V^{\mathrm{aux}}_{420}(x, \tfrac{H}{4} + y) = V_{444}(2x + 1, 4y + 2)$, where the range of $(x, y)$ is $[0, \tfrac{W}{2} - 1] \times [0, \tfrac{H}{4} - 1]$.
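  • The row-based mapping to the areas B1 ... B9 can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `pack_rows`, the dict layout, and the indexing convention `plane[y][x]` are assumptions. The assignments for areas B3, B7, B8 and B9 follow the equations and text of this section; the assignments for areas B2, B4, B5 and B6 are inferred by symmetry from the description of odd and even rows and are likewise assumptions.

```python
def pack_rows(y444, u444, v444):
    """Pack a 4:4:4 frame into a main and an auxiliary 4:2:0 frame.

    Planes are lists of rows, indexed plane[y][x]. W must be a
    multiple of 2 and H a multiple of 4.
    """
    h, w = len(y444), len(y444[0])
    assert w % 2 == 0 and h % 4 == 0
    hw, hh, hq = w // 2, h // 2, h // 4

    main = {
        "Y": [row[:] for row in y444],                                   # B1
        "U": [[u444[2 * y][2 * x] for x in range(hw)] for y in range(hh)],  # B2
        "V": [[v444[2 * y][2 * x] for x in range(hw)] for y in range(hh)],  # B3
    }
    # Odd rows of U444 and V444 fill the auxiliary luma plane.
    aux_y = (
        [[u444[2 * y + 1][x] for x in range(w)] for y in range(hh)]      # B4
        + [[v444[2 * y + 1][x] for x in range(w)] for y in range(hh)]    # B5
    )
    # Even rows, odd columns: rows 0, 4, 8, ... go to the auxiliary U plane.
    aux_u = (
        [[u444[4 * y][2 * x + 1] for x in range(hw)] for y in range(hq)]    # B6
        + [[v444[4 * y][2 * x + 1] for x in range(hw)] for y in range(hq)]  # B7
    )
    # Even rows, odd columns: rows 2, 6, 10, ... go to the auxiliary V plane.
    aux_v = (
        [[u444[4 * y + 2][2 * x + 1] for x in range(hw)] for y in range(hq)]    # B8
        + [[v444[4 * y + 2][2 * x + 1] for x in range(hw)] for y in range(hq)]  # B9
    )
    return main, {"Y": aux_y, "U": aux_u, "V": aux_v}
```

  • Because the main frame keeps the full luma plane and a sub-sampled copy of each chroma plane, it can be displayed on its own as a YUV 4:2:0 version of the scene.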
  • the sample values of the Y444 plane, U444 plane, and V444 plane of the YUV 4:4:4 frame (801) can be assigned to the areas B1 ... B9 in a different way.
  • the sample values of even rows of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are assigned to the areas B4 and B5
  • the sample values of odd rows of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are distributed between the areas B2, B3 and B6...B9.
  • data from the original U plane of the YUV 4:4:4 frame can be arranged in the U plane of the auxiliary YUV 4:2:0 frame, and data from the original V plane of the YUV 4:4:4 frame can be arranged in the V plane of the auxiliary YUV 4:2:0 frame.
  • the sample values from V444(2x + 1 , 4y) that are assigned to area B7 in the equations above can instead be assigned to area B8, and the sample values from U444(2x + 1, 4y + 2) that are assigned to area B8 in the equations above can instead be assigned to area B7.
  • the same sample values from U444 can be copied into a single area for B6 and B7 without separating every second row
  • the same sample values from V444 can be copied into a single area for B8 and B9 without separating every second row.
  • the U plane (or V plane) of the auxiliary YUV 4:2:0 frame is constructed from the U plane (or V plane) of the YUV 4:4:4 frame, without mixing content from different original U and V planes.
  • the U plane (or V plane) of the auxiliary YUV 4:2:0 frame has a mixture of data from the U and V components of the YUV 4:4:4 frame.
  • the upper half of the U plane (or V plane) of the auxiliary YUV 4:2:0 frame contains data from the original U plane, and the lower half contains data from the original V plane.
  • the first frame (902) and second frame (903) of the YUV 4:2:0 format can be organized as separate frames (separated by the dark line in Figure 9). Or, the first frame (902) and second frame (903) of the YUV 4:2:0 format can be organized as a single frame having a height of 2 x H (ignoring the dark line in Figure 9). Or, the first frame (902) and second frame (903) of the YUV 4:2:0 format can be organized as a single frame having a width of 2 x W.
  • first frame (902) and second frame (903) of the YUV 4:2:0 format can be organized as a single frame using any of the methods defined for frame_packing_arrangement_type in the H.264/AVC standard or the HEVC standard.
  • Figure 10 illustrates example frames packed according to the approach (900) of Figure 9.
  • Figure 10 shows a YUV 4:4:4 frame (1001) that includes a Y444 plane, U444 plane, and V444 plane.
  • the main view (1002) (first YUV 4:2:0 frame) is the YUV 4:2:0 equivalent of the original YUV 4:4:4 frame (1001).
  • a decoding system can simply display a reconstructed version of the main view (1002) if YUV 4:4:4 is either not supported or considered not necessary.
  • the auxiliary view (1003) contains chroma information for the YUV 4:4:4 frame (1001). Even so, the auxiliary view (1003) fits the content model of a YUV 4:2:0 frame and is well suited for compression using a typical YUV 4:2:0 video encoder. Within the frame, the auxiliary view (1003) exhibits geometric correspondence across its Y, U and V components. Between frames, the auxiliary views are expected to show motion that is highly correlated across Y, U and V components.
  • Figure 11 illustrates another example approach (1100) to frame packing that is consistent with these design constraints.
  • a YUV 4:4:4 frame (801) is packed into two YUV 4:2:0 frames (1102, 1103).
  • the first frame (1102) provides a "main view" in YUV 4:2:0 format - a lower chroma resolution version of the complete scene represented by the YUV 4:4:4 frame (801) - while the second frame (1103) provides an "auxiliary view" in YUV 4:2:0 format and contains remaining chroma information.
  • the areas A1 ... A9 are different areas within the respective frames (1102, 1103) of YUV 4:2:0 format.
  • the sample values of odd columns of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are assigned to the areas A4 and A5, and the sample values of even columns of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are distributed between the areas A2, A3 and A6... A9.
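  • sample values of the Y444 plane, U444 plane, and V444 plane of the YUV 4:4:4 frame (801) map to the areas A1 ... A9 as follows.
  • $V^{\mathrm{aux}}_{420}(x, y) = U_{444}(4x + 2, 2y + 1)$, where the range of $(x, y)$ is $[0, \tfrac{W}{4} - 1] \times [0, \tfrac{H}{2} - 1]$.
  • $V^{\mathrm{aux}}_{420}(\tfrac{W}{4} + x, y) = V_{444}(4x + 2, 2y + 1)$, where the range of $(x, y)$ is $[0, \tfrac{W}{4} - 1] \times [0, \tfrac{H}{2} - 1]$.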
  • the sample values of the Y444 plane, U444 plane, and V444 plane of the YUV 4:4:4 frame (801) can be assigned to the areas A1 ... A9 in a different way.
  • the sample values of even columns of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are assigned to the areas A4 and A5
  • the sample values of odd columns of the U444 plane and V444 plane of the YUV 4:4:4 frame (801) are distributed between the areas A2, A3 and A6 ... A9.
  • data from the original U plane of the YUV 4:4:4 frame can be arranged in the U plane of the auxiliary YUV 4:2:0 frame
  • data from the original V plane of the YUV 4:4:4 frame can be arranged in the V plane of the auxiliary YUV 4:2:0 frame.
  • the sample values from V444(4x, 2y + 1) that are assigned to area A7 in the equations above are instead assigned to area A8, and the sample values from U444(4x + 2, 2y + 1) that are assigned to area A8 in the equations above are instead assigned to area A7.
  • the same sample values from U444 can be copied into a single area for A6 and A7 without separating every second column
  • the same sample values from V444 can be copied into a single area for A8 and A9 without separating every second column.
  • the U plane (or V plane) of the auxiliary YUV 4:2:0 frame is constructed from the U plane (or V plane) of the YUV 4:4:4 frame, without mixing content from different original U and V planes.
  • the first frame (1102) and second frame (1103) of the YUV 4:2:0 format can be organized as separate frames (separated by the dark line in Figure 11). Or, the first frame (1102) and second frame (1103) of the YUV 4:2:0 format can be organized as a single frame having a height of 2 x H (ignoring the dark line in Figure 11). Or, the first frame (1102) and second frame (1103) of the YUV 4:2:0 format can be organized as a single frame having a width of 2 x W.
  • first frame (1102) and second frame (1103) of the YUV 4:2:0 format can be organized as a single frame using any of the methods defined for frame_packing_arrangement_type in the H.264/AVC standard or the HEVC standard.
  • Frame unpacking can simply mirror frame packing. Samples assigned to areas of frames of YUV 4:2:0 format are assigned back to original locations in chroma components of frames of YUV 4:4:4 format. In one implementation, for example, during frame unpacking, samples in areas B2 ... B9 of frames of YUV 4:2:0 format are assigned to reconstructed chroma components U'444 and V'444 of a frame of YUV 4:4:4 format as shown in the following pseudocode.
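  • $V'_{444}(2x, 2y + 1) = Y^{\prime\prime\,\mathrm{aux}}_{420}(2x, (H \gg 1) + y)$
  • $V'_{444}(2x + 1, 2y + 1) = Y^{\prime\prime\,\mathrm{aux}}_{420}(2x + 1, (H \gg 1) + y)$
  • $V'_{444}(2x + 1, 2y) = U^{\prime\prime\,\mathrm{aux}}_{420}(x, (H \gg 2) + (y \gg 1))$, for even $y$
  • $V'_{444}(2x + 1, 2y) = V^{\prime\prime\,\mathrm{aux}}_{420}(x, (H \gg 2) + (y \gg 1))$, for odd $y$
  • $V'_{444}(2x, 2y) = V^{\prime\prime\,\mathrm{main}}_{420}(x, y)$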
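  • The unpacking assignments can be sketched as a loop over 2x2 chroma neighbourhoods. This is an illustrative sketch only: the function name `unpack_rows`, the dict layout of the frames, and the indexing convention `plane[y][x]` are assumptions. The V' assignments follow the pseudocode of this section; the U' assignments are assumed to mirror them using the upper halves of the auxiliary planes.

```python
def unpack_rows(main, aux):
    """Rebuild Y, U and V planes of a 4:4:4 frame from two 4:2:0 frames.

    'main' and 'aux' are dicts with 'Y', 'U' and 'V' planes stored as
    lists of rows, indexed plane[y][x].
    """
    h, w = len(main["Y"]), len(main["Y"][0])
    hw, hh, hq = w // 2, h // 2, h // 4
    u = [[0] * w for _ in range(h)]
    v = [[0] * w for _ in range(h)]
    for y in range(hh):
        for x in range(hw):
            # Odd rows come from the auxiliary luma plane.
            u[2 * y + 1][2 * x] = aux["Y"][y][2 * x]
            v[2 * y + 1][2 * x] = aux["Y"][hh + y][2 * x]
            u[2 * y + 1][2 * x + 1] = aux["Y"][y][2 * x + 1]
            v[2 * y + 1][2 * x + 1] = aux["Y"][hh + y][2 * x + 1]
            # Even rows, odd columns alternate between the auxiliary
            # U plane (even y) and the auxiliary V plane (odd y).
            src = aux["U"] if y % 2 == 0 else aux["V"]
            u[2 * y][2 * x + 1] = src[y >> 1][x]
            v[2 * y][2 * x + 1] = src[hq + (y >> 1)][x]
            # Even rows, even columns come from the main chroma planes.
            u[2 * y][2 * x] = main["U"][y][x]
            v[2 * y][2 * x] = main["V"][y][x]
    return [row[:] for row in main["Y"]], u, v
```

  • With lossless coding and no pre-processing filter, this loop returns the original chroma planes exactly; with lossy coding it returns the reconstructed values at the original positions.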
  • a frame packing arrangement SEI message is used to signal that two 4:2:0 frames include a packed 4:4:4 frame.
  • Frame packing arrangement SEI messages are defined in the H.264/AVC standard as well as the HEVC standard, although such frame packing arrangement SEI messages have previously been used for a different purpose.
  • a frame packing arrangement SEI message was designed to send stereoscopic 3D video frames using a 2D video codec.
  • two 4:2:0 frames represent the left and right views of a stereoscopic 3D video scene.
  • the scope of the frame packing arrangement SEI message can be extended to instead support encoding/decoding of two 4:2:0 frames obtained from a single 4:4:4 frame, followed by frame unpacking to recover the 4:4:4 frame.
  • the two 4:2:0 frames represent a main view and an auxiliary view. Both the main and auxiliary views (frames) are in a format that is an equivalent of a 4:2:0 format.
  • the main view (frame) may be
  • these approaches can use the frame packing arrangement SEI message to effectively support encoding/decoding 4:4:4 frames using video codecs capable of coding/decoding 4:2:0 frames.
  • the SEI message is extended.
  • the semantics of the syntax element content_interpretation_type are extended as follows.
  • in the relevant frame packing approaches for a YUV 4:4:4 frame, there are two constituent YUV 4:2:0 frames - a first frame for a main view, and a second frame for an auxiliary view.
  • content_interpretation_type indicates the intended interpretation of the constituent frames as specified in the following table.
  • the values 0, 1 and 2 are interpreted as in the
  • one or more of the following constraints may also be imposed for other syntax elements of a frame packing arrangement SEI message.
  • when content_interpretation_type has a value between 3 and 6 (that is, for cases involving frame packing of YUV 4:4:4 frames into YUV 4:2:0 frames), the values of the syntax elements quincunx_sampling_flag, spatial_flipping_flag, frame0_grid_position_x, frame0_grid_position_y, frame1_grid_position_x, and frame1_grid_position_y shall be 0. Furthermore, when content_interpretation_type is equal to 3 or 5 (indicating absence of filtering in pre-processing), chroma_loc_info_present_flag shall be 1, and the values of chroma_sample_loc_type_top_field and chroma_sample_loc_type_bottom_field shall be 2.
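  • The constraints on the other syntax elements can be checked mechanically. The sketch below is illustrative only: the function name `check_packing_constraints`, the use of a plain dict keyed by syntax element name, and the wording of the returned messages are assumptions, and the sketch does not parse an actual bitstream.

```python
# Syntax elements that shall be 0 when content_interpretation_type is 3..6.
ZERO_ELEMENTS = (
    "quincunx_sampling_flag",
    "spatial_flipping_flag",
    "frame0_grid_position_x",
    "frame0_grid_position_y",
    "frame1_grid_position_x",
    "frame1_grid_position_y",
)


def check_packing_constraints(elements):
    """Return a list of violated constraints for decoded syntax element values.

    'elements' is a hypothetical dict mapping syntax element names to
    integer values; a missing element is reported as a violation.
    """
    violations = []
    cit = elements.get("content_interpretation_type")
    if cit is None or not 3 <= cit <= 6:
        return violations  # constraints apply only to 4:4:4-into-4:2:0 packing
    for name in ZERO_ELEMENTS:
        if elements.get(name) != 0:
            violations.append(name + " shall be 0")
    if cit in (3, 5):  # no filtering in pre-processing
        if elements.get("chroma_loc_info_present_flag") != 1:
            violations.append("chroma_loc_info_present_flag shall be 1")
        for name in ("chroma_sample_loc_type_top_field",
                     "chroma_sample_loc_type_bottom_field"):
            if elements.get(name) != 2:
                violations.append(name + " shall be 2")
    return violations
```

  • A decoding system could run such a check before deciding whether to attempt frame unpacking or to fall back to displaying the main view.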
  • the syntax element frame_packing_arrangement_type can be used similarly in conjunction with values of content_interpretation_type that indicate packing of frames of a higher-resolution chroma sampling format. For example,
  • frame packing arrangement metadata is signaled in some other way.
  • instead of extending the semantics of the content_interpretation_type syntax element to indicate packing of frames of a higher-resolution chroma sampling format, the semantics of frame_packing_arrangement_type can be extended to indicate packing of frames of a higher-resolution chroma sampling format.
  • frame packing arrangement metadata (such as values of frame_packing_arrangement_type higher than 5) can indicate whether frame packing/unpacking is used or not used, whether filtering or other pre-processing operations were used or not used (and hence whether corresponding post-processing filtering or other post-processing operations should be used or not used), the type of post-processing operations to perform, or other information about frame packing/unpacking, in addition to indicating how the main and auxiliary views are arranged.
  • the frame packing arrangement SEI message informs a decoder that the decoded pictures contain main and auxiliary views of a 4:4:4 frame as the constituent frames of the frame packing arrangement.
  • This information can be used to process the main and auxiliary views appropriately for display or other purposes.
  • if the system at the decoding end desires the video in 4:4:4 format and is capable of reconstructing the 4:4:4 frames from the main and auxiliary views, the system may do so, and the output format will be 4:4:4. Otherwise, only the main view is given as output, and the output format will then be 4:2:0.
  • Simple sub-sampling of the chroma sample values of frames of a higher-resolution chroma sampling format can introduce aliasing artifacts in the downsampled chroma sample values.
  • frame packing can include pre-processing operations to filter chroma sample values. Such filtering can be termed anti-alias filtering.
  • Corresponding frame unpacking can then include post-processing operations to
  • pre-processing operations can be used to filter the chroma sample values during frame packing, and frame unpacking can include corresponding post-processing operations.
  • This section describes a first set of example pre-processing and post-processing operations. Another set of example pre-processing and post-processing operations, involving wavelet decomposition/reconstruction or other band separation filtering/inverse filtering, are described below.
  • pre-processing can help improve quality when only the YUV 4:2:0 frame representing the main view is used for display. This can permit a decoder to ignore the YUV 4:2:0 frame representing the auxiliary view without running the risk of aliasing artifacts caused by simple sub-sampling of chroma information.
  • without pre-processing (when the chroma signal for the YUV 4:2:0 frame representing the main view is obtained by direct sub-sampling of the chroma signal from the YUV 4:4:4 frame), aliasing artifacts can be seen on some content, for example, ClearType text content, when only the main view is used to generate output.
  • pre-processing and post-processing can help
  • the chroma signal is split into multiple areas, and each area may get compressed differently (e.g., with a different level of quantization) depending on its location. Because of this, when the chroma signal is assembled again by interleaving the data from multiple areas, artificial discontinuities and high-frequency noise may be introduced. A post-processing operation can help smooth the differences caused in these areas due to compression.
  • pre-processing can help enhance the compression of the YUV 4:2:0 frame representing the auxiliary view, which contains the remaining chroma information.
  • the pre-processing operations and post-processing operations are limited such that they affect only the chroma signal that is part of the YUV 4:2:0 frame representing the main view. That is, the filtered sample values are part of the chroma components of the main view.
  • pre-processing operations and post-processing operations can be based on the chroma sample location type (indicating chroma sample grid alignment with luma sample grid).
  • the chroma sample location type is determined from chroma_sample_loc_type_top_field and chroma_sample_loc_type_bottom_field syntax elements signaled as part of the compressed bitstream. (These two elements would ordinarily have equal values for progressive-scan source content.)
  • an odd-tap symmetric filter such as [1 2 1]/4, or [0.25 0.5 0.25], along with a rounding operation is used to filter chroma in that direction.
  • post-processing [0.125 0.375 0.375 0.125], along with a rounding operation.
  • the choice of post-processing operation is usually made such that the post-processing operation compensates for the pre-processing operation. In some cases post-processing directly inverts pre-processing, while in other cases post-processing only approximately inverts pre-processing, as explained below.
  • the chroma sample location type is 1 for chroma_sample_loc_type_top_field and chroma_sample_loc_type_bottom_field syntax elements
  • the chroma sample does not align with luma samples in either horizontal or vertical direction, and hence the filter [0.5 0.5] is applied in both the horizontal and vertical directions for the pre-processing operation.
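  • the equations for deriving the sample values for areas B2 and B3 are as follows.
  • $V^{\mathrm{main\_filt}}_{420}(x, y) = \left[ V_{444}(2x, 2y) + V_{444}(2x + 1, 2y) + V_{444}(2x, 2y + 1) + V_{444}(2x + 1, 2y + 1) \right] / 4$, with the result rounded to an integer sample value, and with $U^{\mathrm{main\_filt}}_{420}(x, y)$ derived in the same way from $U_{444}$.
  • the sample values at positions $U_{444}(2x, 2y)$ and $V_{444}(2x, 2y)$ from the YUV 4:4:4 frame are not represented directly in the main view (902); instead, filtered sample values ($U^{\mathrm{main\_filt}}_{420}(x, y)$ and $V^{\mathrm{main\_filt}}_{420}(x, y)$) are at positions in the main view (902).
  • the sample values at $U_{444}(2x + 1, 2y)$, $U_{444}(2x, 2y + 1)$, $U_{444}(2x + 1, 2y + 1)$, $V_{444}(2x + 1, 2y)$, $V_{444}(2x, 2y + 1)$ and $V_{444}(2x + 1, 2y + 1)$ from the YUV 4:4:4 frame are still represented directly in the auxiliary view (903) among the areas B4 ... B9.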
  • the sample values for position U 444 (2x, 2y) and V 444 (2x, 2y) of the YUV 4:4:4 frame can be calculated as U' 444 (2x, 2y) and
  • a 0.5
  • a 0.5
  • sample values for positions $U_{444}(2x, 2y)$ and $V_{444}(2x, 2y)$ of the YUV 4:4:4 frame can be calculated as $U'_{444}(2x, 2y)$ and $V'_{444}(2x, 2y)$, from values in the packed frame, as follows:
  • $V'_{444}(2x, 2y) = (1 + \alpha + \beta + \gamma) \cdot V^{\prime\prime\,\mathrm{main\_filt}}_{420}(x, y) - \alpha \cdot V''_{444}(2x + 1, 2y) - \beta \cdot V''_{444}(2x, 2y + 1) - \gamma \cdot V''_{444}(2x + 1, 2y + 1)$,
  • α, β and γ may be advisable in order to reduce perceptible artifacts.
  • α, β and γ should be in the range from 0.0 to 1.0, and α, β and γ should be smaller when the quantization step size is larger.
  • α, β and γ may exacerbate artifacts introduced due to lossy compression.
  • the values of α, β and γ can be designed for conditional optimality using cross-correlation analysis.
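  • with $\alpha = \beta = \gamma = 1$, the sample values for positions $U_{444}(2x, 2y)$ and $V_{444}(2x, 2y)$ of the YUV 4:4:4 frame can simply be calculated as $U'_{444}(2x, 2y)$ and $V'_{444}(2x, 2y)$, from values in the packed frame, as follows:
  • $V'_{444}(2x, 2y) = 4 \cdot V^{\prime\prime\,\mathrm{main\_filt}}_{420}(x, y) - V''_{444}(2x + 1, 2y) - V''_{444}(2x, 2y + 1) - V''_{444}(2x + 1, 2y + 1)$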
  • the sample values 29, 15, 7, and 18 for locations (2x, 2y), (2x+1, 2y), (2x, 2y+1) and (2x+1, 2y+1) are filtered to produce a sample value 17.25, which is rounded to 17.
  • the filtered sample value of 17 is used in place of the original sample value of 29.
  • the difference between the original sample value (29) and reconstructed sample value (28) shows loss of precision due to the filtering for the pre-processing operation.
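  • The numerical example above can be reproduced with a short sketch. The function names `prefilter` and `postfilter` are hypothetical, and the rounding rule (adding 2 before an integer division by 4) is an assumption that happens to agree with the rounding of 17.25 to 17; the weights `alpha`, `beta` and `gamma` default to 1, which corresponds to the simple inverse.

```python
def prefilter(s00, s10, s01, s11):
    """Average a 2x2 chroma neighbourhood and round to an integer.

    The '+ 2' rounding offset is an assumption made for this sketch.
    """
    return (s00 + s10 + s01 + s11 + 2) // 4


def postfilter(filt, s10, s01, s11, alpha=1, beta=1, gamma=1):
    """Estimate the sample at (2x, 2y) from the filtered value and the
    three neighbours carried in the auxiliary view."""
    return (1 + alpha + beta + gamma) * filt - alpha * s10 - beta * s01 - gamma * s11


filtered = prefilter(29, 15, 7, 18)            # 69 / 4 = 17.25, stored as 17
reconstructed = postfilter(filtered, 15, 7, 18)  # 4 * 17 - 15 - 7 - 18 = 28
```

  • The one-unit gap between 29 and 28 comes entirely from rounding the filtered value, before any loss introduced by encoding.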
  • a device can selectively skip filtering operations during post-processing, even when filtering was performed during pre-processing. For example, a device can skip filtering during post-processing to reduce the computational load of decoding and playback.
  • the pre-processing operations and post-processing operations are not limited to the chroma signal of the 4:4:4 frame that is part of the 4:2:0 frame representing the main view (for example, areas B2 and B3 for the frame 902 represented in Figure 9).
  • the pre-processing operations and post-processing operations are also performed for the chroma signal of the 4:4:4 frame that is part of the 4:2:0 frame representing the auxiliary view (for example, areas B4 to B9 of the frame 903 represented in Figure 9).
  • Such pre-processing and post-processing operations can use different filtering operations than the pre-processing and post-processing of the chroma signal of the 4:4:4 frame that is made part of the 4:2:0 frame representing the main view.
  • averaging filtering is used during pre-processing and corresponding filtering is used during post-processing.
  • the pre-processing operations and post-processing operations can implement a transform/inverse transform pair.
  • the transform/inverse transform pair can be one of the class of wavelet transformations, lifting transformations and other transformations.
  • Specific transforms can also be designed depending on use case scenarios, so as to satisfy the different design reasons mentioned above for the use of pre-processing operations and post-processing operations in the context of packing 4:4:4 frames.
  • the pre-processing and post-processing can use other filter structures, with other filter regions of support, or use filtering that is adaptive with respect to content and/or fidelity (e.g., adaptive with respect to the quantization step sizes used for the encoding).
  • the representation and/or compression of the frame-packed 4:2:0 content can use a higher sample bit depth than the original sample bit depth of the 4:4:4 content.
  • the sample bit depth of the 4:4:4 frames is 8 bits per sample
  • the sample bit depth of the frame-packed 4:2:0 frames is 10 bits per sample. This can help reduce precision loss during the application of pre-processing operations and post-processing operations. Or, this can help achieve a higher level of fidelity when 4:2:0 frames are encoded using lossy compression.
  • the bit depth of 10 bits per sample can be maintained in all or most internal modules of the encoder and decoder.
  • the sample bit depth can be reduced to 8 bits per sample, if necessary, after unpacking the content to 4:4:4 format at the receiving end.
  • the sample values of frames of the higher-resolution chroma sampling format can have a first bit depth (such as 8, 10, 12 or 16 bits per sample), while the sample values of frames of the lower-resolution chroma sampling format (following frame packing) have a second bit depth higher than the first bit depth.
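  • One way a higher bit depth can reduce precision loss is sketched below. This is a hypothetical illustration, not a scheme stated in the text: it assumes 8-bit input samples, a 2x2 averaging pre-filter, and lossless carriage of the values, and the function names `prefilter_10bit` and `postfilter_10bit` are invented for the sketch. Keeping the sum of four 8-bit samples (at most 1020) instead of the rounded average uses 10 bits and preserves the two fractional bits that rounding would discard.

```python
def prefilter_10bit(s00, s10, s01, s11):
    """Return the 2x2 average scaled by 4, which fits in 10 bits for 8-bit input."""
    total = s00 + s10 + s01 + s11
    assert 0 <= total <= 1023  # 4 * 255 = 1020, so 10 bits are enough
    return total


def postfilter_10bit(filt10, s10, s01, s11):
    """Recover the sample at (2x, 2y) exactly from the 10-bit filtered value
    and the three 8-bit neighbours (assumes no coding loss)."""
    return filt10 - s10 - s01 - s11
```

  • For the sample values 29, 15, 7 and 18, the 10-bit filtered value is 69 (that is, 17.25 on the 8-bit scale), and the inverse returns 29 rather than 28.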
  • YUV 4:4:4 frames are packed into YUV 4:2:0 frames for encoding and decoding.
  • YUV 4:2:2 frames are packed into YUV 4:2:0 frames for encoding and decoding.
  • a typical 4:2:2 frame contains 8 sample values for every 4 pixel positions, while a 4:2:0 frame contains only 6 sample values for every 4 pixel positions. So, the sample values contained in a 4:2:2 frame can be packed into 4/3 4:2:0 frames. That is, when packed efficiently, three 4:2:2 frames can be packed into four 4:2:0 frames.
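  • The frame-count arithmetic can be checked with exact fractions. The table and the function name `frames_needed` are illustrative; the counts are sample values per four pixel positions (one 2x2 block) for each chroma sampling format.

```python
from fractions import Fraction

# Sample values per 2x2 block of pixel positions (4 luma positions).
SAMPLES_PER_4_PIXELS = {"4:4:4": 12, "4:2:2": 8, "4:2:0": 6, "4:0:0": 4}


def frames_needed(src_format, dst_format):
    """Number of destination-format frames needed to hold one source frame
    of the same width and height, assuming no wasted sample positions."""
    return Fraction(SAMPLES_PER_4_PIXELS[src_format], SAMPLES_PER_4_PIXELS[dst_format])


ratio_422 = frames_needed("4:2:2", "4:2:0")  # Fraction(4, 3)
ratio_444 = frames_needed("4:4:4", "4:2:0")  # Fraction(2, 1)
```

  • The ratio of 4/3 is why three 4:2:2 frames fill four 4:2:0 frames exactly, while packing a single 4:2:2 frame into two 4:2:0 frames leaves some areas without data.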
  • the frame packing for 4:2:2 frames is done in a simple manner similar to the simple approach (800) illustrated in Figure 8 for 4:4:4 to 4:2:0 frame packing.
  • a YUV 4:2:2 frame is packed into YUV 4:2:0 frames while maintaining geometric correspondence for chroma information of the YUV 4:2:2 frame.
  • the resulting YUV 4:2:0 frames with good geometric correspondence among their Y, U and V components can be compressed better because they fit the model expected by a typical encoder adapted to encode YUV 4:2:0 frames.
  • the packing can be done such that a YUV 4:2:0 frame represents the complete scene being represented by the YUV 4:2:2 frame, albeit with color components at a lower resolution.
  • the auxiliary view will have "empty" areas, but these areas can be filled using a fixed value or by replicating chroma values. Or, the empty areas can be used to indicate other information such as depth of a scene.
  • for the approach (900) described with reference to Figure 9, the approach (900) can be used as is, except that the areas B4 and B5 will not have data.
  • for the approach (1100) described with reference to Figure 11, the approach (1100) can be used as is, except that the areas A4 and A5 will not have data.
  • new values for content_interpretation_type are defined to signal the packing of YUV 4:2:2 frames into the constituent YUV 4:2:0 frames, as shown in the following table.
  • a device can pack frames of a higher-resolution non-YUV chroma sampling format (such as RGB 4:4:4 or GBR 4:4:4) into frames of a lower resolution format (such as a 4:2:0 format), which may then be encoded.
  • a device unpacks frames of the lower resolution format (such as a 4:2:0 format) into frames of the higher-resolution non-YUV chroma sampling format (such as RGB 4:4:4 or GBR 4:4:4).
  • the described approaches can be used for frame packing of video content of a 4:4:4 format, 4:2:2 format or 4:2:0 format into a 4:0:0 format, which is typically used for gray scale or monochrome video content.
  • the chroma information from a frame of the 4:4:4 format, 4:2:2 format or 4:2:0 format can be packed into the primary component of one or more additional or auxiliary frames of 4:0:0 format.
  • Figure 12 shows a generalized technique (1200) for frame packing.
  • the device packs (1210) one or more frames of a higher-resolution chroma sampling format into one or more frames of a lower-resolution chroma sampling format.
  • the device packs frame(s) of 4:4:4 format (e.g., YUV 4:4:4 format) into frame(s) of 4:2:0 format (e.g., YUV 4:2:0 format).
  • the device packs frame(s) of 4:2:2 format (e.g., YUV 4:2:2 format) into frame(s) of 4:2:0 format (e.g., YUV 4:2:0 format).
  • the device packs frame(s) of 4:4:4 format (e.g., YUV 4:4:4 format) into frame(s) of 4:2:2 format (e.g., YUV 4:2:2 format).
  • the device can perform the frame packing (1210) so as to maintain geometric correspondence between adjacent sample values of chroma
  • sample values are maintained as adjacent samples and/or collocated portions of luma and chroma components of the frame(s) of the lower-resolution chroma sampling format. Later encoding can exploit such geometric correspondence.
  • the device can embed a lower chroma resolution version of the frame(s) of the higher-resolution chroma sampling format as part of the frame(s) of the lower-resolution chroma sampling format.
  • part of the frame(s) of the lower-resolution chroma sampling format represents a lower chroma resolution version of the frame(s) of the higher-resolution chroma sampling format.
  • the rest of the frame(s) of the lower-resolution chroma sampling format represents remaining chroma information from the frame(s) of the higher-resolution chroma sampling format.
  • the device assigns sample values of chroma components of the frame(s) of the higher-resolution chroma sampling format to luma and chroma components of the frame(s) of the lower-resolution chroma sampling format.
  • the sample values of chroma components of the frame(s) of the higher-resolution chroma sampling format can be filtered, and filtered sample values are assigned to parts of chroma components of the frame(s) of the lower-resolution chroma sampling format.
  • the sample values of the chroma components of the frame(s) of the higher-resolution chroma sampling format have a lower bit depth (e.g., 8 bits per sample), and the filtered sample values have a higher bit depth (e.g., 10 bits per sample) for encoding at the higher bit depth.
  • the device can then encode (1220) the frame(s) of the lower-resolution chroma sampling format.
  • a different device performs the encoding (1220).
  • the device(s) can repeat the technique (1200) on a frame-by-frame basis or other basis.
  • the device can signal metadata about frame packing/unpacking. For example, the device signals metadata that indicates whether frame packing/unpacking is used or not used. Or, the device signals an indication that the sample values of the chroma components of the frame(s) of the higher-resolution chroma sampling format have been filtered during the frame packing, and should be filtered as part of post-processing.
  • the metadata about frame packing/unpacking can be signaled as part of a supplemental enhancement information message or as some other type of metadata.
  • Figure 13 shows a generalized technique (1300) for frame unpacking.
  • Before the frame unpacking itself, the device can decode (1310) the frame(s) of a lower-resolution chroma sampling format. Alternatively, a different device performs the decoding (1310).
  • the device unpacks (1320) one or more frames of the lower-resolution chroma sampling format into one or more frames of a higher-resolution chroma sampling format.
  • the device unpacks frame(s) of 4:2:0 format (e.g., YUV 4:2:0 format) into frame(s) of 4:4:4 format (e.g., YUV 4:4:4 format).
  • the device unpacks frame(s) of 4:2:0 format (e.g., YUV 4:2:0 format) into frame(s) of 4:2:2 format (e.g., YUV 4:2:2 format).
  • the device unpacks frame(s) of 4:2:2 format (e.g., YUV 4:2:2 format) into frame(s) of 4:4:4 format (e.g., YUV 4:4:4 format).
  • When a lower chroma resolution version of the frame(s) of the higher-resolution chroma sampling format is embedded as part of the frame(s) of the lower-resolution chroma sampling format, the device has options for display.
  • the part of the frame(s) of the lower-resolution chroma sampling format that represents a lower chroma resolution version of the frame(s) of the higher-resolution chroma sampling format can be reconstructed for output and display.
  • the rest of the frame(s) of the lower-resolution chroma sampling format represents remaining chroma information from the frame(s) of the higher-resolution chroma sampling format, and can be used as part of frame unpacking.
  • the device assigns sample values of luma and chroma components of the frame(s) of the lower-resolution chroma sampling format to chroma components of the frame(s) of the higher-resolution chroma sampling format.
  • the sample values of chroma components of the frame(s) of the higher-resolution chroma sampling format can be filtered as part of post-processing.
  • at least some sample values of the chroma components of the frame(s) of the higher-resolution chroma sampling format have a higher bit depth (e.g., 10 bits per sample) before the post-processing filtering, and such sample values have a lower bit depth (e.g., 8 bits per sample) after the post-processing filtering.
  • the device can also receive metadata about frame packing/unpacking. For example, the device receives metadata that indicates whether frame packing/unpacking is used or not used. Or, the device receives an indication that the sample values of the chroma components of the frame(s) of the higher-resolution chroma sampling format have been filtered during the frame packing, and should be filtered as part of post-processing.
  • the metadata about frame packing/unpacking can be signaled as part of a supplemental enhancement information message or as some other type of metadata.
  • the device(s) can repeat the technique (1300) on a frame-by-frame basis or other basis.
  • band separation filtering and inverse filtering are used for pre-processing and post-processing, respectively.
  • the band separation filtering and inverse filtering are wavelet decomposition and reconstruction.
  • Frame packing can include wavelet decomposition (analysis) as part of pre-processing operations to filter chroma sample values.
  • Corresponding frame unpacking then includes wavelet reconstruction (synthesis) as part of post-processing to inverse filter the chroma sample values.
  • a YUV 4:4:4 frame can be packed into two YUV 4:2:0 frames while maintaining geometric correspondence for chroma information of the YUV 4:4:4 frame.
  • YUV 4:2:0 frames with good geometric correspondence among their Y, U and V components can be compressed better because they fit the model expected by a typical encoder adapted to encode YUV 4:2:0 frames.
  • the packing can also be done such that one of the two YUV 4:2:0 frames represents the complete scene being represented by YUV 4:4:4 frame, albeit with color components at a lower resolution. This provides options in decoding.
  • a decoder that cannot perform frame unpacking, or chooses not to perform frame unpacking, can just take a reconstructed version of the YUV 4:2:0 frame that represents the scene and directly feed it to the display.
  • the auxiliary frame (with the remaining chroma information from the frames of the higher-resolution chroma sampling format) can be considered an enhancement layer signal to be combined with the main frame.
  • Primary signal energy can be concentrated into the main frame (by dedicating more bits to the main frame during encoding), with the auxiliary frame consuming an arbitrarily low number of bits for the enhancement layer signal during encoding.
  • the auxiliary frame is encoded at a lower bit rate (lower quality relative to the main frame)
  • the chroma information from the main frame sets the minimum level of quality for the chroma components after
  • pre-processing operations include three-band wavelet decomposition for frame packing of video content of a 4:4:4 format into a 4:2:0 format. For example, suppose the height H of three original arrays of sample values in YUV 4:4:4 format is a multiple of 4, and suppose the width W of the three original arrays of sample values in YUV 4:4:4 format is also a multiple of 4.
  • Figure 14 illustrates an approach (1400) to frame packing with three-band wavelet decomposition as pre-processing.
  • a YUV 4:4:4 frame (801) is packed into two YUV 4:2:0 frames (1402, 1403).
  • the first frame (1402) provides a "main view" in YUV 4:2:0 format - a lower chroma resolution version of the complete scene represented by the YUV 4:4:4 frame (801).
  • the second frame (1403) provides an "auxiliary view" in YUV 4:2:0 format and contains remaining chroma information.
  • the wavelet decomposition for a first stage is defined as:
  • the plane U is decomposed into bands Cu and Du.
  • the plane V is decomposed into bands Cv and Dv.
  • C and D are a vertically lowpass filtered version and vertically highpass filtered version, respectively, for the plane U or the plane V.
  • the respective vertical lowpass bands (Cu and Cv) are further decomposed by horizontal wavelet decomposition.
  • the lowpass band Cu is decomposed into bands Eu and Fu.
  • the lowpass band Cv is decomposed into bands Ev and Fv.
  • E and F are a horizontally lowpass filtered version and horizontally highpass filtered version, respectively, of a vertically lowpass filtered band.
  • E and F can be referred to as LL and LH wavelet bands.
  • the respective bands Du, Dv, Eu, Ev, Fu and Fv
  • B indicates bit depth.
  • in some implementations, however, the bit depth can vary between input sample values and the results of pre-processing.
  • the sample values of frames of the higher-resolution chroma sampling format can have a first bit depth (such as 8, 10, 12 or 16 bits per sample), while the sample values of frames of the lower-resolution chroma sampling format (following frame packing) have a second bit depth higher than the first bit depth.
  • the pre-processing operations can include clipping of some signals to the range from 0 to $2^B - 1$ in some cases.
  • the sample values can be assigned to the areas in a different way.
  • data from the original U plane of the YUV 4:4:4 frame can be arranged in the U plane of the auxiliary YUV 4:2:0 frame
  • data from the original V plane of the YUV 4:4:4 frame can be arranged in the V plane of the auxiliary YUV 4:2:0 frame.
  • the sample values from the odd rows of Fu are assigned to the lower half of $U^{\mathrm{aux}}_{420}$
  • the sample values from even rows of Fv are assigned to the top half of $V^{\mathrm{aux}}_{420}$.
  • the sample values Fu are simply assigned to $U^{\mathrm{aux}}_{420}$
  • the sample values from Fv are simply assigned to $V^{\mathrm{aux}}_{420}$.
  • the U plane (or V plane) of the auxiliary YUV 4:2:0 frame is constructed from the U plane (or V plane) of the YUV 4:4:4 frame, without mixing content from different original U and V planes.
  • the bands D, E, and F form a three-band wavelet decomposition of the input array A, which may represent chroma sample values for plane U or V.
  • the HL and HH bands are not created in the three-band decomposition. Instead, the vertically highpass signal D is kept at full horizontal resolution.
  • the bands D, E, and F can be called the H, LL and LH bands.
  • the first letter indicates a vertical highpass (H) or lowpass (L) signal decimation
  • the second letter (when present) indicates a horizontal highpass or lowpass signal decimation.
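The three-band decomposition described above can be sketched as follows. This is a minimal illustration, not the patent's exact equations (which did not survive in this text): it assumes unnormalized Haar sums and differences, omits rounding, clipping and offsets, and the function name `three_band_decompose` is hypothetical.

```python
def three_band_decompose(plane):
    """Split an H x W chroma plane into bands D (H), E (LL) and F (LH).

    Assumption: unnormalized Haar sums/differences; H and W must be even.
    """
    height, width = len(plane), len(plane[0])
    # Stage 1 (vertical): combine rows 2y and 2y+1 into lowpass C and highpass D.
    c = [[plane[2 * y][x] + plane[2 * y + 1][x] for x in range(width)]
         for y in range(height // 2)]
    d = [[plane[2 * y][x] - plane[2 * y + 1][x] for x in range(width)]
         for y in range(height // 2)]
    # Stage 2 (horizontal): only the vertical lowpass band C is split further,
    # so D keeps its full horizontal resolution.
    e = [[row[2 * x] + row[2 * x + 1] for x in range(width // 2)] for row in c]
    f = [[row[2 * x] - row[2 * x + 1] for x in range(width // 2)] for row in c]
    return d, e, f


u_plane = [[10, 12, 14, 16],
           [10, 12, 14, 16],
           [50, 50, 60, 60],
           [54, 54, 64, 64]]
band_d, band_e, band_f = three_band_decompose(u_plane)
```

The three bands together hold H x W values (H/2 x W for D, H/2 x W/2 each for E and F), the same count as the input plane.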
  • the first frame (1402) and second frame (1403) of the YUV 4:2:0 format can be organized as separate frames (separated by the dark line in Figure 14). Or, the first frame (1402) and second frame (1403) of the YUV 4:2:0 format can be organized as a single frame having a height of 2 x H (ignoring the dark line in Figure 14). Or, the first frame (1402) and second frame (1403) of the YUV 4:2:0 format can be organized as a single frame having a width of 2 x W.
  • first frame (1402) and second frame (1403) of the YUV 4:2:0 format can be organized as a single frame using any of the methods defined for frame_packing_arrangement_type in the H.264/AVC standard or the HEVC standard.
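The single-frame arrangements above (height 2 x H or width 2 x W) amount to stacking the planes of the two frames. A minimal sketch for one plane, where `stack_frames` and its `side_by_side` parameter are hypothetical names:

```python
def stack_frames(main_plane, aux_plane, side_by_side=False):
    """Combine two equally sized planes into one plane.

    side_by_side=False stacks top-bottom (height 2 x H);
    side_by_side=True places them left-right (width 2 x W).
    """
    if side_by_side:
        return [m_row + a_row for m_row, a_row in zip(main_plane, aux_plane)]
    return [list(row) for row in main_plane] + [list(row) for row in aux_plane]


main_y = [[1, 2], [3, 4]]
aux_y = [[5, 6], [7, 8]]
top_bottom = stack_frames(main_y, aux_y)
left_right = stack_frames(main_y, aux_y, side_by_side=True)
```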
  • Figure 15 shows another example approach (1500) to frame packing with three-band wavelet decomposition as pre-processing.
  • a YUV 4:4:4 frame (801) is packed into two YUV 4:2:0 frames (1502, 1503).
  • the first frame (1502) provides a "main view" in YUV 4:2:0 format
  • the second frame (1503) provides an "auxiliary view" in YUV 4:2:0 format and contains remaining chroma information.
  • Figure 14 shows a horizontally-oriented decomposition, such that $Y^{\mathrm{aux}}_{420}$ contains full-width, half-height U and V regions.
  • Figure 15 illustrates a corresponding vertically-oriented case in which $Y^{\mathrm{aux}}_{420}$ contains half-width, full-height U and V regions.
  • the sample values can be assigned to the areas of Figure 15 in a different way.
  • data from the original U plane of the YUV 4:4:4 frame can be arranged in the U plane of the auxiliary YUV 4:2:0 frame
  • data from the original V plane of the YUV 4:4:4 frame can be arranged in the V plane of the auxiliary YUV 4:2:0 frame.
  • the sample values from the odd columns of Fu are assigned to the right half of $U^{\mathrm{aux}}_{420}$
  • LPF: lowpass filter
  • HPF: highpass filter
  • the numbers in brackets indicate filter taps, and the denominator indicates a normalization factor.
  • the filter pair is applied both horizontally and vertically, with a centered sampling phase horizontally and vertically (mid-point values derived by filtering), and with the division by two from each stage postponed to follow both the horizontal and vertical filtering.
  • the wavelet decomposition uses a filter pair that differs in terms of filter taps, filter phase, normalization factor, timing of normalization, use of rounding, use of clipping, etc.
  • pre-processing operations include four-band wavelet decomposition for frame packing of video content of a 4:4:4 format into a 4:2:0 format. Again, suppose the height H of three original arrays of sample values in YUV 4:4:4 format is a multiple of 4, and suppose the width W of the three original arrays of sample values in YUV 4:4:4 format is also a multiple of 4.
  • the vertically lowpass filtered/horizontally highpass filtered band $F_A(x, y)$ is vertically decimated when assigning sample values to $U^{\mathrm{aux}}_{420}$ and $V^{\mathrm{aux}}_{420}$. Alternate rows are assigned to $U^{\mathrm{aux}}_{420}$ and $V^{\mathrm{aux}}_{420}$ from $F_A(x, y)$.
  • band $F_A(x, y)$ is further decomposed by vertical wavelet decomposition into a lowpass band $G_A(x, y)$ and highpass band $H_A(x, y)$.
  • Figure 16b continues the first part of Figure 14 but includes a different first frame (1604) and second frame (1605) of YUV 4:2:0 format.
  • some of the respective bands (Du, Dv, Eu and Ev) are arranged as sections of the YUV 4:2:0 frame(s) as described with reference to Figure 14.
  • the remaining bands (Gu, Gv, Hu and Hv) are arranged as sections of the YUV 4:2:0 frame(s) as follows:
  • the U plane (or V plane) of the auxiliary YUV 4:2:0 frame has a mixture of data from the U and V components of the YUV 4:4:4 frame.
  • the upper half of the U plane (or V plane) of the auxiliary YUV 4:2:0 frame contains data from the original U plane, and the lower half contains data from the original V plane.
  • the bands D, E, G, and H form a four-band wavelet decomposition of the input array A, which may represent chroma sample values for plane U or V.
  • the bands D, E, G, and H correspond to H, LL, LHL and LHH bands.
  • the first letter indicates a first vertical highpass (H) or lowpass (L) signal decimation
  • the second letter when present indicates a horizontal highpass or lowpass signal decimation
  • the third letter (when present) indicates an additional vertical highpass or lowpass signal decimation of the LH band.
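The extra stage that turns the three-band scheme into the four-band scheme can be sketched as a vertical split of the LH band F into G (LHL) and H (LHH). This is an illustrative sketch assuming unnormalized Haar sums and differences; `split_lh_band` is a hypothetical name, and rounding, clipping and offsets are omitted.

```python
def split_lh_band(band_f):
    """Vertically decompose the LH band F into G (LHL) and H (LHH).

    Assumption: unnormalized Haar sums/differences; F needs an even row count,
    which holds when the original plane height is a multiple of 4.
    """
    rows, cols = len(band_f), len(band_f[0])
    g = [[band_f[2 * y][x] + band_f[2 * y + 1][x] for x in range(cols)]
         for y in range(rows // 2)]
    h = [[band_f[2 * y][x] - band_f[2 * y + 1][x] for x in range(cols)]
         for y in range(rows // 2)]
    return g, h


band_g, band_h = split_lh_band([[-4, -4], [0, 2]])
```

Compared with assigning alternate rows of F directly, this replaces plain decimation with a filtered lowpass/highpass pair.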
  • the horizontally lowpass filtered/vertically highpass filtered band $F_A(x, y)$ is horizontally decimated when assigning sample values to $U^{\mathrm{aux}}_{420}$ and $V^{\mathrm{aux}}_{420}$. Alternate columns are assigned to $U^{\mathrm{aux}}_{420}$ and $V^{\mathrm{aux}}_{420}$ from $F_A(x, y)$. This assignment happens without first applying anti-alias filtering. Alternatively, additional filtering can be applied to sample values of the band $F_A(x, y)$. In particular, the band $F_A(x, y)$ is further decomposed by horizontal wavelet decomposition into a lowpass band $G_A(x, y)$ and highpass band $H_A(x, y)$.
  • the wavelet decomposition uses a Haar wavelet filter pair.
  • the filter pair is applied horizontally and vertically, with a centered sampling phase horizontally and vertically (mid-point values derived by filtering), and with division by two from each stage postponed to follow the final filtering stage.
  • the wavelet decomposition uses a filter pair that differs in terms of filter taps, filter phase, normalization factor, timing of normalization, use of rounding, use of clipping, etc.
  • pre-processing operations include wavelet decomposition for frame packing of video content of a 4:4:4 format into a 4:2:2 format.
  • wavelet decomposition for a first stage is defined as:
  • the plane U is decomposed into bands Cu and Du
  • the plane V is decomposed into bands Cv and Dv.
  • C and D are a horizontally lowpass filtered version and horizontally highpass filtered version, respectively, for the plane U or the plane V.
  • the respective bands (Cu, Cv, Du and Dv) are arranged as sections of the YUV 4:2:2 frame(s) as follows.
  • the "fabricated" signal $Y^{\mathrm{aux}}_{422}(x, y)$ can simply be discarded after decoding.
  • the sample values of the highpass filtered version can be assigned to the luma component $Y^{\mathrm{aux}}_{422}(x, y)$ of the auxiliary frame, while the chroma components $U^{\mathrm{aux}}_{422}(x, y)$ and $V^{\mathrm{aux}}_{422}(x, y)$ of the auxiliary frame are assigned fabricated values that can be discarded after decoding.
  • the wavelet decomposition uses a Haar wavelet filter pair.
  • the filter pair is applied horizontally, with a centered sampling phase (mid-point values derived by filtering).
  • the wavelet decomposition uses a filter pair that differs in terms of filter taps, filter phase, normalization factor, timing of normalization, use of rounding, use of clipping, etc.
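One way to picture the 4:4:4 to 4:2:2 case is the sketch below. The exact band arrangement is not recoverable from this text, so the layout here is an assumption: lowpass bands go to the chroma planes of the main frame, highpass bands go to the chroma planes of the auxiliary frame, and the auxiliary luma plane is filled with a fabricated constant. The names `pack_444_to_422` and `fill` are hypothetical, and normalization, offsets and clipping are omitted.

```python
def pack_444_to_422(y, u, v, fill=128):
    """Pack one 4:4:4 frame into main and auxiliary 4:2:2 frames (sketch)."""
    def split_rows(plane):
        # One horizontal Haar stage with unnormalized sums and differences.
        low = [[r[2 * x] + r[2 * x + 1] for x in range(len(r) // 2)]
               for r in plane]
        high = [[r[2 * x] - r[2 * x + 1] for x in range(len(r) // 2)]
                for r in plane]
        return low, high

    c_u, d_u = split_rows(u)
    c_v, d_v = split_rows(v)
    # Fabricated luma for the auxiliary frame; it is discarded after decoding.
    fabricated_y = [[fill] * len(row) for row in y]
    main = {"Y": y, "U": c_u, "V": c_v}
    aux = {"Y": fabricated_y, "U": d_u, "V": d_v}
    return main, aux


main_frame, aux_frame = pack_444_to_422(
    y=[[1, 2, 3, 4]], u=[[10, 12, 20, 26]], v=[[5, 5, 7, 3]])
```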
  • pre-processing operations include wavelet decomposition for frame packing of video content of a 4:2:2 format into a 4:2:0 format.
  • the width of the original U and V arrays is W / 2
  • the height of all three original arrays is H.
  • the wavelet decomposition for a first stage is defined as:
  • the plane U is decomposed into bands Cu and Du
  • the plane V is decomposed into bands Cv and Dv.
  • C and D are a vertically lowpass filtered version and vertically highpass filtered version, respectively, for the plane U or the plane V.
  • the respective bands (Cu, Cv, Du and Dv) are arranged as sections of the YUV 4:2:0 frame(s), as follows.
  • the "fabricated" signal $Y^{\mathrm{aux}}_{420}(x, y)$ can simply be discarded after decoding.
  • the sample values of the highpass filtered version can be assigned to the upper half of the luma component $Y^{\mathrm{aux}}_{420}(x, y)$ of the auxiliary frame, while the chroma components $U^{\mathrm{aux}}_{420}(x, y)$ and $V^{\mathrm{aux}}_{420}(x, y)$ of the auxiliary frame, and the lower half of $Y^{\mathrm{aux}}_{420}(x, y)$, are assigned fabricated values that can be discarded after decoding.
  • the wavelet decomposition uses a Haar wavelet filter pair.
  • the filter pair is applied vertically, with a centered sampling phase (mid-point values derived by filtering).
  • the wavelet decomposition uses a filter pair that differs in terms of filter taps, filter phase, normalization factor, timing of normalization, use of rounding, use of clipping, etc.
  • the numbers in brackets indicate filter taps, and the denominator indicates a normalization factor.
  • the filter pair is applied horizontally and/or vertically, with a centered sampling phase where filtering is performed (mid-point values derived by filtering), and with the division by two from each stage postponed to follow the final filtering stage.
  • the filter used for wavelet decomposition or other band separation filtering depends on implementation.
  • the filters used can vary in terms of (a) filter taps, (b) normalization factor, (c) how normalization occurs (e.g., after each filtering stage, or deferred partially or completely for one or more filtering stages), (d) whether bit depth expansion is permitted (e.g.
  • a filter pair has low complexity but also provides good performance in terms of decorrelation and hence compression.
  • the filter taps of a filter pair can be based on the Haar wavelet.
  • a different filter is used for wavelet decomposition or other band separation filtering during pre-processing.
  • the decomposition can use a symmetric biorthogonal Daubechies wavelet filter pair.
  • This filter has the same sampling phase (mid-point values derived by filtering).
  • a filter pair from another family of filters can be used.
  • the filter pair can be based on quadrature mirror filters, although this may involve filters that do not achieve the property of perfect signal reconstruction in the absence of quantization and rounding error.
  • the filter pair can be based on polyphase quadrature filters, which are used in AAC audio coding/decoding.
  • the filter pair can be based on conjugate quadrature filters, which are the basis of many orthogonal filter banks including those using Daubechies orthogonal wavelets. Any of the filter pairs can be scaled or unscaled, depending on implementation.
  • the normalization factor generally depends on the filter taps and is
  • the normalization factor can be implemented as a division after each filtering stage. Or, some of the normalization can be partially or completely deferred for one or more filtering stages, such that division at a later stage (e.g., final stage)
  • the wavelet decomposition or other band separation filtering can permit some amount of bit depth expansion in the sample values of the bands, so as to reduce rounding/truncation error and potentially permit exact invertibility.
  • division by 2 can be skipped when forming arrays from the sample values of the $D_A$ arrays, or division by 2 can be used instead of division by 4 when forming arrays from the sample values of the $F_A$ arrays. (In such cases, clipping can be applied to avoid violation of the 0 to $2^B - 1$ range, if appropriate.)
  • the division operation may include nearest-integer rounding.
  • n ordinarily represented as a two's complement binary number
  • denominator that is equal to $2^k$ for some k > 1
  • a rounding factor can be included when dividing by the k-th power of two (i.e., division by $2^k$), taking the form $(n + 2^{(k-1)} - p) \gg k$, where the value p is a rounding factor.
  • Different patterns of values can be used for p. For example, the value of p alternates between 0 and 1 following a 2D periodic pattern such as a repeating MxM block pattern. Or, the value of p alternates between 0 and 1 following a pseudo-random basis. Or, the value of p alternates between 0 and 1 following a "blue noise dither" signal or other dither signal.
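The rounded division by a power of two, with p varying by position, can be sketched as follows. The 2x2 checkerboard used for p is only one hypothetical instance of a periodic MxM pattern; `divide_pow2` and `checkerboard_p` are illustrative names.

```python
def divide_pow2(n, k, p=0):
    """Return (n + 2**(k - 1) - p) >> k for an integer n and k >= 1.

    Python's >> on integers is an arithmetic shift, so negative n
    (two's complement semantics) is handled consistently.
    """
    return (n + (1 << (k - 1)) - p) >> k


def checkerboard_p(x, y):
    """Hypothetical 2x2 periodic pattern: p alternates between 0 and 1."""
    return (x + y) & 1


# Dividing the value 5 by 2 at four neighboring positions in one row:
# the half-way case rounds up where p == 0 and down where p == 1.
dithered = [divide_pow2(5, 1, checkerboard_p(x, 0)) for x in range(4)]
```

With p fixed at 0 the form reduces to ordinary nearest-integer rounding; alternating p spreads the rounding direction of half-way values across positions.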
  • the pre-processing operations include clipping to prevent dynamic range expansion.
  • clipping can simplify implementation, but it may introduce distortion when processing some pathological input signal values.
  • the LPF kernel [-1 1 8 8 1 -1] / 16 could produce negative output values when processing unsigned input data. Since image and video signals are typically highly statistically correlated, however, it is usually very rare for such a negative output value to occur in practice. In fact, it may be so rare that simply clipping the output of the LPF to not allow negative output values is a sensible design choice.
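A small numeric check shows how the kernel above can go negative on a pathological input and how clipping handles it. This sketch assumes division by 16 is a plain arithmetic right shift without rounding; `lpf_output` is a hypothetical helper.

```python
LPF_TAPS = (-1, 1, 8, 8, 1, -1)  # normalization factor is 16


def lpf_output(window, clip=True):
    """Apply the 6-tap LPF to one window of six unsigned sample values."""
    total = sum(tap * sample for tap, sample in zip(LPF_TAPS, window))
    value = total >> 4  # divide by 16 (assumption: no rounding offset)
    return max(0, value) if clip else value


smooth = lpf_output((100, 100, 100, 100, 100, 100))
pathological_raw = lpf_output((255, 0, 0, 0, 0, 255), clip=False)
pathological_clipped = lpf_output((255, 0, 0, 0, 0, 255))
```

The pathological window (bright outer samples, dark center) is the kind of uncorrelated input that is uncommon in natural image and video content.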
  • sample values of Y, U, and V signals have a range from 0 to $2^B - 1$.
  • sample values can have some other range.
  • sample values have a range from $-2^{(B-1)}$ to $2^{(B-1)} - 1$.
  • the offsets of $2^{(B-1)}$ are not used in the above equations.
  • the pre-processing operations can include edge-handling stages. For example, edge values can be padded outward from an edge. Or, edge values can be mirrored at an edge. Or, values can be repeated using modulo/circular wrapping.
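The three edge-handling options can be expressed as index mappings for a filter that reads past the end of a row or column. In this sketch the mode names and `edge_index` are hypothetical, and the mirror variant shown repeats the edge sample, which is only one of several possible mirroring conventions.

```python
def edge_index(i, n, mode):
    """Map a possibly out-of-range index i onto the valid range 0..n-1."""
    if mode == "pad":       # replicate the edge value outward
        return min(max(i, 0), n - 1)
    if mode == "mirror":    # reflect about the edge, repeating the edge sample
        period = 2 * n
        i %= period
        return i if i < n else period - 1 - i
    if mode == "wrap":      # modulo/circular wrapping
        return i % n
    raise ValueError(f"unknown mode: {mode}")


row = [10, 20, 30, 40]
outside = (-2, -1, 4, 5)
padded = [row[edge_index(i, len(row), "pad")] for i in outside]
mirrored = [row[edge_index(i, len(row), "mirror")] for i in outside]
wrapped = [row[edge_index(i, len(row), "wrap")] for i in outside]
```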
  • lifting may be used to reduce rounding/truncation error and potentially permit exact invertibility. Lifting can be used in combination with clipping.
  • the filters of a Haar wavelet filter pair are adapted to use lifting and optionally clipping.
  • the bands $C^{\mathrm{lifted}}_A(x, y)$ and $D_A(x, y)$ serve the same purpose (lowpass and highpass representations) as the bands $C_A(x, y)$ and $D_A(x, y)$ in the example described with reference to Figure 14.
  • the dynamic range of $C^{\mathrm{lifted}}_A(x, y)$ is only B bits rather than B + 1 bits, as $C^{\mathrm{lifted}}_A(x, y)$ is approximately equal to only half the value of the $C_A(x, y)$ band.
  • the highpass signal can be clipped as $D^{\mathrm{clipped}}_A(x, y) = \min(\max(-2^{(B-1)}, D_A(x, y)), 2^{(B-1)} - 1)$, instead of simply encoding $D_A(x, y)$ as the highpass signal.
  • This operation clips the sample values of $D_A(x, y)$ to be in the range of $-2^{(B-1)}$ to $2^{(B-1)} - 1$.
  • a corresponding decoder inverts the band separation filtering using $D^{\mathrm{clipped}}_A(x, y)$ (or a lossy-decoded approximation thereof), instead of using $D_A(x, y)$ (or a lossy-decoded approximation thereof). For typical video content, cases where this clipping introduces distortion are rare, and the result will still be an
  • a sample value of the band $D^{\mathrm{clipped}}_A(x, y)$ is offset by adding the constant offset $2^{(B-1)}$ to its value, to produce a signal with a range from 0 to $2^B - 1$, if this was the original input signal range.
  • the operation is fully reversible except when the clipping affects the signal, and the lowpass signal ($C^{\mathrm{lifted}}_A(x, y)$) has the same dynamic range as the input signal. Expansion of the dynamic range of the highpass signal ($D^{\mathrm{clipped\_offset}}_A(x, y)$) is prevented by the clipping.
  • the clipping can introduce distortion, but cases where clipping actually limits the highpass signal are expected to be rare, and in any case a clipped highpass signal still provides an enhancement signal.
  • the constant offset can be subtracted when inverting the transformation after decoding.
  • Using clipping to limit the dynamic range of the highpass signal can also be applied without using lifting, although this may involve sacrificing exact invertibility. Also, the use of a half-magnitude lowpass signal can also be applied without using lifting, although this may involve sacrificing exact invertibility.
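The combination of lifting, clipping and the constant offset can be sketched for a single pair of vertically adjacent samples. The patent's own lifting equations are not present in this text, so the sketch assumes a standard Haar lifting step (difference first, then lowpass updated by half the difference); `lift_forward` and `lift_inverse` are hypothetical names.

```python
def lift_forward(a, b, bit_depth=8):
    """Return (lowpass, offset clipped highpass) for samples a and b."""
    half = 1 << (bit_depth - 1)
    d = a - b                    # highpass; needs about B + 1 bits
    c_lifted = b + (d >> 1)      # equals floor((a + b) / 2); fits in B bits
    d_clipped = min(max(-half, d), half - 1)
    return c_lifted, d_clipped + half   # offset moves range to 0 .. 2**B - 1


def lift_inverse(c_lifted, d_offset, bit_depth=8):
    """Invert lift_forward; exact unless clipping changed the highpass value."""
    d = d_offset - (1 << (bit_depth - 1))   # subtract the constant offset
    b = c_lifted - (d >> 1)
    return d + b, b


exact = lift_inverse(*lift_forward(91, 100))
distorted = lift_inverse(*lift_forward(255, 0))
```

For the pair (91, 100) the round trip is exact; for the extreme pair (255, 0) the difference exceeds the clipping range, so the reconstruction is only an approximation.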
  • lifting and clipping operations are described for a first-step vertical filtering stage using Haar wavelet filters as LPF and HPF, as described with reference to Figure 14.
  • the same lifting and clipping techniques can be applied for other stages (e.g., horizontally) and with other LPF and HPF kernels.
  • the H band is not subjected to additional wavelet decomposition after the first filtering stage.
  • the H band is subjected to additional wavelet decomposition after the first filtering stage, so as to yield LL, LH, HL and HH bands, whose values are then assigned to main and auxiliary frames in the lower-resolution chroma sampling format.
  • intermediate and/or final results of encoding of the main frame of the lower-resolution chroma sampling format can be used during encoding of the auxiliary frame of the lower-resolution chroma sampling format.
  • an encoder can use motion vectors (and potentially macroblock/sub-macroblock/sub-macroblock-partition information and/or CU/PU/merge/TU segmentation information) from encoding of the main frame to assist in searching for motion vectors (and corresponding segmentation information) for use in encoding the auxiliary frame.
  • an encoder can select quantization step size values to apply in various areas of an auxiliary frame based on results of encoding of the main frame.
  • pre-processing operations include wavelet decomposition for frame packing of video content of a 4:4:4 format, 4:2:2 format or 4:2:0 format into a 4:0:0 format, which is typically used for gray scale or monochrome video content.
  • the chroma information from a frame of the 4:4:4 format, 4:2:2 format or 4:2:0 format can be packed into the primary component of one or more additional or auxiliary frames of 4:0:0 format.
  • syntax indications in the bitstream can be used to identify the type of filtering applied in the encoder and/or indicate the type of filtering that should be applied when decoding.
  • the content_interpretation_type syntax element of a frame packing arrangement SEI message can indicate the filtering type.
  • additional values of content_interpretation_type can indicate the application of wavelet decomposition or other band separation filtering.
  • the range of values 1-6 is defined as in the table.
  • a value of 7 can indicate a first band separation/wavelet filtering scheme
  • a value of 8 can indicate a second band separation/wavelet filtering scheme
  • a value of 9 can indicate a third band separation/wavelet filtering scheme, and so on.
  • Different band separation/wavelet filtering schemes can differ in terms of pattern of decomposition (e.g., three-band decomposition versus four-band decomposition, or different ways of rearranging sample values), filters used and/or implementation choices for filters.
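A decoder-side lookup for the additional values could look like the sketch below. Only the values 7, 8 and 9 come from the text; the table name, the function name and the handling of other values are assumptions for illustration.

```python
# Values 7-9 as described above; values 1-6 are defined elsewhere and are
# deliberately not reproduced here.
BAND_SEPARATION_SCHEMES = {
    7: "first band separation/wavelet filtering scheme",
    8: "second band separation/wavelet filtering scheme",
    9: "third band separation/wavelet filtering scheme",
}


def band_separation_scheme(content_interpretation_type):
    """Return the scheme description, or None if no scheme is indicated."""
    return BAND_SEPARATION_SCHEMES.get(content_interpretation_type)


scheme = band_separation_scheme(8)
no_scheme = band_separation_scheme(1)
```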
  • syntax indications in the bitstream can be used to identify phase of the filtering applied in the encoder and/or indicate the phase of filtering that should be applied when decoding.
  • syntax elements such as the
  • VUI: video usability information
  • when band separation filtering and inverse band separation filtering use a Haar wavelet pair for horizontal and vertical filtering, the chroma_loc_info_present_flag, chroma_sample_loc_type_top_field and chroma_sample_loc_type_bottom_field syntax elements have a value of 1.
  • the syntax element pic_output_flag could be set to 1 for main video frames and set to 0 for auxiliary video frames.
  • the pic_output_flag syntax element is set equal to 0 when frame_packing_arrangement is equal to 5, and the syntax element current_frame_is_frame0_flag is set equal to 0. This contrasts with an approach in which pic_output_flag is set to the same value in both the main and auxiliary constituent frames.
  • the cropping rectangle can be set to enclose only the main constituent frame without enclosing the auxiliary constituent frame.
  • the cropping rectangle can be set to enclose both the main and auxiliary constituent frames.
  • a decoder uses wavelet reconstruction or other inverse band separation filtering to reconstruct an approximation of the input signal from values of bands.
  • the values of bands may be approximations of original values (after lossy encoding and corresponding decoding) or exact representations of the original values (e.g., after lossless intermediate processing).
  • the decoder reconstructs the values of the input signal (e.g., sample values for A) from the bandpass filtered values (e.g., for the bands D, E, and F).
  • the lowpass/highpass subband filtering techniques and associated inverse processing operations applied by the decoder mirror the lowpass/highpass subband filtering techniques and associated operations applied by the encoder.
  • a decoder decodes values of the main and auxiliary frames of the lower-resolution chroma sampling format.
  • the decoder then assigns reconstructed sample values from the main and auxiliary frames of the lower-resolution chroma sampling format to appropriate bands of the frame in the higher-resolution chroma sampling format.
  • reconstructed values are assigned from the frame(s) in the lower-resolution chroma sampling format to the bands C, D, E, F, G and/or H, as appropriate for the approach used for wavelet decomposition.
  • the decoder then performs wavelet reconstruction or other inverse band separation filtering on the values in the bands.
  • the decoder can perform wavelet reconstruction or other inverse band separation filtering on the values in the remaining bands.
  • the details of the wavelet reconstruction or other inverse band separation filtering typically match or mirror the details of corresponding wavelet decomposition or other band separation filtering.
  • the wavelet reconstruction typically performs the multiple stages in reverse order for the affected bands.
  • a vertical filtering stage for the entire component plane A to produce bands $C_A(x, y)$ and $D_A(x, y)$ is followed by a horizontal filtering stage for the vertical lowpass band $C_A(x, y)$ to produce bands $E_A(x, y)$ and $F_A(x, y)$.
  • a horizontal inverse filtering stage for reconstructed versions of the two bands $E_A(x, y)$ and $F_A(x, y)$ produces a reconstructed version of the band $C_A(x, y)$.
  • vertical inverse filtering of the reconstructed versions of the bands $C_A(x, y)$ and $D_A(x, y)$ yields a reconstructed version of the component plane A.
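The reverse-order reconstruction can be sketched as follows: the horizontal inverse stage rebuilds C from E and F, then the vertical inverse stage rebuilds A from C and D. The sketch assumes the bands were produced with unnormalized Haar sums and differences, so each inverse step divides by two; `three_band_reconstruct` is a hypothetical name.

```python
def three_band_reconstruct(d, e, f):
    """Rebuild a component plane from bands D (H), E (LL) and F (LH)."""
    # Horizontal inverse stage: E and F give back the vertical lowpass band C.
    c = []
    for e_row, f_row in zip(e, f):
        row = []
        for low, high in zip(e_row, f_row):
            row += [(low + high) >> 1, (low - high) >> 1]
        c.append(row)
    # Vertical inverse stage: C and D give back rows 2y and 2y+1 of the plane.
    plane = []
    for c_row, d_row in zip(c, d):
        plane.append([(low + high) >> 1 for low, high in zip(c_row, d_row)])
        plane.append([(low - high) >> 1 for low, high in zip(c_row, d_row)])
    return plane


restored = three_band_reconstruct(
    d=[[0, 0, 0, 0], [-4, -4, -4, -4]],
    e=[[44, 60], [208, 248]],
    f=[[-4, -4], [0, 0]],
)
```

Because a sum and a difference of the same two integers always share parity, the shifts are exact when the band values are unmodified; after lossy coding the result is an approximation.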
  • wavelet reconstruction can use a Haar wavelet filter pair to invert the wavelet decomposition.
  • wavelet reconstruction uses another type of filter pair (e.g., a symmetric biorthogonal Daubechies wavelet filter pair).
  • the wavelet reconstruction uses a filter pair with filter taps and normalization factor defined to invert the filtering. If bit depth was expanded when producing values of bands from input values, bit depth is restored when reconstructing values of the component planes.
  • Other implementation options may also be set to match or mirror operations performed during wavelet decomposition or other band separation filtering.
  • a decoder may implement inverse filtering operations in ways that do not exactly mirror corresponding filtering operations performed before encoding.
  • samples in areas of frames of YUV 4:2:0 format are inverse band separation filtered and assigned to reconstructed chroma components U' 444 and V' 444 of frame of YUV 4:4:4 format as shown in the following pseudocode.
  • D'_V_TEMP2 = Y'^aux_420(2 * x + 1, (H >> 1) + y) - T
  • C'_V_TEMP2 = (E'_V_TEMP - F'_V_TEMP) << 1
  • … = (C'_V_TEMP1 + D'_V_TEMP1) >> 1
  • V'_444(2 * x + 1, 2 * y + 1) = (C'_V_TEMP2 - D'_V_TEMP2) >> 1
  • Figure 17 shows a generalized technique (1700) for frame packing with wavelet decomposition or other band separation filtering.
  • the device packs one or more frames of a higher-resolution chroma sampling format into one or more frames of a lower-resolution chroma sampling format.
  • the device packs frame(s) of 4:4:4 format into frame(s) of 4:2:0 format.
  • the device packs frame(s) of 4:2:2 format into frame(s) of 4:2:0 format.
  • the device packs frame(s) of 4:4:4 format into frame(s) of 4:2:2 format.
  • the device packs frame(s) of a 4:4:4, 4:2:2 or 4:2:0 format into frame(s) of a 4:0:0 format.
  • the device performs (1711) wavelet decomposition or other band separation filtering on sample values of chroma components of the frame(s) of the higher-resolution chroma sampling format to produce sample values of multiple bands.
  • the device assigns (1712) the sample values of the multiple bands to parts of the frame(s) of the lower-resolution chroma sampling format.
  • the wavelet decomposition can be a three-band wavelet
  • the wavelet decomposition or other band separation filtering uses a filter pair with a LPF and HPF.
  • division operations can be implemented with arithmetic right shift operations, implemented with integer division operations or implemented in another way.
  • Division operations can include no rounding, include nearest-integer rounding or include dithered rounding.
  • wavelet decomposition or other band separation filtering may use lifting. Also, depending on implementation, wavelet decomposition or other band separation filtering may include clipping for at least some of the sample values of the bands.
  • the wavelet decomposition or other band separation filtering includes multiple filtering stages.
  • the multiple filtering stages may include a vertical filtering stage followed by a horizontal filtering stage.
  • the multiple filtering stages may include a horizontal filtering stage followed by a vertical filtering stage.
  • normalization to compensate for expansion (a) can occur after each filtering stage of the multiple filtering stages, (b) can be at least partially deferred for one or more of the multiple filtering stages, or (c) can be at least partially ignored for one or more filtering stages, so as to provide scaling of the sample values of the bands.
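The effect of normalization timing can be seen on a two-stage lowpass path over four samples. This sketch assumes truncating right shifts with no rounding offset; `per_stage` and `deferred` are hypothetical names.

```python
def per_stage(a, b, c, d):
    """Normalize after each stage: two divisions by 2, each truncating."""
    return (((a + b) >> 1) + ((c + d) >> 1)) >> 1


def deferred(a, b, c, d):
    """Defer normalization: one division by 4 after the final stage."""
    return (a + b + c + d) >> 2


same = (per_stage(4, 4, 4, 4), deferred(4, 4, 4, 4))
differ = (per_stage(3, 0, 1, 0), deferred(3, 0, 1, 0))
```

Deferring the division keeps intermediate values at a higher bit depth and avoids accumulating truncation error, at the cost of a wider intermediate dynamic range.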
  • the device can then encode (1720) the frame(s) of the lower-resolution chroma sampling format.
  • a different device performs the encoding (1720).
  • the device(s) can repeat the technique (1700) on a frame-by-frame basis or other basis.
  • the device can signal metadata about frame packing/unpacking. For example, the device signals metadata that indicates type of filtering applied and/or filtering phase of filtering applied.
  • Figure 18 shows a generalized technique (1800) for frame unpacking with wavelet reconstruction or other inverse band separation filtering.
  • before the frame unpacking itself, the device can decode (1810) one or more frames of a lower-resolution chroma sampling format. Alternatively, a different device performs the decoding (1810).
  • the device unpacks the frame(s) of the lower-resolution chroma sampling format into one or more frames of a higher-resolution chroma sampling format. For example, the device unpacks frame(s) of 4:2:0 format into frame(s) of 4:4:4 format. Or, the device unpacks frame(s) of 4:2:0 format into frame(s) of 4:2:2 format. Or, the device unpacks frame(s) of 4:2:2 format into frame(s) of 4:4:4 format. Or, the device unpacks frame(s) of a 4:0:0 format into frame(s) of a 4:4:4, 4:2:2 or 4:2:0 format.
  • the device assigns (1821) parts of the frame(s) of the lower-resolution chroma sampling format to sample values of multiple bands.
  • the device then performs (1822) wavelet reconstruction or other inverse band separation filtering on the sample values of the bands to produce sample values of chroma components of the frame(s) of the higher-resolution chroma sampling format.
  • the wavelet reconstruction can be a three-band wavelet
  • the wavelet reconstruction or other inverse band separation filtering uses a filter pair with a LPF and HPF.
  • Division operations can be implemented with arithmetic right shift operations, with integer division operations, or in another way. Division operations can include no rounding, nearest-integer rounding or dithered rounding.
  • Wavelet reconstruction or other inverse band separation filtering may use lifting. Also, depending on implementation, wavelet reconstruction or other inverse band separation filtering may include clipping for at least some of the sample values of the bands.
  • The wavelet reconstruction or other inverse band separation filtering includes multiple filtering stages.
  • The multiple filtering stages may include a vertical filtering stage followed by a horizontal filtering stage.
  • The multiple filtering stages may include a horizontal filtering stage followed by a vertical filtering stage.
  • Normalization to compensate for expansion (a) can occur after each filtering stage of the multiple filtering stages, (b) can be at least partially deferred for one or more of the multiple filtering stages, or (c) can compensate for scaling of the sample values of the bands in pre-processing.
  • The device can also receive metadata about frame packing/unpacking. For example, the device receives metadata that indicates the type of filtering applied and/or the filtering phase of the filtering applied.
  • The device(s) can repeat the technique (1800) on a frame-by-frame basis or other basis.
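
The inverse filtering steps listed above (lifting, division by arithmetic right shift, and a vertical stage followed by a horizontal stage) can be illustrated with a minimal sketch. It assumes a simple two-band integer Haar-style lifting pair (the S-transform), which is not necessarily the filter pair or the three-band arrangement used in the described technique. All function names here are hypothetical, and the forward (packing-side) functions are included only so the round trip can be checked.

```python
# Illustrative sketch only: a two-band integer lifting pair (S-transform).
# This is an assumed stand-in, not the filter pair from the patent.

def forward_pair(a, b):
    """Split two neighboring samples into a low-band and a high-band value."""
    high = a - b
    low = b + (high >> 1)  # >> on Python ints is an arithmetic shift (floor division by 2)
    return low, high

def inverse_pair(low, high):
    """Undo forward_pair exactly by reversing the lifting steps."""
    b = low - (high >> 1)
    a = high + b
    return a, b

def analyze_1d(samples):
    """Forward filtering of an even-length line into (low band, high band)."""
    lows, highs = [], []
    for i in range(0, len(samples), 2):
        low, high = forward_pair(samples[i], samples[i + 1])
        lows.append(low)
        highs.append(high)
    return lows, highs

def reconstruct_1d(lows, highs):
    """Inverse filtering of one line: interleave the reconstructed sample pairs."""
    out = []
    for low, high in zip(lows, highs):
        out.extend(inverse_pair(low, high))
    return out

def transpose(plane):
    return [list(col) for col in zip(*plane)]

def analyze_2d(plane):
    """Forward filtering: horizontal stage, then vertical stage (four bands)."""
    rows = [analyze_1d(row) for row in plane]
    low_h = [r[0] for r in rows]
    high_h = [r[1] for r in rows]

    def vertical(band):
        cols = [analyze_1d(col) for col in transpose(band)]
        return transpose([c[0] for c in cols]), transpose([c[1] for c in cols])

    ll, lh = vertical(low_h)
    hl, hh = vertical(high_h)
    return ll, lh, hl, hh

def reconstruct_2d(ll, lh, hl, hh):
    """Inverse filtering: vertical stage followed by horizontal stage."""
    def vertical_inverse(lo, hi):
        cols = [reconstruct_1d(l, h) for l, h in zip(transpose(lo), transpose(hi))]
        return transpose(cols)

    low_h = vertical_inverse(ll, lh)
    high_h = vertical_inverse(hl, hh)
    return [reconstruct_1d(l, h) for l, h in zip(low_h, high_h)]

def clip(value, lo=0, hi=255):
    """Optional clipping of a sample value to an assumed 8-bit range."""
    return max(lo, min(hi, value))
```

Because every lifting step is reversed exactly, `reconstruct_2d(*analyze_2d(plane))` returns the original plane for any plane with even width and height. Clipping is shown as a separate helper because applying it to band values is lossy whenever a value actually falls outside the range.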

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Color Television Systems (AREA)
  • Television Signal Processing For Recording (AREA)
PCT/US2013/065754 2012-10-22 2013-10-18 Band separation filtering / inverse filtering for frame packing / unpacking higher-resolution chroma sampling formats Ceased WO2014066182A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP13789089.3A EP2910023A1 (en) 2012-10-22 2013-10-18 Band separation filtering / inverse filtering for frame packing / unpacking higher-resolution chroma sampling formats
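KR1020157010297A KR20150079624A (ko) 2012-10-22 2013-10-18 Band separation filtering / inverse filtering technique for frame packing and unpacking of high-resolution chroma sampling formats
JP2015538079A JP2016500965A (ja) 2012-10-22 2013-10-18 Band separation filtering / inverse filtering for frame packing / unpacking of high-resolution chroma sampling formats
CN201380055253.2A CN104982036B (zh) 2012-10-22 2013-10-18 Method and computing device for frame packing and frame unpacking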

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201261717097P 2012-10-22 2012-10-22
US61/717,097 2012-10-22
US14/027,028 2013-09-13
US14/027,028 US9661340B2 (en) 2012-10-22 2013-09-13 Band separation filtering / inverse filtering for frame packing / unpacking higher resolution chroma sampling formats

Publications (1)

Publication Number Publication Date
WO2014066182A1 true WO2014066182A1 (en) 2014-05-01

Family

ID=50485301

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2013/065754 Ceased WO2014066182A1 (en) 2012-10-22 2013-10-18 Band separation filtering / inverse filtering for frame packing / unpacking higher-resolution chroma sampling formats

Country Status (6)

Country Link
US (1) US9661340B2
EP (1) EP2910023A1
JP (1) JP2016500965A
KR (1) KR20150079624A
CN (1) CN104982036B
WO (1) WO2014066182A1

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9445093B2 (en) * 2011-06-29 2016-09-13 Qualcomm Incorporated Multiple zone scanning order for video coding
WO2013102293A1 (en) * 2012-01-04 2013-07-11 Mediatek Singapore Pte. Ltd. Improvements of luma-based chroma intra prediction
US9979960B2 (en) 2012-10-01 2018-05-22 Microsoft Technology Licensing, Llc Frame packing and unpacking between frames of chroma sampling formats with different chroma resolutions
CN105409218B (zh) * 2013-04-08 2021-05-25 Ge视频压缩有限责任公司 解码器和解码方法
CN104427323B (zh) * 2013-08-23 2016-08-10 鸿富锦精密工业(深圳)有限公司 基于深度的三维图像处理方法
US9489104B2 (en) 2013-11-14 2016-11-08 Apple Inc. Viewable frame identification
US9582160B2 (en) 2013-11-14 2017-02-28 Apple Inc. Semi-automatic organic layout for media streams
US9854246B2 (en) 2014-02-28 2017-12-26 Apple Inc. Video encoding optimization with extended spaces
US20150254806A1 (en) * 2014-03-07 2015-09-10 Apple Inc. Efficient Progressive Loading Of Media Items
US10726800B2 (en) * 2014-05-09 2020-07-28 Syndiant, Inc. Generating quincunx video streams for light modulating backplanes with configurable multi electrode pixels
EP2980793A1 (en) 2014-07-28 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Encoder, decoder, system and methods for encoding and decoding
FR3024582A1 (fr) * 2014-07-29 2016-02-05 Orange Gestion de la perte de trame dans un contexte de transition fd/lpd
US9451296B2 (en) 2014-08-19 2016-09-20 At&T Intellectual Property I, L.P. Extracting data from advertisement files for ad insertion
US9811882B2 (en) * 2014-09-30 2017-11-07 Electronics And Telecommunications Research Institute Method and apparatus for processing super resolution image using adaptive preprocessing filtering and/or postprocessing filtering
US10283091B2 (en) * 2014-10-13 2019-05-07 Microsoft Technology Licensing, Llc Buffer optimization
US20160212423A1 (en) 2015-01-16 2016-07-21 Microsoft Technology Licensing, Llc Filtering to mitigate artifacts when changing chroma sampling rates
US9854201B2 (en) 2015-01-16 2017-12-26 Microsoft Technology Licensing, Llc Dynamically updating quality to higher chroma sampling rate
US9749646B2 (en) 2015-01-16 2017-08-29 Microsoft Technology Licensing, Llc Encoding/decoding of high chroma resolution details
US10432961B2 (en) * 2015-03-10 2019-10-01 Apple Inc. Video encoding optimization of extended spaces including last stage processes
CN107211128B (zh) * 2015-03-10 2021-02-09 苹果公司 自适应色度下采样和色彩空间转换技术
US9736335B2 (en) * 2015-04-15 2017-08-15 Apple Inc. Techniques for advanced chroma processing
CN113115043A (zh) * 2015-08-07 2021-07-13 辉达公司 视频编码器、视频编码系统和视频编码方法
US10277921B2 (en) * 2015-11-20 2019-04-30 Nvidia Corporation Hybrid parallel decoder techniques
TWI629892B (zh) * 2016-05-09 2018-07-11 國立成功大學 景深包裝及解包裝之rgb格式與yuv格式的轉換與反轉換的方法及電路
GB2554663B (en) * 2016-09-30 2022-02-23 Apical Ltd Method of video generation
US10368080B2 (en) 2016-10-21 2019-07-30 Microsoft Technology Licensing, Llc Selective upsampling or refresh of chroma sample values
US10609372B2 (en) 2017-09-29 2020-03-31 Dolby Laboratories Licensing Corporation Up-conversion to content adaptive perceptual quantization video signals
EP3474225B1 (en) * 2017-10-18 2019-09-25 Axis AB Method and encoder for encoding a video stream in a video coding format supporting auxiliary frames
US10368071B2 (en) * 2017-11-03 2019-07-30 Arm Limited Encoding data arrays
EP3741124A1 (en) * 2018-01-16 2020-11-25 VID SCALE, Inc. Adaptive frame packing for 360-degree video coding
KR20190116833A (ko) 2018-04-05 2019-10-15 김성봉 영업공간 매칭 서버 및 매칭방법
CN109379376B (zh) * 2018-11-30 2021-10-08 成都德辰博睿科技有限公司 基于数据压缩的无线电监测设备、系统及数据压缩方法
US10951895B2 (en) 2018-12-31 2021-03-16 Alibaba Group Holding Limited Context model selection based on coding unit characteristics
CN115243103B (zh) * 2019-04-30 2025-09-09 深圳光峰科技股份有限公司 图像拆分方法与图像显示方法
US20210127125A1 (en) * 2019-10-23 2021-04-29 Facebook Technologies, Llc Reducing size and power consumption for frame buffers using lossy compression
WO2022183358A1 (zh) 2021-03-02 2022-09-09 京东方科技集团股份有限公司 视频图像去交错方法和视频图像去交错装置
CN119011866B (zh) * 2023-03-08 2025-09-23 杭州海康威视数字技术股份有限公司 一种解码方法、装置及其设备
CN117729328A (zh) * 2023-07-28 2024-03-19 小红书科技有限公司 视频图像的编码方法、解码方法及相关设备

Family Cites Families (61)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5047853A (en) 1990-03-16 1991-09-10 Apple Computer, Inc. Method for compresssing and decompressing color video data that uses luminance partitioning
JP3381855B2 (ja) 1992-12-28 2003-03-04 ソニー株式会社 画像信号符号化方法および画像信号符号化装置、並びに画像信号復号化方法および画像信号復号化装置
US5742892A (en) * 1995-04-18 1998-04-21 Sun Microsystems, Inc. Decoder for a software-implemented end-to-end scalable video delivery system
JPH09139944A (ja) * 1995-09-12 1997-05-27 Matsushita Electric Ind Co Ltd 符号化方法,符号化装置,ウエーブレット変換装置およびウエーブレット逆変換装置
US6282364B1 (en) 1996-02-05 2001-08-28 Matsushita Electric Industrial Co., Ltd. Video signal recording apparatus and video signal regenerating apparatus
US6208350B1 (en) 1997-11-04 2001-03-27 Philips Electronics North America Corporation Methods and apparatus for processing DVD video
US6937659B1 (en) * 1997-11-14 2005-08-30 Ac Capital Management, Inc. Apparatus and method for compressing video information
US6829301B1 (en) 1998-01-16 2004-12-07 Sarnoff Corporation Enhanced MPEG information distribution apparatus and method
US6501480B1 (en) 1998-11-09 2002-12-31 Broadcom Corporation Graphics accelerator
JP2001197499A (ja) * 1999-10-29 2001-07-19 Sony Corp 動画像符号化方法及び装置、動画像復号方法及び装置、並びに動画像伝送方法及び装置
US6674479B2 (en) 2000-01-07 2004-01-06 Intel Corporation Method and apparatus for implementing 4:2:0 to 4:2:2 and 4:2:2 to 4:2:0 color space conversion
US7576748B2 (en) 2000-11-28 2009-08-18 Nintendo Co. Ltd. Graphics system with embedded frame butter having reconfigurable pixel formats
JP2002204356A (ja) * 2000-10-27 2002-07-19 Canon Inc データ処理装置、プロセッサ、及びその制御方法
US20030112863A1 (en) 2001-07-12 2003-06-19 Demos Gary A. Method and system for improving compressed image chroma information
US7076108B2 (en) * 2001-12-11 2006-07-11 Gen Dow Huang Apparatus and method for image/video compression using discrete wavelet transform
JP4027196B2 (ja) * 2002-09-24 2007-12-26 キヤノン株式会社 画像表示装置及びその制御方法
US7551792B2 (en) 2003-11-07 2009-06-23 Mitsubishi Electric Research Laboratories, Inc. System and method for reducing ringing artifacts in images
US20050129130A1 (en) 2003-12-10 2005-06-16 Microsoft Corporation Color space coding framework
US20050228654A1 (en) 2004-03-30 2005-10-13 Yolanda Prieto Method and apparatus for improved bit rate efficiency in wavelet based codecs by means of subband correlation
JP4483457B2 (ja) * 2004-03-31 2010-06-16 日本ビクター株式会社 伝送システム
KR100679011B1 (ko) * 2004-07-15 2007-02-05 삼성전자주식회사 기초 계층을 이용하는 스케일러블 비디오 코딩 방법 및 장치
JP2006157443A (ja) * 2004-11-29 2006-06-15 Toshiba Corp 画像送信装置、画像受信装置および画像伝送システム
CN101313582A (zh) * 2005-09-27 2008-11-26 高通股份有限公司 使用各种运动模型的编码器辅助式帧速率提升转换
US8879635B2 (en) * 2005-09-27 2014-11-04 Qualcomm Incorporated Methods and device for data alignment with time domain boundary
WO2007060498A1 (en) 2005-11-22 2007-05-31 Freescale Semiconductor, Inc. Method and system for filtering image data
KR100771879B1 (ko) 2006-08-17 2007-11-01 삼성전자주식회사 내부 메모리 용량을 감소시키는 디블록킹 필터링 방법 및그 방법을 이용하는 영상 처리 장치
US9001899B2 (en) 2006-09-15 2015-04-07 Freescale Semiconductor, Inc. Video information processing system with selective chroma deblock filtering
CN101578874B (zh) 2007-01-04 2011-12-07 汤姆森特许公司 用于减少多视图编码视频中的照度补偿和/或色彩补偿的编码伪像的方法和装置
US8054886B2 (en) 2007-02-21 2011-11-08 Microsoft Corporation Signaling and use of chroma sample positioning information
SI2887671T1 (sl) 2007-04-12 2018-10-30 Dolby International Ab Razporejanje v kodiranju in dekodiranju videa
WO2008143157A1 (ja) * 2007-05-17 2008-11-27 Sony Corporation 情報処理装置および方法
KR20080114388A (ko) 2007-06-27 2008-12-31 삼성전자주식회사 스케일러블 영상 부호화장치 및 방법과 그 영상 복호화장치및 방법
US7924292B2 (en) 2007-08-31 2011-04-12 Broadcom Corportion Device and method for reducing visual artifacts in color images
US8139081B1 (en) 2007-09-07 2012-03-20 Zenverge, Inc. Method for conversion between YUV 4:4:4 and YUV 4:2:0
US8953673B2 (en) * 2008-02-29 2015-02-10 Microsoft Corporation Scalable video coding and decoding with sample bit depth and chroma high-pass residual layers
JP5035154B2 (ja) * 2008-03-07 2012-09-26 富士通株式会社 映像信号処理装置及び映像信号処理方法
US9571856B2 (en) 2008-08-25 2017-02-14 Microsoft Technology Licensing, Llc Conversion operations in scalable video encoding and decoding
AU2010206977B2 (en) 2009-01-26 2016-02-25 Interdigital Vc Holdings, Inc. Frame packing for video coding
JP4947389B2 (ja) 2009-04-03 2012-06-06 ソニー株式会社 画像信号復号装置、画像信号復号方法、および画像信号符号化方法
US20110280311A1 (en) 2010-05-13 2011-11-17 Qualcomm Incorporated One-stream coding for asymmetric stereo video
US8625666B2 (en) 2010-07-07 2014-01-07 Netzyn, Inc. 4:4:4 color space video with 4:2:0 color space video encoders and decoders systems and methods
EP2591609B1 (en) * 2010-07-08 2017-09-06 Dolby Laboratories Licensing Corporation Method and apparatus for multi-layered image and video coding using reference processing signals
CN105163118B (zh) 2010-07-20 2019-11-26 Sk电信有限公司 用于解码视频信号的解码方法
US9596447B2 (en) 2010-07-21 2017-03-14 Qualcomm Incorporated Providing frame packing type information for video coding
US8787443B2 (en) 2010-10-05 2014-07-22 Microsoft Corporation Content adaptive deblocking during video encoding and decoding
US10327008B2 (en) 2010-10-13 2019-06-18 Qualcomm Incorporated Adaptive motion vector resolution signaling for video coding
EP2456204A1 (en) 2010-11-18 2012-05-23 Koninklijke Philips Electronics N.V. Method and apparatus for encoding or generating an image
US20120236115A1 (en) 2011-03-14 2012-09-20 Qualcomm Incorporated Post-filtering in full resolution frame-compatible stereoscopic video coding
US8780996B2 (en) 2011-04-07 2014-07-15 Google, Inc. System and method for encoding and decoding video data
WO2012160797A1 (en) * 2011-05-20 2012-11-29 Panasonic Corporation Methods and apparatuses for encoding and decoding video using inter-color-plane prediction
US8787454B1 (en) 2011-07-13 2014-07-22 Google Inc. Method and apparatus for data compression using content-based features
WO2013128010A2 (en) 2012-03-02 2013-09-06 Canon Kabushiki Kaisha Method and devices for encoding a sequence of images into a scalable video bit-stream, and decoding a corresponding scalable video bit-stream
US8958474B2 (en) 2012-03-15 2015-02-17 Virtualinx, Inc. System and method for effectively encoding and decoding a wide-area network based remote presentation session
US8639057B1 (en) 2012-03-19 2014-01-28 The Modern Video Company Artifact removal method and system for contoured images and video
CN102801988B (zh) 2012-07-20 2014-10-08 浙江工业大学 一种基于色度分量幅度的yuv444转yuv420的视频格式转换方法
GB2506345A (en) 2012-09-06 2014-04-02 British Broadcasting Corp Video encoding and decoding with chrominance sub-sampling
US20140072027A1 (en) 2012-09-12 2014-03-13 Ati Technologies Ulc System for video compression
US20140072048A1 (en) 2012-09-13 2014-03-13 Samsung Electronics Co., Ltd Method and apparatus for a switchable de-ringing filter for image/video coding
US9979960B2 (en) * 2012-10-01 2018-05-22 Microsoft Technology Licensing, Llc Frame packing and unpacking between frames of chroma sampling formats with different chroma resolutions
US9426469B2 (en) 2012-12-17 2016-08-23 Broadcom Corporation Combination HEVC deblocker/SAO filter
US8817179B2 (en) 2013-01-08 2014-08-26 Microsoft Corporation Chroma frame conversion for the video codec

Non-Patent Citations (13)

* Cited by examiner, † Cited by third party
Title
"High efficiency video coding (HEVC) text specification draft 7", JCTVC-I1003_D5, April 2012 (2012-04-01)
A. COHEN; I. DAUBECHIES; J.-C. FEAUVEAU: "Biorthogonal Bases of Compactly Supported Wavelets", COMMUNICATIONS ON PURE AND APPLIED MATHEMATICS, vol. 45, no. 5, 1992, pages 485 - 560, XP055220114, DOI: doi:10.1002/cpa.3160450502
D. LE GALL; A. TABATABAI: "Sub-band Coding of Digital Images Using Symmetric Short Kernel Filters and Arithmetic Coding Techniques", PROC. IEEE INTL. CONF. ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 1988, pages 761 - 764
G. UYTTERHOEVEN; D. ROOSE; A. BULTHEEL: "Wavelet-Based Interactive Video Communication and Image Database Consulting - Wavelet Transforms using the Lifting Scheme", TECHNICAL REPORT ITA-WAVELETS-WP1.1, KATHOLIEKE UNIVERSITEIT LEUVEN, 28 April 1997 (1997-04-28)
J. D. VILLASENOR; B. BELZER; J. LIAO: "Wavelet Filter Evaluation for Image Compression", IEEE TRANS ON IMAGE PROCESSING, August 1995 (1995-08-01), pages 1053 - 1057
J. ROTHWEILER: "Polyphase Quadrature Filters - A New Subband Coding Technique", PROC. INT. CONF. ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, vol. 8, 1983, pages 1280 - 1283, XP000560573
M. J. T. SMITH; T. P. BARNWELL, III: "Exact Reconstruction Techniques for Tree-structured Subband Coders", IEEE TRANS. ON SIGNAL PROCESSING, vol. 34, 1986, pages 434 - 441, XP000828453, DOI: doi:10.1109/TASSP.1986.1164832
R. C. CALDERBANK; I. DAUBECHIES; W. SWELDENS; B.-L. YEO: "Wavelet Transforms That Map Integers to Integers", APPL. COMPUT. HARMON. ANAL., vol. 5, no. 3, 1998, pages 332 - 369, XP001152116, DOI: doi:10.1006/acha.1997.0238
REC. ITU-T T.800 ISO/IEC 15444-1 INFORMATION TECHNOLOGY - JPEG 2000 IMAGE CODING SYSTEM: CORE CODING SYSTEM, 2002
See also references of EP2910023A1 *
W. SWELDENS: "The Lifting Scheme: A Construction of Second Generation Wavelets", SIAM JOURNAL ON MATHEMATICAL ANALYSIS, 1998
WU Y ET AL: "Frame packing arrangement SEI for 4:4:4 content in 4:2:0 bitstreams", 11. JCT-VC MEETING; 102. MPEG MEETING; 10-10-2012 - 19-10-2012; SHANGHAI; (JOINT COLLABORATIVE TEAM ON VIDEO CODING OF ISO/IEC JTC1/SC29/WG11 AND ITU-T SG.16); URL: HTTP://WFTP3.ITU.INT/AV-ARCH/JCTVC-SITE/, no. JCTVC-K0240, 1 October 2012 (2012-10-01), XP030113122 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017050116A1 (zh) * 2015-09-25 2017-03-30 网宿科技股份有限公司 一种基于gpu的完全硬件转码的方法和系统
CN105578190A (zh) * 2016-02-03 2016-05-11 珠海全志科技股份有限公司 应用于视频硬解码的无损压缩方法及系统
US10681363B2 (en) 2016-02-03 2020-06-09 Allwinner Technology Co., Ltd. Lossless compression method and system applied to hardware video decoding
CN111031388A (zh) * 2019-11-25 2020-04-17 无锡思朗电子科技有限公司 Yuv4:4:4数据的编解码方法

Also Published As

Publication number Publication date
CN104982036B (zh) 2018-05-25
US9661340B2 (en) 2017-05-23
CN104982036A (zh) 2015-10-14
JP2016500965A (ja) 2016-01-14
EP2910023A1 (en) 2015-08-26
US20140112394A1 (en) 2014-04-24
KR20150079624A (ko) 2015-07-08

Similar Documents

Publication Publication Date Title
US9661340B2 (en) Band separation filtering / inverse filtering for frame packing / unpacking higher resolution chroma sampling formats
US10523953B2 (en) Frame packing and unpacking higher-resolution chroma sampling formats
US12556742B2 (en) Features of base color index map mode for video and image coding and decoding
US10044974B2 (en) Dynamically updating quality to higher chroma sampling rate
US10523933B2 (en) Control data for motion-constrained tile set
EP3245790B1 (en) Encoding/decoding of high chroma resolution details
US10368080B2 (en) Selective upsampling or refresh of chroma sample values
US20160212423A1 (en) Filtering to mitigate artifacts when changing chroma sampling rates

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 13789089; Country of ref document: EP; Kind code of ref document: A1)
WWE WIPO information: entry into national phase (Ref document number: 2013789089; Country of ref document: EP)
ENP Entry into the national phase (Ref document number: 2015538079; Country of ref document: JP; Kind code of ref document: A)
ENP Entry into the national phase (Ref document number: 20157010297; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)