WO2008130716A2 - Video coding - Google Patents

Video coding

Info

Publication number
WO2008130716A2
Authority
WO
WIPO (PCT)
Prior art keywords
information
picture
view
encoded
different
Application number
PCT/US2008/005216
Other languages
French (fr)
Other versions
WO2008130716A3 (en)
Inventor
Purvin Bibhas Pandit
Peng Yin
Original Assignee
Thomson Licensing
Application filed by Thomson Licensing
Publication of WO2008130716A2
Publication of WO2008130716A3

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/60: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
    • H04N 19/61: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
    • H04N 19/50: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
    • H04N 19/597: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
    • H04N 19/70: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards

Definitions

  • high level syntax refers to syntax present in the bitstream that resides hierarchically above the macroblock layer.
  • high level syntax may refer to, but is not limited to, syntax at the slice header level, Supplemental Enhancement Information (SEI) level, Picture Parameter Set (PPS) level, Sequence Parameter Set (SPS) level, View Parameter Set (VPS) level, and Network Abstraction Layer (NAL) unit header level.
  • low level syntax refers to syntax below the slice layer that resides in the macroblock layer.
  • low level syntax may refer to, but is not limited to, mb_ic_flag and dpcm_of_dvic.
  • present principles are not limited to solely these low level syntaxes and, thus, these and other low level syntaxes may be used in accordance with the present principles, while maintaining the spirit of the present principles.
  • IC: illumination compensation; CC: color compensation.
  • because a multi-view video source includes multiple views of the same scene, there exists a high degree of correlation between the multiple view images. View redundancy can therefore be exploited, in addition to temporal redundancy, by performing prediction across the different views.
  • multi-view video systems involving a large number of cameras will be built using heterogeneous cameras, or cameras that have not been perfectly calibrated. This leads to differences in luminance and chrominance when the same parts of a scene are viewed with different cameras. Moreover, camera distance and positioning also affect illumination, in the sense that the same surface may reflect light differently when perceived from different angles. In these scenarios, luminance and chrominance differences will decrease the efficiency of cross-view prediction.
  • JMVM: Joint Multiview Video Model. In the JMVM, IC refers to local illumination compensation.
  • Motion skip mode is proposed to improve the coding efficiency for multi-view video coding.
  • Motion skip mode originated from the observation that the motion of two neighboring views is similar.
  • the motion information is inferred from the corresponding macroblock in the frame of the neighboring view with the same temporal index.
  • a disparity vector is applied to find the corresponding macroblock in the neighboring view.
  • the proposed method does not, however, address how motion skip mode works when IC (or CC) is enabled.
  • At least one embodiment provides a solution for how to combine motion skip mode with illumination compensation and color compensation tools for multi-view video coding.
  • The following discussion uses IC as an example; analogous considerations apply to CC.
  • IC is adopted into the JMVM as a new coding tool to improve coding efficiency.
  • the proposed method employs predictive coding for the direct current (DC) component of Inter prediction residues.
  • the predictor for the illumination change is formed from neighboring blocks, to exploit the strong spatial correlation of illumination differences.
  • the following flag is added to the slice header to indicate whether IC is enabled for this slice:
  • ic_enable equal to 1 specifies that IC is enabled for the current slice; ic_enable equal to 0 specifies that IC is not enabled for the current slice.
  • mb_ic_flag equal to 1 specifies that IC is used for the current macroblock; mb_ic_flag equal to 0 specifies that IC is not used for the current macroblock.
  • the default value for mb_ic_flag is zero.
  • dpcm_of_dvic specifies the amount of IC offset to be used for the current macroblock.
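  • As a concrete illustration of these three syntax elements, the following is a minimal decoder-side parsing sketch. The bit-reader methods read_flag() and read_se() are hypothetical helpers (assumed to read a one-bit flag and a signed Exp-Golomb value), not part of the patent or of any reference software.

```python
# Sketch of parsing the IC syntax described above; the BitReader
# interface (read_flag, read_se) is an illustrative assumption.

def parse_slice_header_ic(bs):
    """Read the slice-level IC switch (all other slice syntax omitted)."""
    return {"ic_enable": bs.read_flag()}

def parse_mb_ic(bs, slice_hdr):
    """Read macroblock-level IC syntax; defaults apply when absent."""
    mb = {"mb_ic_flag": 0, "dpcm_of_dvic": 0}   # default mb_ic_flag is zero
    if slice_hdr["ic_enable"]:
        mb["mb_ic_flag"] = bs.read_flag()
        if mb["mb_ic_flag"]:
            # signed difference to the DVIC predictor (assumed se(v))
            mb["dpcm_of_dvic"] = bs.read_se()
    return mb
```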
  • Illumination compensation aims to compensate local illumination changes between pictures in multi-view sequences.
  • In FIG. 1, the decoding structure of macroblock-based illumination compensation is indicated generally by the reference numeral 100.
  • the decoding structure 100 includes an entropy decoder 105 having an output in signal communication with an input of an inverse quantizer/inverse transformer 110.
  • An output of the inverse quantizer/inverse transformer 110 is connected in signal communication with a first non-inverting input of a combiner 115.
  • An output of the combiner 115 is connected in signal communication with an input of a reference picture buffer 125.
  • An output of the reference picture buffer 125 is connected in signal communication with a first input of a motion compensator/illumination compensator 120.
  • An output of the motion compensator/illumination compensator 120 is connected in signal communication with a second non-inverting input of the combiner 115.
  • An output of a combiner 135 is connected in signal communication with a fourth input of the motion compensator/illumination compensator 120.
  • An output of a DVIC (differential value of illumination compensation) predictor 130 is connected in signal communication with a first non-inverting input of the combiner 135.
  • An input of the entropy decoder 105 is available as an input of the decoding structure 100, for receiving a bitstream.
  • a second input of the motion compensator/illumination compensator 120 is available as an input of the decoding structure 100, for receiving a mb_ic_flag.
  • a third input of the motion compensator/illumination compensator 120 is available as an input of the decoding structure 100, for receiving a motion vector(s).
  • a second non-inverting input of the combiner 135 is available as an input of the decoding structure 100, for receiving dpcm_of_dvic.
  • the output of the combiner 115 is also available as an output of the decoding structure, for outputting reconstructed frames.
  • MC: motion compensation; DPCM: differential pulse code modulation; DVIC: differential value of illumination compensation.
  • in the IC reconstruction, f'(i, j) represents the reconstructed partition before deblocking, r(i, j) represents the reference frame, and NewR''(i, j) represents the reconstructed residual signal.
  • the predictor predDVIC is obtained from the neighboring blocks.
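  • Putting these definitions together, and reading the garbled residual term of the source as the JMVM-style NewR''(i, j) (an assumption), the IC reconstruction of a partition with motion vector (Δx, Δy) can be written as:

```latex
\begin{aligned}
\mathrm{DVIC} &= \mathrm{predDVIC} + \mathrm{dpcm\_of\_dvic},\\
f'(i,j) &= \mathrm{NewR}''(i,j) + r(i+\Delta x,\ j+\Delta y) + \mathrm{DVIC}.
\end{aligned}
```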
  • In FIG. 2, the encoding structure of macroblock-based illumination compensation is indicated generally by the reference numeral 200.
  • the encoding structure 200 includes a combiner 205 having an output in signal communication with an input of a transformer/quantizer 210.
  • An output of the transformer/quantizer 210 is connected in signal communication with an input of an entropy coder 220 and an input of an inverse quantizer/inverse transformer 215.
  • An output of the inverse quantizer/inverse transformer 215 is connected in signal communication with a first non-inverting input of a combiner 225.
  • An output of the combiner 225 is connected in signal communication with an input of a reference picture buffer 230.
  • An output of the reference picture buffer 230 is connected in signal communication with a first input of a DVIC calculator 250.
  • An output of the DVIC calculator 250 is connected in signal communication with a first input of an ICA (Illumination change-adaptive) motion estimator 255, a first input of a motion compensated predictor 260, and a second non-inverting input of a combiner 240.
  • An output of the ICA motion estimator 255 is connected in signal communication with a second input of the motion compensated predictor 260.
  • An output of the motion compensated predictor 260 is connected in signal communication with an inverting input of the combiner 205 and a second non-inverting input of the combiner 225.
  • An output of a DVIC predictor 235 is connected in signal communication with a first inverting input of the combiner 240.
  • An output of the combiner 240 is available as an output of the encoding structure 200, for outputting dpcm_of_dvic (dpcm of offset).
  • a first non-inverting input of the combiner 205, a second input of the ICA motion estimator 255, and a second input of the DVIC calculator 250 are available as an input to the encoding structure 200, for receiving input sequences.
  • An output of the entropy coder 220 is available as an output of the encoding structure 200, for outputting encoded residual signals.
  • the output of the ICA motion estimator 255 is also available as an output of the encoding structure 200, for outputting a motion vector(s).
  • An output of a mode selector 245 is available as an output of the encoding structure 200, for outputting mb_ic_flag for 16x16 and DIRECT modes.
  • Motion skip mode infers motion information, such as macroblock type, motion vectors, and reference indices, directly from the corresponding macroblock in the neighboring view at the same temporal instant.
  • the method is decomposed into two stages, i.e., search for the corresponding macroblock and derivation of motion information.
  • a global disparity vector is used to indicate the corresponding position in the picture of a neighboring view.
  • the method locates the corresponding macroblock in the neighboring view by means of a global disparity vector (GDV).
  • implementations presented here may use one or more disparity vectors that apply, for example, to all views, to only one view, or to only one view for only a particular sequence of pictures.
  • the GDV is measured in macroblock-size units between the current picture and the picture of the neighboring view.
  • the GDV can be estimated and decoded periodically, for instance at every anchor picture. In that case, the GDV of a non-anchor picture is interpolated using the recent GDVs of the surrounding anchor pictures.
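  • A minimal sketch of such an interpolation, assuming a linear weighting by picture order count between the two nearest anchor pictures (the source does not pin down the exact rule, so this is an assumption):

```python
def interpolate_gdv(poc_cur, poc_prev, gdv_prev, poc_next, gdv_next):
    """Linearly interpolate the global disparity vector (GDV), in
    macroblock-size units, for a non-anchor picture at poc_cur from
    the GDVs of the two surrounding anchor pictures. Illustrative only."""
    w = (poc_cur - poc_prev) / (poc_next - poc_prev)
    gdv_x = round(gdv_prev[0] + w * (gdv_next[0] - gdv_prev[0]))
    gdv_y = round(gdv_prev[1] + w * (gdv_next[1] - gdv_prev[1]))
    return gdv_x, gdv_y

# Example: anchors at POC 0 and 8 with GDVs (2, 0) and (4, 0);
# the non-anchor picture at POC 4 gets GDV (3, 0).
```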
  • motion information is derived from (for example, copied, calculated, or otherwise determined from) the corresponding macroblock in the picture of a neighboring view, and it is copied to apply to the current macroblock.
  • Motion skip mode is disabled when the current macroblock is in a picture of the base view, or in an anchor picture as defined in JMVM. This follows, for example, because the base view is AVC compatible, and motion skip mode is defined for MVC.
  • anchor pictures do not have temporal references, but rather have inter-view references, and so do not use motion skip mode to infer motion from temporal references.
  • a new flag, motion_skip_flag, is included in the macroblock layer syntax for MVC. If motion_skip_flag is turned on, the current macroblock derives the macroblock type, motion vectors, and reference indices from the corresponding macroblock in the neighboring view.
  • Motion skip mode infers the motion information, such as macroblock type, motion vector, and reference indices, directly from the corresponding macroblock in the neighboring view at the same temporal instant. However, the application of motion skip mode is not explained for the case when IC is enabled.
  • the implementation enables the use of IC with motion skip mode, and combines the syntax for each of these features as part of achieving this combination.
  • Other implementations use different syntax.
  • the enabling of IC for motion skip mode can be based on the inferred motion information.
  • the inferred motion information will indicate what mode was used for that block, and IC may be enabled for 16x16 mode and skip mode as is done in the current Standard.
  • the IC information may include, for example, mb_ic_flag and/or DVIC (or dpcm_of_dvic).
  • For motion skip mode, we first derive the macroblock type, motion vectors, and reference indices directly from the corresponding macroblock in the neighboring view at the same temporal instant. Then we compute the DVIC based on that motion information.
  • in one implementation, we derive the IC information for motion skip mode rather than signaling it; that is, the IC information is implicit at the decoder.
  • if mb_ic_flag indicates that IC is not used, we perform motion compensation without IC.
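  • A decoder-side sketch of this combination, under the stated assumptions that motion information is inherited unchanged and that reconstruction follows the IC equation given earlier; the input layout (dicts and 2-D lists) is purely illustrative:

```python
def reconstruct_motion_skip_mb(corr_mb, ref_block, residual, pred_dvic,
                               dpcm_of_dvic=None):
    """Sketch: reconstruct a motion-skip macroblock with IC.

    corr_mb holds the motion info of the corresponding macroblock in the
    neighboring view (located via the GDV); it is inherited, not coded.
    If dpcm_of_dvic is None, IC is treated as fully implicit and the
    DVIC predictor alone is used (an illustrative simplification).
    """
    # Stage 2 of motion skip: inherit the motion information.
    motion = (corr_mb["mb_type"], corr_mb["motion_vectors"], corr_mb["ref_idx"])

    # DVIC = predictor + transmitted difference (if explicitly signaled).
    dvic = pred_dvic + (dpcm_of_dvic or 0)

    # f'(i,j) = NewR''(i,j) + r(i+dx, j+dy) + DVIC, applied per pixel;
    # ref_block is assumed to be already motion-compensated.
    recon = [[residual[i][j] + ref_block[i][j] + dvic
              for j in range(len(residual[0]))]
             for i in range(len(residual))]
    return motion, recon
```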
  • the IC information can be determined at the encoder, and at the decoder, using various methods.
  • for example, the standard method may be used.
  • IC information (for example, a constant to be added to the illumination value for all pixels in an entire block) is determined from one or more neighboring blocks (macroblock or sub-macroblock) in the same picture (same view, same temporal instant) as the current block.
  • the reconstructions of the adjacent blocks are used.
  • IC information is determined from blocks in the same view at a different temporal instant (motion may also be accounted for), or blocks from a different view at the same temporal instant (disparity vector(s) may be used to account for differences in the views).
  • IC information is determined based on a computation of average intensity of an entire picture, compared with average intensity of a neighboring view at the same temporal instant.
  • the derivation of IC can be based on the IC information of inter-view prediction with a disparity vector and/or IC information of temporal prediction with motion of the corresponding macroblock in the neighboring view at the same temporal instant.
  • mb_ic_flag is set to one if either the disparity-based or the motion-based IC flag is set to one.
  • the dpcm value is a function of the disparity-based and motion-based IC information. In one embodiment, the dpcm value can be set to the sum of the two.
  • we can also derive IC information using previously processed/decoded IC information in spatially and/or temporally neighboring blocks. Whether to enable IC, and the amount of IC used for the current block, can be derived from the neighboring IC offsets by averaging, median filtering, and so forth.
  • the derivation of IC can be based on median filtering, averaging, and/or so forth of neighboring illumination compensation offsets.
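  • A sketch of one such derivation using median filtering of the neighboring offsets; the enable-threshold rule below is an invented illustration, not taken from the source:

```python
def derive_ic_from_neighbors(neighbor_offsets, enable_threshold=1):
    """Derive (ic_enabled, ic_offset) for the current block from the IC
    offsets of spatially/temporally neighboring blocks.

    The median filter and the threshold rule are illustrative choices;
    the text only names median filtering and averaging as candidates."""
    if not neighbor_offsets:
        return False, 0
    ordered = sorted(neighbor_offsets)
    median = ordered[len(ordered) // 2]
    # Enable IC only if the derived offset is non-negligible (assumption).
    return abs(median) >= enable_threshold, median

# Example: neighbors with offsets [3, 5, 4] yield (True, 4).
```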
  • CC may be used with motion skip mode, with or without IC.
  • CC may be determined in any of various methods, such as methods that correspond to the IC methods and implementations discussed in this application.
  • the syntax for explicitly signaling CC values, or for explicitly signaling the use of CC can be, for example, analogous to the syntax discussed for signaling IC.
  • In FIG. 3, an exemplary video encoder is indicated generally by the reference numeral 300.
  • the encoder 300 includes a combiner 305 having an output in signal communication with an input of a transformer and quantizer 310.
  • An output of the transformer and quantizer 310 is connected in signal communication with an input of an entropy coder 315 and an input of an inverse quantizer and inverse transformer 320.
  • An output of the inverse quantizer and inverse transformer 320 is connected in signal communication with a first non-inverting input of a combiner 325.
  • An output of the combiner 325 is connected in signal communication with an input of a deblocking filter 335 (for deblocking a reconstructed picture).
  • An output of the deblocking filter 335 is connected in signal communication with an input of a reference picture buffer 340.
  • An output of the reference picture buffer 340 is connected in signal communication with a first input of a differential value of illumination compensation (DVIC) calculator and/or differential value of color compensation (DVCC) calculator 360.
  • An output of the DVIC calculator and/or DVCC calculator 360 is connected in signal communication with a first input of a motion estimator and compensator for illumination and/or color 355, with a first input of a motion compensator 330, and with a non-inverting input of a combiner 350.
  • An output of the motion estimator and compensator for illumination and/or color 355 is connected in signal communication with a second input of the motion compensator 330.
  • An output of the motion compensator 330 is connected in signal communication with a second non-inverting input of the combiner 325 and an inverting input of the combiner 305.
  • An output of a DVIC predictor and/or DVCC predictor 370 is connected in signal communication with an inverting input of the combiner 350.
  • a non-inverting input of the combiner 305, a second input of the motion estimator and compensator for illumination and/or color 355, and a second input of the DVIC calculator and/or DVCC calculator 360 are available as inputs to the encoder 300, for receiving input video sequences.
  • An output of the entropy coder 315 is available as an output of the encoder 300, for outputting encoded residual signals.
  • the output of the motion estimator and compensator for illumination and/or color 355 is available as an output of the encoder 300, for outputting a motion vector(s).
  • An output of the combiner 350 is available as an output of the encoder 300, for outputting dpcm_of_dvic and/or dpcm_of_dvcc (dpcm of offset).
  • An output of a mode selector 365 is available as an output of the encoder 300, for outputting mb_ic_flag and/or mb_cc_flag for 16x16 and direct mode.
  • the output of the motion compensator 330 provides a motion compensated prediction.
  • In FIG. 4, an exemplary video decoder is indicated generally by the reference numeral 400.
  • the decoder 400 includes an entropy decoder 405 having an output connected in signal communication with an input of an inverse quantizer and inverse transformer 410.
  • An output of the inverse quantizer and inverse transformer 410 is connected in signal communication with a first non-inverting input of a combiner 415.
  • An output of the combiner 415 is connected in signal communication with an input of a deblocking filter 420.
  • An output of the deblocking filter 420 is connected in signal communication with an input of a reference picture buffer 435.
  • An output of the reference picture buffer 435 is connected in signal communication with a first input of a motion compensator for illumination and/or color 425.
  • An output of the motion compensator for illumination and/or color 425 is connected in signal communication with a second non-inverting input of a combiner 415.
  • An output of a differential value of illumination compensation (DVIC) predictor and/or differential value of color compensation (DVCC) predictor 430 is connected in signal communication with a first non-inverting input of a combiner 470.
  • An input of the entropy decoder 405 is available as an input of the decoder 400, for receiving an input video bitstream.
  • a second input of the motion compensator for illumination and/or color 425 is available as an input of the decoder 400, for receiving a motion vector(s).
  • a third input of the motion compensator for illumination and/or color 425 is available as an input of the decoder 400, for receiving mb_ic_flag and/or mb_cc_flag.
  • a second non-inverting input of the combiner 470 is available as an input of the decoder 400, for receiving dpcm_of_dvic and/or dpcm_of_dvcc.
  • the output of the deblocking filter 420 is available as an output of the decoder 400, for providing reconstructed frames.
  • In FIG. 5, which refers generally to both FIGS. 5A and 5B, an exemplary method for encoding macroblocks is indicated generally by the reference numeral 500.
  • the method 500 includes a start block 502 that passes control to a function block 504.
  • the function block 504 reads the encoder configuration file, and passes control to a function block 506.
  • the function block 506 sets anchor and non-anchor picture references in a Sequence Parameter Set (SPS) extension, and passes control to a function block 508.
  • the function block 508 lets the number of views be equal to a variable N, sets a variable i and a variable j equal to zero, and passes control to a decision block 510.
  • the decision block 510 determines whether or not the current value of the variable i is less than the current value of the variable N. If so, then control is passed to a decision block 512. Otherwise, control is passed to a decision block 538.
  • the decision block 512 determines whether or not the current value of the variable j is less than the number of pictures in view i. If so, then control is passed to a function block 514. Otherwise, control is passed to a function block 544.
  • the function block 514 starts encoding the current macroblock, and passes control to a function block 516.
  • the function block 516 checks macroblock modes, and passes control to a function block 518.
  • the function block 518 checks motion skip macroblock mode without illumination compensation, and passes control to a function block 520.
  • the function block 520 checks motion skip mode with illumination compensation (to determine whether illumination compensation information is to be obtained from disparity/motion information from a neighboring view at a same temporal location or whether illumination compensation information is to be from previously decoded illumination compensation information of a spatial/temporal neighboring block), and passes control to a decision block 522.
  • the decision block 522 determines whether or not motion skip is the best mode. If so, then control is passed to a function block 524. Otherwise, control is passed to a function block 546.
  • the function block 524 sets motion_skip_flag equal to one, and passes control to a decision block 526.
  • the decision block 526 determines whether or not illumination compensation is enabled. If so, then control is passed to a function block 528. Otherwise, control is passed to a function block 530.
  • the function block 528 sets mb_ic_flag equal to one, sets dpcm_of_dvic, and passes control to a function block 530.
  • the function block 530 encodes the macroblock, and passes control to a decision block 532.
  • the decision block 532 determines whether or not all macroblocks have been encoded. If so, then control is passed to a function block 534. Otherwise, control is returned to the function block 514 to start encoding another MB.
  • the function block 534 increments the variable j, and passes control to a function block 536.
  • the function block 536 increments frame_num and picture order count (POC), and returns control to the decision block 512.
  • the decision block 538 determines whether or not the SPS, PPS, and/or VPS (and/or any other syntax structure and/or syntax element that is used for the purposes of the present principles) are to be sent in-band. If so, then control is passed to a function block 540. Otherwise, control is passed to a function block 548.
  • the function block 540 sends the SPS, PPS, and/or VPS in-band, and passes control to a function block 542.
  • the function block 542 writes the bitstream to a file or streams the bitstream over a network, and passes control to an end block 599.
  • the function block 548 sends the SPS, PPS, and/or VPS out-of-band, and passes control to the function block 542.
  • the function block 544 increments the variable i, resets frame_num and POC, and returns control to the decision block 510.
  • the function block 546 sets motion_skip_flag equal to zero, and passes control to the function block 530.
  • In FIG. 6, which refers generally to both FIGS. 6A and 6B, an exemplary method for decoding macroblocks is indicated generally by the reference numeral 600.
  • the method 600 includes a start block 602 that passes control to a function block 604.
  • the function block 604 parses view_id from the SPS, PPS, VPS, slice header and/or NAL unit header, and passes control to a function block 606.
  • the function block 606 parses other SPS parameters, and passes control to a decision block 608.
  • the decision block 608 determines whether or not the current picture needs decoding. If so, then control is passed to a decision block 610. Otherwise, control is passed to a function block 640.
  • the decision block 610 determines whether or not the current value of the picture order count (POC) is equal to a previous value of the POC. If so, then control is passed to a function block 612. Otherwise, control is passed to a function block 614.
  • the function block 612 sets view_num equal to zero, and passes control to the function block 614.
  • the function block 614 indexes view_id information at a high level to determine view coding order, increments view_num, and passes control to a decision block 616.
  • the decision block 616 determines whether or not the current picture is in the expected coding order. If so, then control is passed to a function block 618. Otherwise, control is passed to a function block 642.
  • the function block 618 parses the slice header, and passes control to a function block 622.
  • the function block 622 parses motion_skip_flag, and passes control to a decision block 624.
  • the decision block 624 determines whether or not motion_skip_flag is equal to one. If so, then control is passed to a function block 626. Otherwise, control is passed to a function block 620.
  • the function block 626 parses mb_ic_flag, and passes control to a decision block 628.
  • the decision block 628 determines whether or not mb_ic_flag is equal to one. If so, then control is passed to a function block 630. Otherwise, control is passed to the function block 632.
  • the function block 630 parses dpcm_of_dvic, and passes control to the function block 632.
  • the function block 632 decodes the macroblock, and passes control to a decision block 634.
  • the decision block 634 determines whether or not all macroblocks are done (have been decoded). If so, then control is passed to a function block 636. Otherwise, control is returned to the function block 622.
  • the function block 636 inserts the current picture in the decoded picture buffer (DPB), and passes control to a decision block 638.
  • the decision block 638 determines whether or not all pictures have been decoded. If so, then control is passed to an end block 699. Otherwise, control is returned to the function block 618.
  • the function block 640 gets the next picture, and returns control to the decision block 608.
  • the function block 642 conceals the picture, and passes control to the function block 640.
  • the function block 620 parses the macroblock mode, a motion vector (mv), and ref_idx, and passes control to the function block 632.
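  • The per-macroblock parsing order of blocks 620 through 630 can be sketched as follows, in the same hypothetical bit-reader style as the earlier parsing sketch (read_mb_mode() is likewise an invented helper):

```python
def parse_mb_layer(bs):
    """Parse one macroblock following blocks 620-630 of FIG. 6.
    bs.read_flag()/bs.read_se()/bs.read_mb_mode() are hypothetical
    bit-reader methods, not an actual reference-software API."""
    mb = {"motion_skip_flag": bs.read_flag()}           # block 622
    if mb["motion_skip_flag"]:                          # block 624
        mb["mb_ic_flag"] = bs.read_flag()               # block 626
        if mb["mb_ic_flag"]:                            # block 628
            mb["dpcm_of_dvic"] = bs.read_se()           # block 630
        # mb type, motion vectors, and ref_idx are inferred, not parsed
    else:
        # block 620: explicit macroblock mode, motion vector, ref_idx
        mb["mb_type"], mb["mv"], mb["ref_idx"] = bs.read_mb_mode()
    return mb
```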
  • In FIG. 7, which refers generally to both FIGS. 7A and 7B, another exemplary method for encoding macroblocks is indicated generally by the reference numeral 700.
  • the method 700 includes a start block 702 that passes control to a function block 704.
  • the function block 704 reads the encoder configuration file, and passes control to a function block 706.
  • the function block 706 sets anchor and non-anchor picture references in a Sequence Parameter Set (SPS) extension, and passes control to a function block 708.
  • the function block 708 lets the number of views be equal to a variable N, sets a variable i and a variable j equal to zero, and passes control to a decision block 710.
  • the decision block 710 determines whether or not the current value of the variable i is less than the current value of the variable N. If so, then control is passed to a decision block 712. Otherwise, control is passed to a decision block 738.
  • the decision block 712 determines whether or not the current value of the variable j is less than the number of pictures in view i. If so, then control is passed to a function block 714. Otherwise, control is passed to a function block 744.
  • the function block 714 starts encoding the current macroblock, and passes control to a function block 716.
  • the function block 716 checks macroblock modes, and passes control to a function block 718.
  • the function block 718 checks motion skip macroblock mode without illumination compensation, and passes control to a function block 720.
  • the function block 720 checks motion skip mode with illumination compensation (to determine whether illumination compensation information is to be obtained from disparity/motion information from a neighboring view at a same temporal location or whether illumination compensation information is to be from previously decoded illumination compensation information of a spatial/temporal neighboring block), and passes control to a decision block 722.
  • the decision block 722 determines whether or not motion skip is the best mode. If so, then control is passed to a function block 724. Otherwise, control is passed to a function block 746.
  • the function block 724 sets motion_skip_flag equal to one, and passes control to a decision block 726.
  • the decision block 726 determines whether or not illumination compensation is enabled. If so, then control is passed to a function block 728. Otherwise, control is passed to a function block 730.
  • the function block 728 sets mb_ic_flag equal to one, and passes control to a function block 730.
  • the function block 730 encodes the macroblock, and passes control to a decision block 732.
  • the decision block 732 determines whether or not all macroblocks have been encoded. If so, then control is passed to a function block 734. Otherwise, control is returned to the function block 714 to start encoding another MB.
  • the function block 734 increments the variable j, and passes control to a function block 736.
  • the function block 736 increments frame_num and picture order count (POC), and returns control to the decision block 712.
  • the decision block 738 determines whether or not the SPS, PPS, and/or VPS (and/or any other syntax structure and/or syntax element that is used for the purposes of the present principles) are to be sent in-band. If so, then control is passed to a function block 740. Otherwise, control is passed to a function block 748.
  • the function block 740 sends the SPS, PPS, and/or VPS in-band, and passes control to a function block 742.
  • the function block 742 writes the bitstream to a file or streams the bitstream over a network, and passes control to an end block 799.
  • the function block 748 sends the SPS, PPS, and/or VPS out-of-band, and passes control to the function block 742.
  • the function block 744 increments the variable i, resets frame_num and POC, and returns control to the decision block 710.
  • the function block 746 sets motion_skip_flag equal to zero, and passes control to the function block 730.
  • In FIG. 8, which refers generally to both FIGS. 8A and 8B, another exemplary method for decoding macroblocks is indicated generally by the reference numeral 800.
  • the method 800 includes a start block 802 that passes control to a function block 804.
  • the function block 804 parses view_id from the SPS, PPS, VPS, slice header, and/or NAL unit header, and passes control to a function block 806.
  • the function block 806 parses other SPS parameters, and passes control to a decision block 808.
  • the decision block 808 determines whether or not the current picture needs decoding. If so, then control is passed to a decision block 810. Otherwise, control is passed to a function block 840.
  • the decision block 810 determines whether or not the current value of the picture order count (POC) is equal to a previous value of the POC. If so, then control is passed to a function block 812. Otherwise, control is passed to a function block 814.
  • the function block 812 sets view_num equal to zero, and passes control to the function block 814.
  • the function block 814 indexes view_id information at a high level to determine view coding order, increments view_num, and passes control to a decision block 816.
  • the decision block 816 determines whether or not the current picture is in the expected coding order. If so, then control is passed to a function block 818. Otherwise, control is passed to a function block 842.
  • the function block 818 parses the slice header, and passes control to a function block 822.
  • the function block 822 parses motion_skip_flag, and passes control to a decision block 824.
  • the decision block 824 determines whether or not motion_skip_flag is equal to one. If so, then control is passed to a function block 826. Otherwise, control is passed to a function block 820.
  • the function block 826 parses mb_ic_flag, and passes control to a decision block 828.
  • the decision block 828 determines whether or not mb_ic_flag is equal to one. If so, then control is passed to a function block 829. Otherwise, control is passed to the function block 832.
  • the function block 829 determines an illumination compensation value from the neighboring macroblock, and passes control to the function block 832.
  • the function block 832 decodes the macroblock, and passes control to a decision block 834.
  • the decision block 834 determines whether or not all macroblocks are done (have been decoded). If so, then control is passed to a function block 836. Otherwise, control is returned to the function block 822.
  • the function block 836 inserts the current picture in the decoded picture buffer (DPB), and passes control to a decision block 838.
  • the decision block 838 determines whether or not all pictures have been decoded. If so, then control is passed to an end block 899. Otherwise, control is returned to the function block 818.
  • the function block 840 gets the next picture, and returns control to the decision block 808.
  • the function block 842 conceals the picture, and passes control to the function block 840.
  • the function block 820 parses the macroblock mode, a motion vector (mv), and ref_idx, and passes control to the function block 832.
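  • Method 800 differs from method 600 chiefly at block 829: no dpcm_of_dvic is parsed, and the IC value is instead derived at the decoder. A sketch of that variant, with the derivation reduced (as an illustrative simplification) to inheriting a neighboring macroblock's value:

```python
def parse_mb_layer_implicit_ic(bs, neighbor_dvic):
    """Variant of parse_mb_layer for FIG. 8: when mb_ic_flag is 1,
    the IC value is derived (block 829) rather than parsed.
    bs.read_flag()/bs.read_mb_mode() are hypothetical helpers."""
    mb = {"motion_skip_flag": bs.read_flag()}           # block 822
    if mb["motion_skip_flag"]:
        mb["mb_ic_flag"] = bs.read_flag()               # block 826
        if mb["mb_ic_flag"]:                            # block 828
            mb["dvic"] = neighbor_dvic                  # block 829: derived, not parsed
    else:
        # block 820: explicit macroblock mode, motion vector, ref_idx
        mb["mb_type"], mb["mv"], mb["ref_idx"] = bs.read_mb_mode()
    return mb
```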
  • the video processing device 900 may be, for example, a set top box or other device that receives encoded video and provides, for example, decoded video for display to a user or for storage. Thus, the device 900 may provide its output to a television, computer monitor, or a computer or other processing device.
  • the device 900 includes a front-end device 905 and a decoder 910.
  • the front-end device 905 may be, for example, a receiver adapted to receive a program signal having a plurality of bitstreams representing encoded pictures, and to select one or more bitstreams for decoding from the plurality of bitstreams.
  • Typical receivers perform functions such as, for example, one or more of receiving a modulated and encoded data signal, demodulating the data signal, decoding one or more encodings (for example, channel coding and/or source coding) of the data signal, and/or error-correcting the data signal.
  • the front-end device 905 may receive the program signal from, for example, an antenna (not shown).
  • the front-end device 905 provides a received data signal to the decoder 910.
  • the decoder 910 receives a data signal 920.
  • the data signal 920 may include, for example, one or more AVC, SVC, or MVC compatible streams.
  • the decoder 910 decodes all or part of the received signal 920 and provides as output a decoded video signal 930.
  • the decoded video 930 is provided to a selector 950.
  • the device 900 also includes a user interface 960 that receives a user input 970.
  • the user interface 960 provides a picture selection signal 980, based on the user input 970, to the selector 950.
  • the picture selection signal 980 and the user input 970 indicate which of multiple pictures, sequences, scalable versions, views, or other selections of the available decoded data a user desires to have displayed.
  • the selector 950 provides the selected picture(s) as an output 990.
  • the selector 950 uses the picture selection information 980 to select which of the pictures in the decoded video 930 to provide as the output 990.
  • the selector 950 includes the user interface 960, and in other implementations no user interface 960 is needed because the selector 950 receives the user input 970 directly without a separate interface function being performed.
  • the selector 950 may be implemented in software or as an integrated circuit, for example.
  • the selector 950 is incorporated with the decoder 910, and in another implementation the decoder 910, the selector 950, and the user interface 960 are all integrated.
  • front-end 905 receives a broadcast of various television shows and selects one for processing. The selection of one show is based on user input of a desired channel to watch. Although the user input to front-end device 905 is not shown in FIG. 9, front-end device 905 receives the user input 970.
  • the front-end 905 receives the broadcast and processes the desired show by demodulating the relevant part of the broadcast spectrum, and decoding any outer encoding of the demodulated show.
  • the front-end 905 provides the decoded show to the decoder 910.
  • the decoder 910 is an integrated unit that includes devices 960 and 950.
  • the decoder 910 thus receives the user input, which is a user-supplied indication of a desired view to watch in the show.
  • the decoder 910 decodes the selected view, as well as any required reference pictures from other views, and provides the decoded view 990 for display on a television (not shown).
  • the user may desire to switch the view that is displayed and may then provide a new input to the decoder 910.
  • the decoder 910 decodes both the old view and the new view, as well as any views that are in between the old view and the new view. That is, the decoder 910 decodes any views that are taken from cameras that are physically located in between the camera taking the old view and the camera taking the new view.
  • the front-end device 905 also receives the information identifying the old view, the new view, and the views in between. Such information may be provided, for example, by a controller (not shown in FIG. 9) having information about the locations of the views, or the decoder 910.
  • the decoder 910 provides all of these decoded views as output 990.
  • a postprocessor (not shown in FIG. 9) interpolates between the views to provide a smooth transition from the old view to the new view, and displays this transition to the user. After transitioning to the new view, the post-processor informs (through one or more communication links not shown) the decoder 910 and the front-end device 905 that only the new view is needed. Thereafter, the decoder 910 only provides as output 990 the new view.
  • Various implementations discussed in this application may provide one or more advantages. For example, by combining IC and/or CC with motion skip mode, coding efficiency may be increased and a bit savings may be realized.
  • Implementations may signal information using a variety of techniques including, but not limited to, SPS syntax, other high level syntax, non-high-level syntax, out-of-band information, and implicit signaling. Accordingly, although implementations described herein may be described in a particular context, such descriptions should in no way be taken as limiting the features and concepts to such implementations or contexts.
  • implementations may be implemented in either, or both, an encoder and a decoder.
  • the implementations described herein may be implemented in, for example, a method or process, an apparatus, or a software program. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed may also be implemented in other forms (for example, an apparatus or program).
  • An apparatus may be implemented in, for example, appropriate hardware, software, and firmware.
  • the methods may be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processing devices also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs”), and other devices that facilitate communication of information between end-users.
  • Implementations of the various processes and features described herein may be embodied in a variety of different equipment or applications, particularly, for example, equipment or applications associated with data encoding and decoding.
  • equipment include video coders, video decoders, video codecs, web servers, set-top boxes, laptops, personal computers, cell phones, PDAs, and other communication devices.
  • the equipment may be mobile and even installed in a mobile vehicle.
  • the methods may be implemented by instructions being performed by a processor, and such instructions may be stored on a processor-readable medium such as, for example, an integrated circuit, a software carrier, or other storage device such as, for example, a hard disk, a compact diskette, a random access memory ("RAM"), or a read-only memory ("ROM").
  • the instructions may form an application program tangibly embodied on a processor-readable medium. Instructions may be, for example, in hardware, firmware, software, or a combination. Instructions may be found in, for example, an operating system, a separate application, or a combination of the two.
  • a processor may be characterized, therefore, as, for example, both a device configured to carry out a process and a device that includes a computer readable medium having instructions for carrying out a process.
  • implementations may also produce a signal formatted to carry information that may be, for example, stored or transmitted.
  • the information may include, for example, instructions for performing a method, or data produced by one of the described implementations.
  • a signal may be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal.
  • the formatting may include, for example, encoding a data stream and modulating a carrier with the encoded data stream.
  • the information that the signal carries may be, for example, analog or digital information.
  • the signal may be transmitted over a variety of different wired or wireless links, as is known.
  • a signal may be formatted to carry as data the values of the syntax described in one or more implementations, or even the syntax instructions themselves if the syntax is being transmitted, for example.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

There are provided methods and apparatus for, for example, combining motion skip mode with illumination compensation and/or color compensation. A particular method includes accessing at least a portion of an encoded picture. The portion has been encoded using motion information determined for a different portion (624), and using one or more of illumination compensation or color compensation (630). The method further includes decoding the portion of the encoded picture (632).

Description

VIDEO CODING
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application Serial No. 60/913,720, titled "Inter-view Prediction with Different Resolution Reference Picture", and filed April 24, 2007, which is incorporated by reference herein in its entirety for all purposes.
TECHNICAL FIELD
The present principles relate generally to video encoding and decoding.
BACKGROUND
It has been widely recognized that multi-view video coding (MVC) is a key technology that serves a wide variety of applications. Such applications include free-viewpoint and 3D video applications, home entertainment, and surveillance. In those multi-view applications, the amount of video data involved can be enormous. Thus, there exists the desire for efficient compression technologies to improve the coding efficiency of current video coding solutions performing, for example, simulcast of independent views.
SUMMARY
According to a general aspect, at least a portion of an encoded picture is accessed. The portion has been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation. The portion of the encoded picture is decoded.
According to another general aspect, at least a portion of a picture to be encoded is accessed. Motion information for a different portion of the picture is identified. The portion of the picture is encoded using the identified motion information and using one or more of illumination compensation or color compensation.
According to another general aspect, a video signal structure or a signal includes at least a portion of a picture. The portion has been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation. The structure or signal also includes information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion.
The details of one or more implementations are set forth in the accompanying drawings and the description below. Even if described in one particular manner, it should be clear that implementations may be configured or embodied in various manners. For example, an implementation may be performed as a method, or embodied as an apparatus configured to perform a set of operations, or embodied as an apparatus storing instructions for performing a set of operations, or embodied in a signal. Other aspects and features will become apparent from the following detailed description considered in conjunction with the accompanying drawings and the claims.
DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram for an implementation of a decoding structure for macroblock-based illumination compensation;
FIG. 2 is a block diagram for an implementation of an encoding structure for macroblock-based illumination compensation;
FIG. 3 is a block diagram for an exemplary video encoder to which the present principles may be applied, in accordance with an embodiment of the present principles;
FIG. 4 is a block diagram for an exemplary video decoder to which the present principles may be applied, in accordance with an embodiment of the present principles;
FIGS. 5A and 5B are a flow diagram for an exemplary method for encoding macroblocks, in accordance with an embodiment of the present principles;
FIGS. 6A and 6B are a flow diagram for an exemplary method for decoding macroblocks, in accordance with an embodiment of the present principles;
FIGS. 7A and 7B are a flow diagram for another exemplary method for encoding macroblocks, in accordance with an embodiment of the present principles;
FIGS. 8A and 8B are a flow diagram for another exemplary method for decoding macroblocks, in accordance with an embodiment of the present principles; and
FIG. 9 is a block diagram for an implementation of a receiving device for decoding pictures.
DETAILED DESCRIPTION
At least one implementation described in this application is directed to combining motion skip mode with illumination compensation and/or color compensation. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles, of one or more described implementations, and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to "one embodiment" (or "one implementation") or "an embodiment" (or "an implementation") of the present principles means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase "in one embodiment" or "in an embodiment" appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
It is to be appreciated that the use of the terms "and/or" and "at least one of", for example, in the cases of "A and/or B" and "at least one of A and B", is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of both options (A and B). As a further example, in the cases of "A, B, and/or C" and "at least one of A, B, and C", such phrasing is intended to encompass the selection of the first listed option (A) only, or the selection of the second listed option (B) only, or the selection of the third listed option (C) only, or the selection of the first and the second listed options (A and B) only, or the selection of the first and third listed options (A and C) only, or the selection of the second and third listed options (B and C) only, or the selection of all three options (A and B and C). This may be extended, as readily apparent by one of ordinary skill in this and related arts, for as many items listed.
Moreover, it is to be appreciated that while one or more other embodiments of the present principles are described herein with respect to the multi-view video coding extension of the MPEG-4 AVC standard, the present principles are not limited to solely this extension and/or this standard and, thus, may be utilized with respect to other video coding standards, recommendations, and extensions thereof relating to multi-view video coding, while maintaining the spirit of the present principles. Multi-view video coding (MVC) is the compression framework for the encoding of multi-view sequences. A Multi-view Video Coding (MVC) sequence is a set of two or more video sequences that capture the same scene from different viewpoints.
Further, as used herein, "high level syntax" refers to syntax present in the bitstream that resides hierarchically above the macroblock layer. For example, high level syntax, as used herein, may refer to, but is not limited to, syntax at the slice header level, Supplemental Enhancement Information (SEI) level, Picture Parameter Set (PPS) level, Sequence Parameter Set (SPS) level, View Parameter Set (VPS) level, and Network Abstraction Layer (NAL) unit header level.
Also, as used herein, "low level syntax" refers to syntax below the slice layer that resides in the macroblock layer. For example, low level syntax, as used herein, may refer to, but is not limited to, mb_ic_flag and dpcm_of_dvic. Of course, the present principles are not limited to solely these low level syntaxes and, thus, these and other low level syntaxes may be used in accordance with the present principles, while maintaining the spirit of the present principles.
We propose and describe various embodiments that address a variety of problems and issues. In an embodiment, we propose how to combine illumination compensation (IC) and/or color compensation (CC) with motion skip mode for multi-view video coding. In an embodiment, we propose to derive IC or CC for motion skip mode. In another embodiment, we propose to explicitly code IC or CC for motion skip mode.
Since a multi-view video source includes multiple views of the same scene, there exists a high degree of correlation between the multiple view images. Therefore, view redundancy can be exploited in addition to temporal redundancy, and this is achieved by performing view prediction across the different views. In a practical scenario, multi-view video systems involving a large number of cameras will be built using heterogeneous cameras, or cameras that have not been perfectly calibrated. This leads to differences in luminance and chrominance when the same parts of a scene are viewed with different cameras. Moreover, camera distance and positioning also affect illumination, in the sense that the same surface may reflect light differently when perceived from different angles. Under these scenarios, luminance and chrominance differences will decrease the efficiency of cross-view prediction. In JMVM (Joint Multiview Video Model), local illumination compensation (IC) methods for multi-view video coding were adopted into the software as follows: an offset for each signal block is prediction coded and signaled in order to compensate for the illumination differences in cross-view prediction.
Motion skip mode has been proposed to improve the coding efficiency of multi-view video coding. Motion skip mode originated from the observation that the motion between two neighboring views is similar. In the proposed method, the motion information is inferred from the corresponding macroblock in the frame of the neighboring view at the same temporal index. To compensate for the inter-view difference generated by the camera geometry, a disparity vector is applied to find the corresponding macroblock in the neighboring view. However, the proposed method does not address how motion skip mode works when IC (or CC) is enabled.
At least one embodiment provides a solution for how to combine motion skip mode with illumination compensation and color compensation tools for multi-view video coding. To illustrate a particular embodiment, we describe how the proposed method could be applied to a multi-view extension of the MPEG-4 AVC Standard, which enables both temporal and cross-view prediction. For the following discussion, we will only use IC as an example. However, it is to be appreciated that the same methodology applies to CC. Moreover, in other embodiments, IC and CC may be combined. These and other variations and implementations of the present principles are readily identified by one of ordinary skill in this and related arts, given the teachings of the present principles provided herein, while maintaining the spirit of the present principles.
Illumination Compensation in JMVM
In JMVM, IC is adopted into the software as a new coding tool to improve the coding efficiency. Compared to the MPEG-4 AVC Standard, the proposed method employs predictive coding for the direct current (DC) component of Inter prediction residues. The predictor for the illumination change is formed from neighboring blocks to exploit the strong spatial correlation of illumination differences.
In this coding tool, the following flag is added to the slice header to indicate whether IC is enabled for this slice:
ic_enable equal to 1 specifies that IC is enabled for the current slice. ic_enable equal to 0 specifies that IC is not enabled for the current slice.
In the macroblock prediction syntaxes, several IC related syntaxes are introduced as follows:
mb_ic_flag equal to 1 specifies that IC is used for the current macroblock. mb_ic_flag equal to 0 specifies that IC is not used for the current macroblock. The default value for mb_ic_flag is zero.
dpcm_of_dvic specifies the amount of IC offset to be used for the current macroblock.
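For purposes of illustration only, the following C sketch shows one way the slice-level and macroblock-level IC syntax described above might be parsed. The bitstream reader, the one-bit code assumed for the flags, and the signed Exp-Golomb code assumed for dpcm_of_dvic are conveniences of the sketch, not a statement of how the JMVM software entropy-codes these elements.

    #include <stdint.h>

    typedef struct { const uint8_t *buf; uint32_t bitpos; } Bitstream;

    /* Read a single bit from the bitstream. */
    static int read_u1(Bitstream *bs) {
        int bit = (bs->buf[bs->bitpos >> 3] >> (7 - (bs->bitpos & 7))) & 1;
        bs->bitpos++;
        return bit;
    }

    /* Signed Exp-Golomb read, assumed here for the DPCM-coded IC offset. */
    static int read_se(Bitstream *bs) {
        int leading_zeros = 0;
        while (read_u1(bs) == 0) leading_zeros++;
        uint32_t info = 1;
        for (int i = 0; i < leading_zeros; i++)
            info = (info << 1) | (uint32_t)read_u1(bs);
        uint32_t code_num = info - 1;
        return (code_num & 1) ? (int)((code_num + 1) >> 1)
                              : -(int)(code_num >> 1);
    }

    typedef struct { int ic_enable; } SliceHeader;
    typedef struct { int mb_ic_flag; int dpcm_of_dvic; } MacroblockIC;

    /* Parse the per-macroblock IC syntax when IC is enabled for the slice. */
    void parse_mb_ic_syntax(Bitstream *bs, const SliceHeader *sh, MacroblockIC *mb) {
        mb->mb_ic_flag = 0;       /* the default value for mb_ic_flag is zero */
        mb->dpcm_of_dvic = 0;
        if (sh->ic_enable) {
            mb->mb_ic_flag = read_u1(bs);
            if (mb->mb_ic_flag)
                mb->dpcm_of_dvic = read_se(bs);
        }
    }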
Illumination compensation aims to compensate for local illumination changes between pictures in multi-view sequences. Turning to FIG. 1, the decoding structure of macroblock-based illumination compensation is indicated generally by the reference numeral 100.
The decoding structure 100 includes an entropy decoder 105 having an output in signal communication with an input of an inverse quantizer/inverse transformer 110. An output of the inverse quantizer/inverse transformer 110 is connected in signal communication with a first non-inverting input of a combiner 115. An output of the combiner 115 is connected in signal communication with an input of a reference picture buffer 125. An output of the reference picture buffer 125 is connected in signal communication with a first input of a motion compensator/illumination compensator 120. An output of the motion compensator/illumination compensator 120 is connected in signal communication with a second non-inverting input of the combiner 115. An output of a combiner 135 is connected in signal communication with a fourth input of the motion compensator/illumination compensator 120.
An output of a DVIC (differential value of illumination compensation) predictor 130 is connected in signal communication with a first non-inverting input of the combiner 135.
An input of the entropy decoder 105 is available as an input of the decoding structure 100, for receiving a bitstream.
A second input of the motion compensator/illumination compensator 120 is available as an input of the decoding structure 100, for receiving a mb_ic_flag.
A third input of the motion compensator/illumination compensator 120 is available as an input of the decoding structure 100, for receiving a motion vector(s).
A second non-inverting input of the combiner 135 is available as an input of the decoding structure 100, for receiving dpcm_of_dvic.
The output of the combiner 115 is also available as an output of the decoding structure, for outputting reconstructed frames.
If mb_ic_flag is equal to 0 for the current macroblock, then the IC motion compensation (MC) is not performed, i.e., the conventional decoding process is performed. Otherwise, the DPCM (differential pulse code modulation) value of DVIC (dpcm_of_dvic) is used, where DVIC stands for Differential Value of Illumination Compensation, and the proposed illumination compensation motion compensation is performed as follows:
DVIC = dpcm_of_dvic + predDVIC
f'(i, j) = {MR_R''(x', y', i, j) + r(i + x', j + y')} + DVIC    (1)
where f'(i, j) represents the reconstructed partition before deblocking, r(i, j) represents the reference frame, MR_R''(x', y', i, j) represents the reconstructed residual signal, and (x', y') is the motion vector of the partition. The predictor predDVIC is obtained from the neighboring blocks.
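As a minimal sketch of equation (1), the following C function applies a flat per-block DVIC offset during motion compensation of a 16x16 luma block. The function and variable names, and the assumptions of full-pel motion and a pre-computed predDVIC, are illustrative only.

    #include <stdint.h>

    #define MB 16

    static uint8_t clip255(int v) { return v < 0 ? 0 : (v > 255 ? 255 : (uint8_t)v); }

    /* rec:    reconstructed partition f'(i, j), before deblocking
       resid:  reconstructed residual MR_R''
       ref:    reference frame r(.,.), with row stride ref_stride
       (x, y): full-pel motion vector (x', y') of the block          */
    void ic_motion_compensate(uint8_t rec[MB][MB], const int resid[MB][MB],
                              const uint8_t *ref, int ref_stride,
                              int x, int y, int dpcm_of_dvic, int predDVIC)
    {
        int dvic = dpcm_of_dvic + predDVIC;  /* DVIC = dpcm_of_dvic + predDVIC */
        for (int j = 0; j < MB; j++)
            for (int i = 0; i < MB; i++)
                rec[j][i] = clip255(resid[j][i]
                                    + ref[(j + y) * ref_stride + (i + x)]
                                    + dvic);
    }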
Turning to FIG. 2, the encoding structure of macroblock-based illumination compensation is indicated generally by the reference numeral 200.
The encoding structure 200 includes a combiner 205 having an output in signal communication with an input of a transformer/quantizer 210. An output of the transformer/quantizer 210 is connected in signal communication with an input of an entropy coder 220 and an input of an inverse quantizer/inverse transformer 215. An output of the inverse quantizer/inverse transformer 215 is connected in signal communication with a first non-inverting input of a combiner 225. An output of the combiner 225 is connected in signal communication with an input of a reference picture buffer 230. An output of the reference picture buffer 230 is connected in signal communication with a first input of a DVIC calculator 250. An output of the DVIC calculator 250 is connected in signal communication with a first input of an ICA (Illumination change-adaptive) motion estimator 255, a first input of a motion compensated predictor 260, and a second non-inverting input of a combiner 240. An output of the ICA motion estimator 255 is connected in signal communication with a second input of the motion compensated predictor 260. An output of the motion compensated predictor 260 is connected in signal communication with an inverting input of the combiner 205 and a second non-inverting input of the combiner 225.
An output of a DVIC predictor 235 is connected in signal communication with a first inverting input of the combiner 240.
An output of the combiner 240 is available as an output of the encoding structure 200, for outputting dpcm_of_dvic (dpcm of offset).
A first non-inverting input of the combiner 205, a second input of the ICA motion estimator 255, and a second input of the DVIC calculator 250 are available as inputs of the encoding structure 200, for receiving input sequences.
An output of the entropy coder 220 is available as an output of the encoding structure 200, for outputting encoded residual signals.
The output of the ICA motion estimator 255 is also available as an output of the encoding structure 200, for outputting a motion vector(s).
An output of a mode selector 245 is available as an output of the encoding structure 200, for outputting mb_ic_flag for 16x16 and DIRECT modes.
Motion Skip Mode for MVC
Motion skip mode infers motion information, such as macroblock type, motion vectors, and reference indices, directly from the corresponding macroblock in the neighboring view at the same temporal instant. The method is decomposed into two stages, i.e., a search for the corresponding macroblock and a derivation of the motion information. In the first stage, the method locates the corresponding macroblock in the picture of a neighboring view by means of a global disparity vector (GDV), which indicates the corresponding position in that picture. However, implementations presented here may use one or more disparity vectors that apply, for example, to all views, to only one view, or to only one view for only a particular sequence of pictures.
The GDV is measured in macroblock-size units between the current picture and the picture of the neighboring view. The GDV can be estimated and decoded periodically, for instance, every anchor picture. In that case, the GDV of a non-anchor picture is interpolated using the GDVs of the recent anchor pictures. In the second stage, motion information is derived from (for example, copied, calculated, or otherwise determined from) the corresponding macroblock in the picture of a neighboring view and is applied to the current macroblock. Motion skip mode is disabled when the current macroblock is in a picture of the base view or in an anchor picture, as defined in JMVM. This follows, for example, because the base view is AVC compatible, and motion skip mode is defined for MVC. Additionally, anchor pictures do not have temporal references, but rather have inter-view references, and so do not use motion skip mode to infer motion from temporal references.
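The following C sketch shows one plausible interpolation of the GDV of a non-anchor picture from the GDVs of the two nearest anchor pictures, consistent with the periodic estimation described above. The exact interpolation and rounding used by a given proposal may differ, so this is an assumption of the sketch.

    /* Linear interpolation of a GDV component (in macroblock-size units)
       for a non-anchor picture at poc_cur, between anchor pictures at
       poc_prev and poc_next whose GDVs are gdv_prev and gdv_next. */
    int interpolate_gdv(int gdv_prev, int poc_prev,
                        int gdv_next, int poc_next, int poc_cur)
    {
        int span = poc_next - poc_prev;
        if (span == 0) return gdv_prev;
        int num = (gdv_next - gdv_prev) * (poc_cur - poc_prev);
        int half = span / 2;
        int delta = (num >= 0) ? (num + half) / span : -((-num + half) / span);
        return gdv_prev + delta;   /* rounded to the nearest unit */
    }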
To indicate the use of motion skip mode to a decoder, a new flag, motion_skip_flag, is included in the header of macroblock layer syntax for MVC. If motion_skip_flag is turned on, the current macroblock derives the macroblock type, motion vectors and reference indices from the corresponding macroblock in the neighboring view.
As noted above, motion skip mode infers the motion information, such as the macroblock type, motion vectors, and reference indices, directly from the corresponding macroblock in the neighboring view at the same temporal instant. However, the application of motion skip mode is not explained for the case when IC is enabled.
Implementations
In at least one implementation, we propose how to combine motion skip mode with IC for MVC. The implementation enables the use of IC with motion skip mode, and combines the syntax for each of these features as part of achieving this combination. Other implementations use different syntax. The enabling of IC for motion skip mode can be based on the inferred motion information. For example, the inferred motion information will indicate what mode was used for that block, and IC may be enabled for 16x16 mode and skip mode as is done in the current Standard. In an embodiment, we can disable IC if motion skip mode is enabled. In another embodiment, we enable IC only if the size of the inferred macroblock type is 16x16 when motion skip mode is enabled (both rules are sketched below). For IC signaling of motion skip mode, we can explicitly send the IC information or implicitly derive the IC information. Alternatively, we can adaptively choose between explicit and implicit signaling. The IC information may include, for example, mb_ic_flag and/or DVIC (or dpcm_of_dvic). We can explicitly signal both mb_ic_flag and dpcm_of_dvic. Alternatively, we can explicitly signal only mb_ic_flag, and derive DVIC. Alternatively, we can derive mb_ic_flag and explicitly signal dpcm_of_dvic.
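The two enabling rules just mentioned can be written compactly as follows. This C sketch assumes the inferred macroblock type is available as a simple enumeration, which is an implementation choice and is not mandated by the description.

    typedef enum { MB_16x16, MB_16x8, MB_8x16, MB_8x8 } MbType;

    /* Rule 1: disable IC whenever motion skip mode is used. */
    int ic_allowed_rule1(int motion_skip_flag) {
        return !motion_skip_flag;
    }

    /* Rule 2: with motion skip, allow IC only for an inferred 16x16 type. */
    int ic_allowed_rule2(int motion_skip_flag, MbType inferred_type) {
        return !motion_skip_flag || inferred_type == MB_16x16;
    }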
In a first embodiment, we explicitly send IC information for motion skip mode. In the decoder part, we first derive the macroblock type, motion vectors, and reference indices directly from the corresponding macroblock in the neighboring view at the same temporal instant. Then we extract the IC information. If mb_ic_flag=1, we extract the dpcm_of_dvic information, and then perform motion compensation with IC. If mb_ic_flag=0, we perform motion compensation without IC. In the encoder part, for motion skip mode, we first derive the macroblock type, motion vectors, and reference indices directly from the corresponding macroblock in the neighboring view at the same temporal instant. Then we compute DVIC based on the motion information. We then test both modes, with IC enabled and with IC disabled, and choose the mode with the lower coding cost (or by using some other metric). We also compare motion skip mode with the other modes. If the final mode (for example, as determined by the coding cost or other metric) is motion skip mode with IC enabled, we send motion_skip_flag=1, mb_ic_flag=1, and the related dpcm_of_dvic information. If the final mode is motion skip mode with IC disabled, we send motion_skip_flag=1 and mb_ic_flag=0.
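A sketch of the decoder-side order of operations for this first embodiment follows. The helper functions stand in for the stages described above; they are assumptions of the sketch, not calls from any reference software.

    typedef struct { int mb_type; int mv_x, mv_y; int ref_idx; } MotionInfo;

    /* Placeholders for the stages described in the text. */
    extern MotionInfo derive_from_corresponding_mb(int view_id, int poc, int gdv);
    extern int  parse_mb_ic_flag(void);
    extern int  parse_dpcm_of_dvic(void);
    extern int  predict_dvic_from_neighbors(void);
    extern void motion_compensate(const MotionInfo *mi, int use_ic, int dvic);

    void decode_motion_skip_mb_explicit_ic(int view_id, int poc, int gdv)
    {
        /* Stage 1: infer type, motion vectors, and reference indices. */
        MotionInfo mi = derive_from_corresponding_mb(view_id, poc, gdv);

        /* Stage 2: extract the explicitly signaled IC information. */
        if (parse_mb_ic_flag()) {
            int dvic = parse_dpcm_of_dvic() + predict_dvic_from_neighbors();
            motion_compensate(&mi, 1, dvic);   /* motion compensation with IC */
        } else {
            motion_compensate(&mi, 0, 0);      /* motion compensation without IC */
        }
    }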
In a second embodiment, we derive the IC information for motion skip mode. That is, the IC information is implicit at the decoder. In the decoder part, we first derive the macroblock type, motion vectors, and reference indices directly from the corresponding macroblock in the neighboring view at the same temporal instant. Then we derive the IC information for mb_ic_flag. If mb_ic_flag=1, we then derive the DVIC information, followed by motion compensation with IC. If mb_ic_flag=0, we perform motion compensation without IC. In the encoder part, for motion skip mode, we first derive the macroblock type, motion vectors, and reference indices directly from the corresponding macroblock in the neighboring view at the same temporal instant. Then we derive mb_ic_flag. If mb_ic_flag=1, we derive DVIC. If mb_ic_flag=0, then DVIC is set to 0. We then compare motion skip mode with the other modes. If the final mode is motion skip mode with IC enabled, then we send motion_skip_flag=1. Otherwise, we send motion_skip_flag=0.
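The corresponding decoder-side sketch for this second embodiment derives the IC information instead of parsing it. Copying the IC information of the corresponding macroblock in the neighboring view is one possible derivation, as discussed below; the helper names are again illustrative.

    typedef struct { int mb_ic_flag; int dvic; } IcInfo;

    /* Placeholder: one possible derivation copies the IC information of the
       corresponding macroblock located via the (global) disparity vector. */
    extern IcInfo ic_info_of_corresponding_mb(int view_id, int poc, int gdv);
    extern void   motion_compensate_ic(int use_ic, int dvic);

    void decode_motion_skip_mb_implicit_ic(int view_id, int poc, int gdv)
    {
        IcInfo ic = ic_info_of_corresponding_mb(view_id, poc, gdv);
        if (!ic.mb_ic_flag)
            ic.dvic = 0;                 /* mb_ic_flag equal to 0: DVIC is 0 */
        motion_compensate_ic(ic.mb_ic_flag, ic.dvic);
    }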
We can determine IC information at the encoder and decoder using various methods. For example, the standard method may be used. In one implementation, IC information (for example, a constant to be added to the illumination value for all pixels in an entire block) is determined from one or more neighboring blocks (macroblock or sub-macroblock) in the same picture (same view, same temporal instant) as the current block. In one such implementation, the reconstructions of the adjacent blocks are used. In another implementation, IC information is determined from blocks in the same view at a different temporal instant (motion may also be accounted for), or from blocks in a different view at the same temporal instant (disparity vector(s) may be used to account for differences between the views). In another implementation, IC information is determined based on a computation of the average intensity of an entire picture, compared with the average intensity of a neighboring view at the same temporal instant, as sketched below.
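As a sketch of the last of these methods, a global offset can be formed from the difference of the mean luma of the current picture and that of the neighboring view's picture at the same temporal instant. The function below is illustrative only.

    #include <stdint.h>
    #include <stddef.h>

    /* Difference of mean luma between the current picture and the picture
       of the neighboring view at the same temporal instant, usable as a
       global IC offset. */
    int global_ic_offset(const uint8_t *cur, const uint8_t *neighbor_view,
                         size_t n_pixels)
    {
        long long sum_cur = 0, sum_nbr = 0;
        for (size_t i = 0; i < n_pixels; i++) {
            sum_cur += cur[i];
            sum_nbr += neighbor_view[i];
        }
        return (int)((sum_cur - sum_nbr) / (long long)n_pixels);
    }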
The derivation of IC can be based on the IC information of the inter-view prediction with a disparity vector and/or the IC information of the temporal prediction with motion of the corresponding macroblock in the neighboring view at the same temporal instant. In one embodiment, mb_ic_flag is set to one if either the disparity-based IC flag or the motion-based IC flag is set to 1. Further, the dpcm value is a function of the disparity-based and/or motion-based IC information. In one embodiment, the dpcm value can be set to the sum of the disparity-based and motion-based dpcm values.
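In C, the combination rule of this embodiment might read as follows, where the _disparity and _motion inputs are the IC flag and dpcm value associated with the inter-view and temporal predictions, respectively (the names are illustrative):

    /* mb_ic_flag is set if either the disparity-based or the motion-based
       IC flag of the corresponding macroblock is set. */
    int derive_mb_ic_flag(int ic_flag_disparity, int ic_flag_motion) {
        return ic_flag_disparity || ic_flag_motion;
    }

    /* One possible function of the two dpcm values: their sum. */
    int derive_dpcm_of_dvic(int dpcm_disparity, int dpcm_motion) {
        return dpcm_disparity + dpcm_motion;
    }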
We can also derive IC information using the previously processed/decoded IC information in the spatially and/or temporally neighboring blocks. Whether to enable IC, and the amount of IC used, for the current block can be derived from the neighboring IC offsets by averaging, median filtering, and so forth.
In an embodiment, the derivation of IC can be based on median filtering, averaging, and so forth, of neighboring illumination compensation offsets.
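A sketch of such a neighbor-based derivation follows, using median filtering over the offsets of the left, above, and above-right neighbors. The particular neighbor set is an assumption, borrowed from typical predictor construction; the description does not fix it.

    /* Median of three values. */
    static int median3(int a, int b, int c)
    {
        if (a > b) { int t = a; a = b; b = t; }
        if (b > c) { int t = b; b = c; c = t; }
        if (a > b) { int t = a; a = b; b = t; }
        return b;
    }

    /* Derive the current block's IC offset from previously decoded
       neighboring offsets; averaging would be an analogous alternative. */
    int derive_dvic_from_neighbors(int dvic_left, int dvic_above,
                                   int dvic_above_right)
    {
        return median3(dvic_left, dvic_above, dvic_above_right);
    }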
Additionally, all of the same concepts and implementations may be applied to the use of color compensation ("CC"). For example, CC may be used with motion skip mode, with or without IC. Further, CC may be determined in any of various methods, such as methods that correspond to the IC methods and implementations discussed in this application. Additionally, the syntax for explicitly signaling CC values, or for explicitly signaling the use of CC (for example, by using a flag), can be, for example, analogous to the syntax discussed for signaling IC.
Turning to FIG. 3, an exemplary video encoder is indicated generally by the reference numeral 300.
The encoder 300 includes a combiner 305 having an output in signal communication with an input of a transformer and quantizer 310. An output of the transformer and quantizer 310 is connected in signal communication with an input of an entropy coder 315 and an input of an inverse quantizer and inverse transformer 320. An output of the inverse quantizer and inverse transformer 320 is connected in signal communication with a first non-inverting input of a combiner 325. An output of the combiner 325 is connected in signal communication with an input of a deblocking filter 335 (for deblocking a reconstructed picture). An output of the deblocking filter 335 is connected in signal communication with an input of a reference picture buffer 340. An output of the reference picture buffer 340 is connected in signal communication with a first input of a differential value of illumination compensation (DVIC) calculator and/or differential value of color compensation (DVCC) calculator 360. An output of the DVIC calculator and/or DVCC calculator 360 is connected in signal communication with an input of a motion estimator and compensator for illumination and/or color 355, with a first input of a motion compensator 330, and with a non-inverting input of a combiner 350. An output of the motion estimator and compensator for illumination and/or color 355 is connected in signal communication with a second input of the motion compensator 330. An output of the motion compensator 330 is connected in signal communication with a second non-inverting input of the combiner 325 and an inverting input of the combiner 305.
An output of a DVIC predictor and/or DVCC predictor 370 is connected in signal communication with an inverting input of the combiner 350.
A non-inverting input of the combiner 305, a second input of the motion estimator and compensator for illumination and/or color 355, and a second input of the DVIC calculator and/or DVCC calculator 360 are available as inputs to the encoder 300, for receiving input video sequences.
An output of the entropy coder 315 is available as an output of the encoder 300, for outputting encoded residual signals. The output of the motion estimator and compensator for illumination and/or color 355 is available as an output of the encoder 300, for outputting a motion vector(s).
An output of the combiner 350 is available as an output of the encoder 300, for outputting dpcm_of_dvic and/or dpcm_of_dvcc (dpcm of offset).
An output of a mode selector 365 is available as an output of the encoder 300, for outputting mb_ic_flag and/or mb_cc_flag for 16x16 and direct mode.
The output of the motion compensator 330 provides a motion compensated prediction.
Turning to FIG. 4, an exemplary video decoder is indicated generally by the reference numeral 400.
The decoder 400 includes an entropy decoder 405 having an output connected in signal communication with an input of an inverse quantizer and inverse transformer 410. An output of the inverse quantizer and inverse transformer 410 is connected in signal communication with a first non-inverting input of a combiner 415. An output of the combiner 415 is connected in signal communication with an input of a deblocking filter 420. An output of the deblocking filter 420 is connected in signal communication with an input of a reference picture buffer 435. An output of the reference picture buffer 435 is connected in signal communication with a first input of a motion compensator for illumination and/or color 425. An output of the motion compensator for illumination and/or color 425 is connected in signal communication with a second non-inverting input of the combiner 415.
An output of a differential value of illumination compensation (DVIC) predictor and/or differential value of color compensation (DVCC) predictor 430 is connected in signal communication with a first non-inverting input of a combiner 470.
An input of the entropy decoder 405 is available as an input of the decoder 400, for receiving an input video bitstream.
A second input of the motion compensator for illumination and/or color 425 is available as an input of the decoder 400, for receiving a motion vector(s).
A third input of the motion compensator for illumination and/or color 425 is available as an input of the decoder 400, for receiving mb_ic_flag and/or mb_cc_flag.
A second non-inverting input of the combiner 470 is available as an input of the decoder 400, for receiving dpcm_of_dvic and/or dpcm_of_dvcc. The output of the deblocking filter 420 is available as an output of the decoder 400, for providing reconstructed frames.
Turning to FIG. 5, which refers generally to both FIGS. 5A and 5B, an exemplary method for encoding macroblocks is indicated generally by the reference numeral 500.
The method 500 includes a start block 502 that passes control to a function block 504. The function block 504 reads the encoder configuration file, and passes control to a function block 506. The function block 506 sets anchor and non-anchor picture references in a Sequence Parameter Set (SPS) extension, and passes control to a function block 508. The function block 508 lets the number of views be equal to a variable N, sets a variable i and a variable j equal to zero, and passes control to a decision block 510. The decision block 510 determines whether or not the current value of the variable i is less than the current value of the variable N. If so, then control is passed to a decision block 512. Otherwise, control is passed to a decision block 538.
The decision block 512 determines whether or not the current value of the variable j is less than the number of pictures in view i. If so, then control is passed to a function block 514. Otherwise, control is passed to a function block 544.
The function block 514 starts encoding the current macroblock, and passes control to a function block 516. The function block 516 checks macroblock modes, and passes control to a function block 518. The function block 518 checks motion skip macroblock mode without illumination compensation, and passes control to a function block 520. The function block 520 checks motion skip mode with illumination compensation (to determine whether illumination compensation information is to be obtained from disparity/motion information from a neighboring view at a same temporal location or whether illumination compensation information is to be from previously decoded illumination compensation information of a spatial/temporal neighboring block), and passes control to a decision block 522. The decision block 522 determines whether or not motion skip is the best mode. If so, then control is passed to a function block 524. Otherwise, control is passed to a function block 546.
The function block 524 sets motion_skip_flag equal to one, and passes control to a decision block 526. The decision block 526 determines whether or not illumination compensation is enabled. If so, then control is passed to a function block 528. Otherwise, control is passed to a function block 530.
The function block 528 sets mb_ic_flag equal to one, sets dpcm_of_dvic, and passes control to a function block 530.
The function block 530 encodes the macroblock, and passes control to a decision block 532. The decision block 532 determines whether or not all macroblocks have been encoded. If so, then control is passed to a function block 534. Otherwise, control is returned to the function block 514 to start encoding another MB.
The function block 534 increments the variable j, and passes control to a function block 536. The function block 536 increments frame_num and picture order count (POC), and returns control to the decision block 512.
The decision block 538 determines whether or not the SPS, PPS, and/or VPS (and/or any other syntax structure and/or syntax element that is used for the purposes of the present principles) are to be sent in-band. If so, then control is passed to a function block 540. Otherwise, control is passed to a function block 548.
The function block 540 sends the SPS, PPS, and/or VPS in-band, and passes control to a function block 542.
The function block 542 writes the bitstream to a file or streams the bitstream over a network, and passes control to an end block 599.
The function block 548 sends the SPS, PPS, and/or VPS out-of-band, and passes control to the function block 542.
The function block 544 increments the variable i, resets frame_num and POC, and returns control to the decision block 510.
The function block 546 sets motion_skip_flag equal to zero, and passes control to the function block 530.
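The mode decision of blocks 516 through 528 of method 500 can be summarized by the following C sketch. The rate-distortion helpers are placeholders for whatever cost metric the encoder uses, per the "coding cost or other metric" discussion above.

    typedef struct { int motion_skip_flag; int mb_ic_flag; int dpcm_of_dvic; } ModeSyntax;

    /* Placeholder cost functions (block 516 and blocks 518/520). */
    extern double cost_of_best_other_mode(void);
    extern double cost_of_motion_skip(int use_ic, int *dpcm_of_dvic_out);

    ModeSyntax choose_macroblock_mode(void)
    {
        ModeSyntax s = { 0, 0, 0 };
        int dpcm = 0;

        double best     = cost_of_best_other_mode();       /* block 516 */
        double ms_no_ic = cost_of_motion_skip(0, &dpcm);   /* block 518 */
        double ms_ic    = cost_of_motion_skip(1, &dpcm);   /* block 520 */

        if (ms_no_ic < best) {                             /* blocks 522-526 */
            best = ms_no_ic;
            s.motion_skip_flag = 1;
        }
        if (ms_ic < best) {                                /* block 528 */
            s.motion_skip_flag = 1;
            s.mb_ic_flag = 1;
            s.dpcm_of_dvic = dpcm;
        }
        return s;   /* motion_skip_flag remains 0 otherwise (block 546) */
    }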
Turning to FIG. 6, which refers generally to both FIGS. 6A and 6B, an exemplary method for decoding macroblocks is indicated generally by the reference numeral 600.
The method 600 includes a start block 602 that passes control to a function block 604. The function block 604 parses view_id from the SPS, PPS, VPS, slice header and/or NAL unit header, and passes control to a function block 606. The function block 606 parses other SPS parameters, and passes control to a decision block 608. The decision block 608 determines whether or not the current picture needs decoding. If so, then control is passed to a decision block 610. Otherwise, control is passed to a function block 640.
The decision block 610 determines whether or not the current value of the picture order count (POC) is equal to a previous value of the POC. If so, then control is passed to a function block 612. Otherwise, control is passed to a function block 614.
The function block 612 sets view_num equal to zero, and passes control to the function block 614.
The function block 614 indexes view_id information at a high level to determine view coding order, increments view_num, and passes control to a decision block 616. The decision block 616 determines whether or not the current picture is in the expected coding order. If so, then control is passed to a function block 618. Otherwise, control is passed to a function block 642.
The function block 618 parses the slice header, and passes control to a function block 622. The function block 622 parses motion_skip_flag, and passes control to a decision block 624. The decision block 624 determines whether or not motion_skip_flag is equal to one. If so, then control is passed to a function block 626. Otherwise, control is passed to a function block 620.
The function block 626 parses mb_ic_flag, and passes control to a decision block 628. The decision block 628 determines whether or not mb_ic_flag is equal to one. If so, then control is passed to a function block 630. Otherwise, control is passed to the function block 632.
The function block 630 parses dpcm_of_dvic, and passes control to the function block 632.
The function block 632 decodes the macroblock, and passes control to a decision block 634. The decision block 634 determines whether or not all macroblocks are done (have been decoded). If so, then control is passed to a function block 636. Otherwise, control is returned to the function block 622.
The function block 636 inserts the current picture in the decoded picture buffer (DPB), and passes control to a decision block 638. The decision block 638 determines whether or not all pictures have been decoded. If so, then control is passed to an end block 699. Otherwise, control is returned to the function block 618. The function block 640 gets the next picture, and returns control to the decision block 608.
The function block 642 conceals the picture, and passes control to the function block 640.
The function block 620 parses the macroblock mode, a motion vector (mv), and ref_idx, and passes control to the function block 632.
Turning to FIG. 7, which refers generally to both FIGS. 7A and 7B, another exemplary method for encoding macroblocks is indicated generally by the reference numeral 700.
The method 700 includes a start block 702 that passes control to a function block 704. The function block 704 reads the encoder configuration file, and passes control to a function block 706. The function block 706 sets anchor and non-anchor picture references in a Sequence Parameter Set (SPS) extension, and passes control to a function block 708. The function block 708 lets the number of views be equal to a variable N, sets a variable i and a variable j equal to zero, and passes control to a decision block 710. The decision block 710 determines whether or not the current value of the variable i is less than the current value of the variable N. If so, then control is passed to a decision block 712. Otherwise, control is passed to a decision block 738.
The decision block 712 determines whether or not the current value of the variable j is less than the number of pictures in view i. If so, then control is passed to a function block 714. Otherwise, control is passed to a function block 744.
The function block 714 starts encoding the current macroblock, and passes control to a function block 716. The function block 716 checks macroblock modes, and passes control to a function block 718. The function block 718 checks motion skip macroblock mode without illumination compensation, and passes control to a function block 720. The function block 720 checks motion skip mode with illumination compensation (to determine whether illumination compensation information is to be obtained from disparity/motion information from a neighboring view at a same temporal location or whether illumination compensation information is to be from previously decoded illumination compensation information of a spatial/temporal neighboring block), and passes control to a decision block 722. The decision block 722 determines whether or not motion skip is the best mode. If so, then control is passed to a function block 724. Otherwise, control is passed to a function block 746.
The function block 724 sets motion_skip_flag equal to one, and passes control to a decision block 726. The decision block 726 determines whether or not illumination compensation is enabled. If so, then control is passed to a function block 728. Otherwise, control is passed to a function block 730.
The function block 728 sets mb_ic_flag equal to one, and passes control to a function block 730.
The function block 730 encodes the macroblock, and passes control to a decision block 732. The decision block 732 determines whether or not all macroblocks have been encoded. If so, then control is passed to a function block 734. Otherwise, control is returned to the function block 714 to start encoding another MB.
The function block 734 increments the variable j, and passes control to a function block 736. The function block 736 increments frame_num and picture order count (POC), and returns control to the decision block 712.
The decision block 738 determines whether or not the SPS, PPS, and/or VPS (and/or any other syntax structure and/or syntax element that is used for the purposes of the present principles) are to be sent in-band. If so, then control is passed to a function block 740. Otherwise, control is passed to a function block 748.
The function block 740 sends the SPS, PPS, and/or VPS in-band, and passes control to a function block 742.
The function block 742 writes the bitstream to a file or streams the bitstream over a network, and passes control to an end block 799.
The function block 748 sends the SPS, PPS, and/or VPS out-of-band, and passes control to the function block 742.
The function block 744 increments the variable i, resets frame_num and POC, and returns control to the decision block 710.
The function block 746 sets motion_skip_flag equal to zero, and passes control to the function block 730.
Turning to FIG. 8, which refers generally to both FIGS. 8A and 8B, another exemplary method for decoding macroblocks is indicated generally by the reference numeral 800. The method 800 includes a start block 802 that passes control to a function block 804. The function block 804 parses view_id from the SPS, PPS, VPS, slice header and/or NAL unit header, and passes control to a function block 806. The function block 806 parses other SPS parameters, and passes control to a decision block 808. The decision block 808 determines whether or not the current picture needs decoding. If so, then control is passed to a decision block 810. Otherwise, control is passed to a function block 840.
The decision block 810 determines whether or not the current value of the picture order count (POC) is equal to a previous value of the POC. If so, then control is passed to a function block 812. Otherwise, control is passed to a function block 814.
The function block 812 sets view_num equal to zero, and passes control to the function block 814.
The function block 814 indexes view_id information at a high level to determine view coding order, increments view_num, and passes control to a decision block 816. The decision block 816 determines whether or not the current picture is in the expected coding order. If so, then control is passed to a function block 818. Otherwise, control is passed to a function block 842.
The function block 818 parses the slice header, and passes control to a function block 822. The function block 822 parses motion_skip_flag, and passes control to a decision block 824. The decision block 824 determines whether or not motion_skip_flag is equal to one. If so, then control is passed to a function block 826. Otherwise, control is passed to a function block 820.
The function block 826 parses mb_ic_flag, and passes control to a decision block 828. The decision block 828 determines whether or not mb_ic_flag is equal to one. If so, then control is passed to a function block 829. Otherwise, control is passed to the function block 832.
The function block 829 determines an illumination compensation value from the neighboring macroblock, and passes control to the function block 832.
The function block 832 decodes the macroblock, and passes control to a decision block 834. The decision block 834 determines whether or not all macroblocks are done (have been decoded). If so, then control is passed to a function block 836. Otherwise, control is returned to the function block 822. The function block 836 inserts the current picture in the decoded picture buffer (DPB), and passes control to a decision block 838. The decision block 838 determines whether or not all pictures have been decoded. If so, then control is passed to an end block 899. Otherwise, control is returned to the function block 818.
The function block 840 gets the next picture, and returns control to the decision block 808.
The function block 842 conceals the picture, and passes control to the function block 840.
The function block 820 parses the macroblock mode, a motion vector (mv), and ref_idx, and passes control to the function block 832.
Referring to FIG. 9, a video processing device 900 is shown. The video processing device 900 may be, for example, a set top box or other device that receives encoded video and provides, for example, decoded video for display to a user or for storage. Thus, the device 900 may provide its output to a television, computer monitor, or a computer or other processing device.
The device 900 includes a front-end device 905 and a decoder 910. The front-end device 905 may be, for example, a receiver adapted to receive a program signal having a plurality of bitstreams representing encoded pictures, and to select one or more bitstreams for decoding from the plurality of bitstreams. Typical receivers perform functions such as, for example, one or more of receiving a modulated and encoded data signal, demodulating the data signal, decoding one or more encodings (for example, channel coding and/or source coding) of the data signal, and/or error-correcting the data signal. The front-end device 905 may receive the program signal from, for example, an antenna (not shown). The front-end device 905 provides a received data signal to the decoder 910.
The decoder 910 receives a data signal 920. The data signal 920 may include, for example, one or more AVC, SVC, or MVC compatible streams. The decoder 910 decodes all or part of the received signal 920 and provides as output a decoded video signal 930. The decoded video 930 is provided to a selector 950. The device 900 also includes a user interface 960 that receives a user input 970. The user interface 960 provides a picture selection signal 980, based on the user input 970, to the selector 950. The picture selection signal 980 and the user input 970 indicate which of multiple pictures, sequences, scalable versions, views, or other selections of the available decoded data a user desires to have displayed. The selector 950 provides the selected picture(s) as an output 990. The selector 950 uses the picture selection information 980 to select which of the pictures in the decoded video 930 to provide as the output 990.
In various implementations, the selector 950 includes the user interface 960, and in other implementations no user interface 960 is needed because the selector 950 receives the user input 970 directly without a separate interface function being performed. The selector 950 may be implemented in software or as an integrated circuit, for example. In one implementation, the selector 950 is incorporated with the decoder 910, and in another implementation the decoder 910, the selector 950, and the user interface 960 are all integrated.
In one application, front-end 905 receives a broadcast of various television shows and selects one for processing. The selection of one show is based on user input of a desired channel to watch. Although the user input to front-end device 905 is not shown in FIG. 9, front-end device 905 receives the user input 970. The front-end 905 receives the broadcast and processes the desired show by demodulating the relevant part of the broadcast spectrum, and decoding any outer encoding of the demodulated show. The front-end 905 provides the decoded show to the decoder 910. The decoder 910 is an integrated unit that includes devices 960 and 950. The decoder 910 thus receives the user input, which is a user-supplied indication of a desired view to watch in the show. The decoder 910 decodes the selected view, as well as any required reference pictures from other views, and provides the decoded view 990 for display on a television (not shown).
Continuing the above application, the user may desire to switch the view that is displayed and may then provide a new input to the decoder 910. After receiving a "view change" from the user, the decoder 910 decodes both the old view and the new view, as well as any views that are in between the old view and the new view. That is, the decoder 910 decodes any views that are taken from cameras that are physically located in between the camera taking the old view and the camera taking the new view. The front-end device 905 also receives the information identifying the old view, the new view, and the views in between. Such information may be provided, for example, by a controller (not shown in FIG. 9) having information about the locations of the views, or by the decoder 910. Other implementations may use a front-end device that has a controller integrated with the front-end device. The decoder 910 provides all of these decoded views as output 990. A post-processor (not shown in FIG. 9) interpolates between the views to provide a smooth transition from the old view to the new view, and displays this transition to the user. After transitioning to the new view, the post-processor informs (through one or more communication links not shown) the decoder 910 and the front-end device 905 that only the new view is needed. Thereafter, the decoder 910 only provides as output 990 the new view.
Various implementations discussed in this application may provide one or more advantages. For example, by combining IC and/or CC with motion skip mode, coding efficiency may be increased and a bit savings may be realized.
We thus provide one or more embodiments having particular features and aspects. However, features and aspects of described implementations may also be adapted for other implementations. Implementations may signal information using a variety of techniques including, but not limited to, SPS syntax, other high level syntax, non-high-level syntax, out-of-band information, and implicit signaling. Accordingly, although implementations described herein may be described in a particular context, such descriptions should in no way be taken as limiting the features and concepts to such implementations or contexts.
Additionally, implementations may be implemented in either, or both, an encoder and a decoder.
The implementations described herein may be implemented in, for example, a method or process, an apparatus, or a software program. Even if only discussed in the context of a single form of implementation (for example, discussed only as a method), the implementation of features discussed may also be implemented in other forms (for example, an apparatus or program). An apparatus may be implemented in, for example, appropriate hardware, software, and firmware. The methods may be implemented in, for example, an apparatus such as, for example, a processor, which refers to processing devices in general, including, for example, a computer, a microprocessor, an integrated circuit, or a programmable logic device. Processing devices also include communication devices, such as, for example, computers, cell phones, portable/personal digital assistants ("PDAs"), and other devices that facilitate communication of information between end-users.
Implementations of the various processes and features described herein may be embodied in a variety of different equipment or applications, particularly, for example, equipment or applications associated with data encoding and decoding. Examples of equipment include video coders, video decoders, video codecs, web servers, set-top boxes, laptops, personal computers, cell phones, PDAs, and other communication devices. As should be clear, the equipment may be mobile and even installed in a mobile vehicle.
Additionally, the methods may be implemented by instructions being performed by a processor, and such instructions may be stored on a processor-readable medium such as, for example, an integrated circuit, a software carrier or other storage device such as, for example, a hard disk, a compact diskette, a random access memory ("RAM"), or a read-only memory ("ROM"). The instructions may form an application program tangibly embodied on a processor-readable medium. Instructions may be, for example, in hardware, firmware, software, or a combination. Instructions may be found in, for example, an operating system, a separate application, or a combination of the two. A processor may be characterized, therefore, as, for example, both a device configured to carry out a process and a device that includes a computer readable medium having instructions for carrying out a process.
As should be evident to one of skill in the art, implementations may also produce a signal formatted to carry information that may be, for example, stored or transmitted. The information may include, for example, instructions for performing a method, or data produced by one of the described implementations. Such a signal may be formatted, for example, as an electromagnetic wave (for example, using a radio frequency portion of spectrum) or as a baseband signal. The formatting may include, for example, encoding a data stream and modulating a carrier with the encoded data stream. The information that the signal carries may be, for example, analog or digital information. The signal may be transmitted over a variety of different wired or wireless links, as is known. For example, a signal may be formatted to carry as data the values for the syntax described in one or more implementations, or even the syntax instructions themselves if the syntax is being transmitted.
A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made. For example, elements of different implementations may be combined, supplemented, modified, or removed to produce other implementations. Additionally, one of ordinary skill will understand that other structures and processes may be substituted for those disclosed and the resulting implementations will perform at least substantially the same function(s), in at least substantially the same way(s), to achieve at least substantially the same result(s) as the implementations disclosed. Accordingly, these and other implementations are contemplated by this application and are within the scope of the following claims.


CLAIMS:
1. A method, comprising: accessing at least a portion of an encoded picture, the portion having been encoded using motion information determined for a different portion (624), and using one or more of illumination compensation or color compensation (630); and decoding the portion of the encoded picture (632).
2. The method of claim 1, further comprising accessing information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion (628, 630), and wherein the decoding comprises decoding the portion based on the accessed information (632).
3. The method of claim 1, further comprising accessing information describing the motion information for the different portion, and wherein decoding the portion is based on the accessed information.
4. The method of claim 2, wherein the accessed information comprises at least one of an indication of whether the one or more of the illumination compensation or the color compensation is enabled for use in encoding the portion (626), and an amount of the one or more of the illumination compensation or the color compensation used in encoding the portion (630).
5. The method of claim 2, wherein: accessing the portion comprises accessing the portion from a received stream, and accessing the information comprises accessing the information from the received stream (604).
6. The method of claim 5, wherein the information is comprised in at least one low level syntax element (626, 630).
7. The method of claim 2, wherein accessing the portion comprises accessing the portion from a received stream received at a receiver, and the method further comprises determining the information at the receiver (604).
8. The method of claim 7, wherein determining the information comprises determining the information based on one or more of inter-view information or temporal information (829).
9. The method of claim 8, wherein the portion corresponds to a particular view, and the inter-view information comprises disparity information relating to the portion and a corresponding portion in a neighboring view (829).
10. The method of claim 9, wherein the disparity information comprises a disparity vector (829).
11. The method of claim 8, wherein the portion corresponds to a particular view, and the temporal information comprises motion information relating to a corresponding portion in a neighboring view (829).
12. The method of claim 7, wherein the information is determined by at least one of median filtering or averaging at least one of neighboring illumination compensation offsets or color compensation offsets (829).
13. The method of claim 1, wherein the different portion is a different portion from one of the encoded picture or a different encoded picture (829).
14. The method of claim 1, wherein: the encoded picture is from a first view, the different portion is from a different encoded picture, the different encoded picture is from a second view, and the portion is related to the different portion by a regional disparity vector that provides a correspondence between limited locations in pictures from the first view and limited locations in pictures from the second view, the limited locations including some but not all locations (829).
15. An apparatus, comprising: a decoder (400) for accessing at least a portion of an encoded picture, and decoding the portion of the encoded picture, wherein the portion has been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation.
16. The apparatus of claim 15, wherein said decoder (400) accesses information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion, and wherein said decoder decodes the portion based on the accessed information.
17. The apparatus of claim 16, wherein the accessed information comprises at least one of an indication of whether the one or more of the illumination compensation or the color compensation is enabled for use in encoding the portion, and an amount of the one or more of the illumination compensation or the color compensation used in encoding the portion.
18. The apparatus of claim 16, wherein said decoder (400) accesses the portion and the information from a received stream.
19. The apparatus of claim 18, wherein the information is comprised in at least one low level syntax element.
20. The apparatus of claim 16, wherein said decoder (400) accesses the portion from a received stream received at a receiver, and determines the information at the receiver.
21. The apparatus of claim 20, wherein said decoder (400) determines the information based on one or more of inter-view information or temporal information.
22. The apparatus of claim 21, wherein the portion corresponds to a particular view, and the inter-view information comprises disparity information relating to the portion and a corresponding portion in a neighboring view.
23. The apparatus of claim 22, wherein the disparity information comprises a disparity vector.
24. The apparatus of claim 21, wherein the portion corresponds to a particular view, and the temporal information comprises motion information relating to a corresponding portion in a neighboring view.
25. The apparatus of claim 20, wherein the information is determined by at least one of median filtering or averaging at least one of neighboring illumination compensation offsets or color compensation offsets.
26. The apparatus of claim 15, wherein the different portion is a different portion from one of the encoded picture or a different encoded picture.
27. The apparatus of claim 15, wherein: the encoded picture is from a first view, the different portion is from a different encoded picture, the different encoded picture is from a second view, and the portion is related to the different portion by a regional disparity vector that provides a correspondence between limited locations in pictures from the first view and limited locations in pictures from the second view, the limited locations including some but not all locations.
28. A method, comprising: accessing at least a portion of a picture to be encoded (514); identifying motion information for a different portion of the picture (518); and encoding the portion of the picture using the identified motion information and using one or more of illumination compensation or color compensation (520).
29. The method of claim 28, further comprising generating information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion (520).
30. The method of claim 29, wherein the generated information comprises at least one of an indication of whether the one or more of the illumination compensation or the color compensation is enabled for use in encoding the portion, and an amount of the one or more of the illumination compensation or the color compensation used in encoding the portion (528).
31. The method of claim 29, wherein the portion and the information are encoded into a stream (542).
32. The method of claim 31, wherein the information is comprised in at least one low level syntax element (528).
33. The method of claim 29, wherein the portion is encoded into a stream, and the method further comprises implicitly signaling the information (728).
34. The method of claim 33, wherein the information is based on one or more of inter-view information or temporal information (720).
35. The method of claim 34, wherein the portion corresponds to a particular view, and the inter-view information comprises disparity information relating to the portion and a corresponding portion in a neighboring view (720).
36. The method of claim 35, wherein the disparity information comprises a disparity vector (720).
37. The method of claim 34, wherein the portion corresponds to a particular view, and the temporal information comprises motion information relating to a corresponding portion in a neighboring view (720).
38. The method of claim 28, wherein the different portion is a different portion from one of the picture or a different picture (720).
39. The method of claim 28, wherein: the picture is from a first view, the different portion is from a different picture, the different picture is from a second view, and the portion is related to the different portion by a regional disparity vector that provides a correspondence between limited locations in pictures from the first view and limited locations in pictures from the second view, the limited locations including some but not all locations (720).
40. An apparatus, comprising: an encoder (300) for accessing at least a portion of a picture to be encoded, identifying motion information for a different portion of the picture, and encoding the portion of the picture using the identified motion information and using one or more of illumination compensation or color compensation.
41. The apparatus of claim 40, wherein said encoder (300) generates information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion.
42. The apparatus of claim 41, wherein the generated information comprises at least one of an indication of whether the one or more of the illumination compensation or the color compensation is enabled for use in encoding the portion, and an amount of the one or more of the illumination compensation or the color compensation used in encoding the portion.
43. The apparatus of claim 41, wherein the portion and the information are encoded into a stream.
44. The apparatus of claim 43, wherein the information is comprised in at least one low level syntax element.
45. The apparatus of claim 41, wherein the portion is encoded into a stream, and said encoder implicitly signals the information.
46. The apparatus of claim 45, wherein the information is based on one or more of inter-view information or temporal information.
47. The apparatus of claim 46, wherein the portion corresponds to a particular view, and the inter-view information comprises disparity information relating to the portion and a corresponding portion in a neighboring view.
48. The apparatus of claim 47, wherein the disparity information comprises a disparity vector.
49. The apparatus of claim 46, wherein the portion corresponds to a particular view, and the temporal information comprises motion information relating to a corresponding portion in a neighboring view.
50. The apparatus of claim 40, wherein the different portion is a different portion from one of the picture or a different picture.
51. The apparatus of claim 40, wherein: the picture is from a first view, the different portion is from a different picture, the different picture is from a second view, and the portion is related to the different portion by a regional disparity vector that provides a correspondence between limited locations in pictures from the first view and limited locations in pictures from the second view, the limited locations including some but not all locations.
52. An apparatus, comprising: means for accessing at least a portion of an encoded picture, the portion having been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation; and means for decoding the portion of the encoded picture.
53. An apparatus, comprising: means for accessing at least a portion of a picture to be encoded; means for identifying motion information for a different portion of the picture; and means for encoding the portion of the picture using the identified motion information and using one or more of illumination compensation or color compensation.
54. A video signal structure for video encoding, comprising: at least a portion of a picture, the portion having been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation; and information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion.
55. A computer readable medium having stored thereon: at least a portion of a picture, the portion having been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation; and information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion.
56. A video signal, comprising: at least a portion of a picture, the portion having been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation; and information describing how one or more of the illumination compensation or the color compensation was used in encoding the portion.
57. A computer readable medium having computer readable program code embodied thereon, the computer readable program code comprising: program code for accessing at least a portion of a picture to be encoded; program code for identifying motion information for a different portion of the picture; and program code for encoding the portion of the picture using the identified motion information and using one or more of illumination compensation or color compensation.
58. A computer readable medium having computer readable program code embodied thereon, the computer readable program code comprising: program code for accessing at least a portion of an encoded picture, the portion having been encoded using motion information determined for a different portion, and using one or more of illumination compensation or color compensation; and program code for decoding the portion of the encoded picture.
59. An apparatus, comprising a processor configured to perform at least the following: accessing at least a portion of a picture to be encoded; identifying motion information for a different portion of the picture (518); and encoding the portion of the picture using the identified motion information and using one or more of illumination compensation or color compensation (520).
60. An apparatus, comprising a processor configured to perform at least the following: accessing at least a portion of an encoded picture, the portion having been encoded using motion information determined for a different portion (624), and using one or more of illumination compensation or color compensation (630); and decoding the portion of the encoded picture (632).
61. A method, comprising: receiving a program signal having a bitstream representing encoded pictures; demodulating the received bitstream; accessing at least a portion of one of the encoded pictures from the demodulated bitstream, the portion having been encoded using motion information determined for a different portion of the encoded picture (624), and using one or more of illumination compensation or color compensation (630); and decoding the portion (632).
62. An apparatus, comprising: a receiver (905) adapted to receive a program signal having a bitstream representing encoded pictures, and to demodulate the received bitstream; and a decoder (910) adapted: to access at least a portion of one of the encoded pictures from the demodulated bitstream, the portion having been encoded using motion information determined for a different portion of the encoded picture, and using one or more of illumination compensation or color compensation, and to decode the portion.
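
The short Python sketches below are editorial illustrations of the claimed techniques; every function name, interface, and numeric model in them is an assumption made for clarity, not syntax or reference software defined by this application. The first sketch shows the inter-view prediction of claims 9-10 and 22-23: a block in one view is predicted from the corresponding, disparity-shifted block of a neighboring view, and an additive DC offset models the illumination (or, applied per channel, color) compensation. Full-pel disparity and in-bounds coordinates are assumed.

    import numpy as np

    def disparity_compensated_prediction(ref_view_pic, x, y, h, w, dvx, dvy, ic_offset=0):
        # Corresponding block in the neighboring view, displaced by the
        # disparity vector (dvx, dvy); full-pel disparity is assumed.
        ref_block = ref_view_pic[y + dvy:y + dvy + h,
                                 x + dvx:x + dvx + w].astype(np.int32)
        # Illumination/color compensation as a simple additive offset,
        # clipped back to the 8-bit sample range.
        return np.clip(ref_block + ic_offset, 0, 255).astype(np.uint8)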
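
Claims 12 and 25 recite deriving the compensation offset by median filtering or averaging neighboring offsets rather than transmitting it. A minimal sketch of that rule, assuming the causal neighbors' offsets are available as a list:

    from statistics import median

    def derive_offset_implicitly(neighbor_offsets, rule="median"):
        # No IC-coded neighbor available: fall back to a zero offset.
        if not neighbor_offsets:
            return 0
        if rule == "median":
            return round(median(neighbor_offsets))
        return round(sum(neighbor_offsets) / len(neighbor_offsets))

Because the encoder and decoder apply the same rule to the same already-decoded neighbors, the offset never has to be written to the bitstream; this is the implicit signaling of claims 33 and 45.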
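
Claims 14, 27, 39 and 51 rely on a regional disparity vector: one vector per region establishes a correspondence for some, but not all, locations between the two views. A sketch under the assumption that regions are square and indexed by (row, col):

    def regional_correspondences(pic_w, pic_h, region_size, regional_dvs):
        # regional_dvs maps (row, col) -> (dvx, dvy) for the regions that
        # have an inter-view correspondence; regions with no entry have
        # none, which is the "some but not all locations" of the claims.
        corr = {}
        for (row, col), (dvx, dvy) in regional_dvs.items():
            x, y = col * region_size, row * region_size
            if 0 <= x < pic_w and 0 <= y < pic_h:
                corr[(x, y)] = (x + dvx, y + dvy)
        return corr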
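
Claims 17, 30 and 42 describe the explicitly signaled information as an enable indication plus an amount of compensation, carried in low level syntax (claims 19, 32, 44). In the sketch below a toy bit sink stands in for an entropy coder; a real codec would entropy-code these fields (e.g. CAVLC or CABAC), and both field layouts are invented for the example:

    class BitWriter:
        """Toy raw-bit sink; a stand-in for real entropy coding."""
        def __init__(self):
            self.bits = []
        def put_flag(self, b):
            self.bits.append(1 if b else 0)
        def put_signed(self, v, nbits=8):
            # Two's-complement raw code, valid for v in [-128, 127].
            self.bits.extend((v >> i) & 1 for i in reversed(range(nbits)))

    def write_ic_syntax(bw, ic_enabled, ic_offset=0):
        bw.put_flag(ic_enabled)       # indication: IC/CC enabled for this portion?
        if ic_enabled:
            bw.put_signed(ic_offset)  # amount of compensation applied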
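
Claims 28 and 40 encode a portion using motion information identified for a different portion. The sketch below borrows a donor motion vector, derives an offset by matching the mean levels of the block and its prediction (one plausible rule among many), and produces the residual that would actually be coded; it assumes 8-bit numpy pictures and in-bounds, full-pel motion.

    import numpy as np

    def encode_with_borrowed_motion(cur_block, ref_pic, x, y, donor_mv):
        mvx, mvy = donor_mv  # motion identified for a *different* portion
        h, w = cur_block.shape
        pred = ref_pic[y + mvy:y + mvy + h,
                       x + mvx:x + mvx + w].astype(np.int32)
        # One plausible offset rule: match the mean levels of block and prediction.
        offset = int(round(float(cur_block.mean() - pred.mean())))
        compensated = np.clip(pred + offset, 0, 255)
        residual = cur_block.astype(np.int32) - compensated
        return residual, offset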
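
Finally, claims 61 and 62 place the decoding behind a receive-and-demodulate step. The control flow is sketched below with 'receiver' and 'decoder' as hypothetical objects whose method names are assumptions, not a real API:

    def receive_program_and_decode(receiver, decoder):
        modulated = receiver.receive()              # program signal carrying the bitstream
        bitstream = receiver.demodulate(modulated)
        portion = decoder.access_portion(bitstream)
        return decoder.decode_portion(portion)      # decoded picture portion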
PCT/US2008/005216 2007-04-24 2008-04-23 Video coding WO2008130716A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US91372007P 2007-04-24 2007-04-24
US60/913,720 2007-04-24

Publications (2)

Publication Number Publication Date
WO2008130716A2 true WO2008130716A2 (en) 2008-10-30
WO2008130716A3 WO2008130716A3 (en) 2008-12-11

Family

ID=39816871

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2008/005216 WO2008130716A2 (en) 2007-04-24 2008-04-23 Video coding

Country Status (1)

Country Link
WO (1) WO2008130716A2 (en)

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
CHEN, JAE HOON KIM, JOAQUÍN LÓPEZ, ANTONIO ORTEGA: "Illumination compensation for inter view Multi View Video Compression" VIDEO STANDARDS AND DRAFTS, XX, XX, no. M11132, 15 July 2004 (2004-07-15), pages 1-5, XP030039911 Redmond, USA *
KWANGHOON SOHN ET AL: "Preliminary results on CE2 for multi-view video coding" VIDEO STANDARDS AND DRAFTS, XX, XX, no. M13194, 30 March 2006 (2006-03-30), pages 1-6, XP030041863 Montreux, Switzerland *
LIM J ET AL: "A multiview sequence CODEC with view scalability" SIGNAL PROCESSING. IMAGE COMMUNICATION, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 19, no. 3, 1 March 2004 (2004-03-01), pages 239-256, XP004489364 ISSN: 0923-5965 *
YUNG-LYUL LEE ET AL: "Result of CE2 on Multi-view Video Coding" VIDEO STANDARDS AND DRAFTS, XX, XX, no. M13143, 29 March 2006 (2006-03-29), pages 1-12, XP030041812 Montreux, Switzerland *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150023422A1 (en) * 2013-07-16 2015-01-22 Qualcomm Incorporated Processing illumination compensation for video coding
CN105379288A (en) * 2013-07-16 2016-03-02 高通股份有限公司 Processing illumination compensation for video coding
US9860529B2 (en) * 2013-07-16 2018-01-02 Qualcomm Incorporated Processing illumination compensation for video coding
CN105379288B (en) * 2013-07-16 2019-05-21 高通股份有限公司 Handle the illumination compensation to video coding

Also Published As

Publication number Publication date
WO2008130716A3 (en) 2008-12-11

Similar Documents

Publication Publication Date Title
US8532410B2 (en) Multi-view video coding with disparity estimation based on depth information
US20190379904A1 (en) Inter-view prediction
KR101663819B1 (en) Refined depth map
US20100284466A1 (en) Video and depth coding
US20100118942A1 (en) Methods and apparatus at an encoder and decoder for supporting single loop decoding of multi-view coded video
US20110038418A1 (en) Code of depth signal
US20100027881A1 (en) Local illumination and color compensation without explicit signaling
KR20170023086A (en) Methods and systems for intra block copy coding with block vector derivation
CN114600466A (en) Image encoding apparatus and method based on cross component filtering
WO2010021664A1 (en) Depth coding
US20220377370A1 (en) Inter prediction in video or image coding system
WO2008130716A2 (en) Video coding
WO2022146215A1 (en) Temporal filter

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08799893

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08799893

Country of ref document: EP

Kind code of ref document: A2