WO2008024345A1 - Adaptive region-based flipping video coding - Google Patents

Adaptive region-based flipping video coding

Info

Publication number
WO2008024345A1
Authority
WO
WIPO (PCT)
Prior art keywords
image regions
picture
flipping
input
level
Application number
PCT/US2007/018483
Other languages
French (fr)
Inventor
Peng Yin
Oscar Divorra Escoda
Yeping Su
Original Assignee
Thomson Licensing
Application filed by Thomson Licensing filed Critical Thomson Licensing
Publication of WO2008024345A1 publication Critical patent/WO2008024345A1/en

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: … using adaptive coding
    • H04N 19/102: … characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/103: Selection of coding mode or of prediction mode
    • H04N 19/105: Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
    • H04N 19/134: … characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/146: Data rate or code amount at the encoder output
    • H04N 19/147: … according to rate distortion criteria
    • H04N 19/169: … characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: … the unit being an image region, e.g. an object
    • H04N 19/176: … the region being a block, e.g. a macroblock
    • H04N 19/189: … characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding
    • H04N 19/196: … being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters

Abstract

There are provided methods and apparatus for video encoding and decoding using region-based flipping. An encoding apparatus includes a video encoder (400) for encoding image regions of a picture into a resultant bitstream, wherein at least one of the image regions is flipped (402) before encoding. A decoding apparatus includes a video decoder (500) for decoding image regions from a bitstream, wherein at least one of the image regions is flipped (577) after decoding.

Description

METHODS AND APPARATUS FOR VIDEO ENCODING AND DECODING USING ADAPTIVE REGION-BASED FLIPPING
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application Serial No. 60/823,400, filed August 24, 2006, which is incorporated by reference herein in its entirety.
TECHNICAL FIELD
The present principles relate generally to video encoding and decoding and, more particularly, to methods and apparatus for video encoding and decoding using adaptive region-based flipping.
BACKGROUND
Video flipping is a term of art used to describe longitudinal and/or latitudinal pixel position inversion in the spatial domain of a picture. Video flipping is also sometimes referred to as pixel mirroring or inversion. Flipping includes vertical flipping (that is, mirroring or inverting pixel positions from the top to the bottom, and conversely from the bottom to the top, of an image) prior to coding or decoding, horizontal flipping (that is, mirroring or inverting pixel positions from the left to the right, and conversely from the right to the left) prior to coding or decoding, or both.
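As a point of reference for the discussion that follows, the flipping operations themselves are simple index reversals. The following minimal sketch in Python (using NumPy, which is an assumption of this illustration rather than anything prescribed by the application) shows horizontal, vertical, and combined flips applied to a rectangular image region:

    import numpy as np

    def flip_region(region: np.ndarray, direction: str) -> np.ndarray:
        """Mirror pixel positions of a region in the spatial domain.

        direction: 'none', 'h' (left<->right), 'v' (top<->bottom), or 'hv' (both).
        """
        if direction == 'h':
            return region[:, ::-1]      # horizontal flip: reverse columns
        if direction == 'v':
            return region[::-1, :]      # vertical flip: reverse rows
        if direction == 'hv':
            return region[::-1, ::-1]   # both: reverse rows and columns
        return region                   # 'none': leave the region untouched

    # Flipping is an involution: applying the same flip twice restores the
    # region, which is why a decoder can exactly undo an encoder-side flip.
    region = np.arange(16).reshape(4, 4)
    assert np.array_equal(flip_region(flip_region(region, 'hv'), 'hv'), region)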
It is noted that, prior to coding a video image, flipping (whether horizontal, vertical, or both) of an input picture can improve coding efficiency in both intra and inter coding contexts. Flipping an entire image or picture can have a significant impact on coding efficiency, as proposed in a first prior art reference.
In the first prior art reference, it is proposed to improve the coding efficiency of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 recommendation (hereinafter the "MPEG-4 AVC standard") by utilizing the spatial direction dependency of images in intra/inter coding. In accordance with the first prior art approach, the encoder encodes four directional patterns of every picture, which are normal, horizontally flipped, vertically flipped, and horizontally and vertically flipped, and selects the best direction by employing a rate-distortion optimization method. The encoder adds a two-bit "flip flag" to the stream so that the decoder can find the direction and decode the picture in its right position according to the original sequence. For intra coding, the first prior art approach works because of the asymmetrical characteristics of the intra prediction in the MPEG-4 AVC standard. Turning to FIG. 1A, an example of intra 4x4 prediction is indicated generally by the reference numeral 100. Turning to FIG. 1B, an example of prediction directions for the example of intra 4x4 prediction of FIG. 1A is indicated generally by the reference numeral 150. Relating to the example of intra 4x4 prediction 100, FIG. 1A shows the corresponding blocks 110, the scanning order of 4x4 blocks in a macroblock 112, the pixels used for prediction 114, and the pixels not used for prediction 116. Thus, for example, in the intra 4x4 prediction mode shown in FIGs. 1A and 1B, the accuracy of the prediction for each direction differs because of the scanning/encoding order of the blocks. In prediction modes 0, 1, 4, 5 and 6, the pixels in the target block can be predicted from the nearest boundary pixels. However, in the other modes, some of the nearest boundary pixels are not yet coded and hence not available. Thus, in prediction modes 3, 7 and 8, the accuracy of the prediction tends to be lower than in the other modes. By flipping the whole picture, more macroblocks can be enabled to favor modes 0, 1, 4, 5 and 6 over modes 3, 7 and 8. For inter pictures, when the input pictures are coded after being flipped horizontally and/or vertically, the conditions for motion vector prediction, context modeling and so forth change. Thus, by flipping both the input picture and the reference pictures adaptively and selecting the best direction, the encoder can improve the coding efficiency.
However, as noted above, for some prediction modes, flipping the entire picture does not result in the same accuracy of the prediction as compared to other modes. Turning to FIG. 2, a video encoder capable of performing video encoding in accordance with the MPEG-4 AVC standard is indicated generally by the reference numeral 200. The video encoder 200 includes a frame ordering buffer 210 having an output in signal communication with a non-inverting input of a combiner 285. An output of the combiner 285 is connected in signal communication with a first input of a transformer and quantizer 225. An output of the transformer and quantizer 225 is connected in signal communication with a first input of an entropy coder 245 and a first input of an inverse transformer and inverse quantizer 250. An output of the entropy coder 245 is connected in signal communication with a first non-inverting input of a combiner 290. An output of the combiner 290 is connected in signal communication with a first input of an output buffer 235. A first output of an encoder controller 205 is connected in signal communication with a second input of the frame ordering buffer 210, a second input of the inverse transformer and inverse quantizer 250, an input of a picture-type decision module 215, an input of a macroblock-type (MB-type) decision module 220, a second input of an intra prediction module 260, a second input of a deblocking filter 265, a first input of a motion compensator 270, a first input of a motion estimator 275, and a second input of a reference picture buffer 280.
A second output of the encoder controller 205 is connected in signal communication with a first input of a Supplemental Enhancement Information (SEI) inserter 230, a second input of the transformer and quantizer 225, a second input of the entropy coder 245, a second input of the output buffer 235, and an input of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 240.
A first output of the picture-type decision module 215 is connected in signal communication with a third input of a frame ordering buffer 210. A second output of the picture-type decision module 215 is connected in signal communication with a second input of a macroblock-type decision module 220.
An output of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 240 is connected in signal communication with a third non-inverting input of the combiner 290.
An output of the inverse quantizer and inverse transformer 250 is connected in signal communication with a first non-inverting input of a combiner 225. An output of the combiner 225 is connected in signal communication with a first input of the intra prediction module 260 and a first input of the deblocking filter 265. An output of the deblocking filter 265 is connected in signal communication with a first input of a reference picture buffer 280. An output of the reference picture buffer 280 is connected in signal communication with a second input of the motion estimator 275. A first output of the motion estimator 275 is connected in signal communication with a second input of the motion compensator 270. A second output of the motion estimator 275 is connected in signal communication with a third input of the entropy coder 245.
An output of the motion compensator 270 is connected in signal communication with a first input of a switch 297. An output of the intra prediction module 260 is connected in signal communication with a second input of the switch 297. An output of the macroblock-type decision module 220 is connected in signal communication with a third input of the switch 297. The third input of the switch 297 determines whether or not the "data" input of the switch (as compared to the control input, i.e., the third input) is to be provided by the motion compensator 270 or the intra prediction module 260. The output of the switch 297 is connected in signal communication with a second non-inverting input of the combiner 285 and with an inverting input of the combiner 285.
Inputs of the frame ordering buffer 210 and the encoder controller 205 are available as inputs of the encoder 200, for receiving an input picture 201. Moreover, an input of the Supplemental Enhancement Information (SEI) inserter 230 is available as an input of the encoder 200, for receiving metadata. An output of the output buffer 235 is available as an output of the encoder 200, for outputting a bitstream.
Turning to FIG. 3, a video decoder capable of performing video decoding in accordance with the MPEG-4 AVC standard is indicated generally by the reference numeral 300.
The video decoder 300 includes an input buffer 310 having an output connected in signal communication with a first input of the entropy decoder 345. A first output of the entropy decoder 345 is connected in signal communication with a first input of an inverse transformer and inverse quantizer 350. An output of the inverse transformer and inverse quantizer 350 is connected in signal communication with a second non-inverting input of a combiner 325. An output of the combiner 325 is connected in signal communication with a second input of a deblocking filter 365 and a first input of an intra prediction module 360. A second output of the deblocking filter 365 is connected in signal communication with a first input of a reference picture buffer 380. An output of the reference picture buffer 380 is connected in signal communication with a second input of a motion compensator 370. A second output of the entropy decoder 345 is connected in signal communication with a third input of the motion compensator 370 and a first input of the deblocking filter 365. A third output of the entropy decoder 345 is connected in signal communication with an input of a decoder controller 305. A first output of the decoder controller 305 is connected in signal communication with a second input of the entropy decoder 345. A second output of the decoder controller 305 is connected in signal communication with a second input of the inverse transformer and inverse quantizer 350. A third output of the decoder controller 305 is connected in signal communication with a third input of the deblocking filter 365. A fourth output of the decoder controller 305 is connected in signal communication with a second input of the intra prediction module 360, with a first input of the motion compensator 370, and with a second input of the reference picture buffer 380.
An output of the motion compensator 370 is connected in signal communication with a first input of a switch 397. An output of the intra prediction module 360 is connected in signal communication with a second input of the switch 397. An output of the switch 397 is connected in signal communication with a first non-inverting input of the combiner 325.
An input of the input buffer 310 is available as an input of the decoder 300, for receiving an input bitstream. A first output of the deblocking filter 365 is available as an output of the decoder 300, for outputting an output picture.
SUMMARY
These and other drawbacks and disadvantages of the prior art are addressed by the present principles, which are directed to methods and apparatus for video encoding and decoding using adaptive region-based flipping. According to an aspect of the present principles, there is provided an apparatus. The apparatus includes a video encoder for encoding image regions of a picture into a resultant bitstream, wherein at least one of the image regions is adaptively flipped before encoding. According to yet another aspect of the present principles, there is provided a method. The method includes encoding image regions of a picture into a resultant bitstream, wherein at least one of the image regions is adaptively flipped before encoding. According to still another aspect of the present principles, there is provided an apparatus. The apparatus includes a video decoder for decoding image regions of a picture from a bitstream, wherein at least one of the image regions is adaptively flipped after decoding.
According to a further aspect of the present principles, there is provided a method. The method includes decoding image regions of a picture from a bitstream, wherein at least one of the image regions is adaptively flipped after decoding.
These and other aspects, features and advantages of the present principles will become apparent from the following detailed description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
The present principles may be better understood in accordance with the following exemplary figures, in which:
FIG. 1A shows a diagram for an example of intra 4x4 prediction;
FIG. 1B shows a diagram for prediction directions for the example of intra 4x4 prediction of FIG. 1A;
FIG. 2 shows a block diagram for a video encoder capable of performing video encoding in accordance with the MPEG-4 AVC Standard;
FIG. 3 shows a block diagram for a video decoder capable of performing video decoding in accordance with the MPEG-4 AVC Standard;
FIG. 4 shows a block diagram for a video encoder capable of performing video encoding in accordance with the MPEG-4 AVC Standard, modified and/or extended for use with the present principles, according to an embodiment of the present principles;
FIG. 5 shows a block diagram for a video decoder capable of performing video decoding in accordance with the MPEG-4 AVC Standard, modified and/or extended for use with the present principles, according to an embodiment of the present principles;
FIG. 6 shows a diagram of the well-known Foreman image, to which the present principles may be applied, in accordance with an embodiment of the present principles;
FIG. 7A shows a diagram for an example of intra 4x4 prediction with an alternate scanning order than that of the prior art, in accordance with an embodiment of the present principles;
FIG. 7B shows a diagram for prediction directions for the example of intra 4x4 prediction of FIG. 7A, in accordance with an embodiment of the present principles;
FIG. 8 is a block diagram for an exemplary video encoder supporting virtual reference pictures to which the present principles may be applied, in accordance with an embodiment of the present principles;
FIG. 9 is a block diagram for an exemplary video decoder supporting virtual reference pictures to which the present principles may be applied, in accordance with an embodiment of the present principles;
FIG. 10 is a flow diagram for an exemplary method for encoding video content using Virtual Reference Picture (VRP) management in a Decoded Picture Buffer (DPB), in accordance with an embodiment of the present principles;
FIG. 11 is a flow diagram for an exemplary method for decoding video content using Virtual Reference Picture (VRP) management in a Decoded Picture Buffer (DPB), in accordance with an embodiment of the present principles;
FIG. 12 is a flow diagram for an exemplary method for encoding video content using Virtual Reference Picture (VRP) management in local memory, in accordance with an embodiment of the present principles;
FIG. 13 is a flow diagram for an exemplary method for decoding video content using Virtual Reference Picture (VRP) management in a Decoded Picture Buffer (DPB), in accordance with an embodiment of the present principles;
FIGs. 14A and 14B are diagrams of the well-known Foreman image split vertically and horizontally, respectively, into two equal regions A and B, in accordance with respective embodiments of the present principles;
FIG. 15 is a diagram for an exemplary method for managing virtual reference pictures for image region flipping, in accordance with an embodiment of the present principles;
FIG. 16 is a diagram for an exemplary method for managing virtual reference pictures for image region flipping, in accordance with an embodiment of the present principles;
FIG. 17 is a diagram for an exemplary method for encoding video data using picture region flipping, in accordance with an embodiment of the present principles; and
FIG. 18 is a diagram for an exemplary method for decoding video data using picture region flipping, in accordance with an embodiment of the present principles.
DETAILED DESCRIPTION
The present principles are directed to methods and apparatus for video encoding and decoding using adaptive region-based flipping.
The present description illustrates the present principles. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the present principles and are included within its spirit and scope.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the present principles and the concepts contributed by the inventor(s) to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions.
Moreover, all statements herein reciting principles, aspects, and embodiments of the present principles, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
Thus, for example, it will be appreciated by those skilled in the art that the block diagrams presented herein represent conceptual views of illustrative circuitry embodying the present principles. Similarly, it will be appreciated that any flow charts, flow diagrams, state transition diagrams, pseudocode, and the like represent various processes which may be substantially represented in computer readable media and so executed by a computer or processor, whether or not such computer or processor is explicitly shown.
The functions of the various elements shown in the figures may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term "processor" or "controller" should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor ("DSP") hardware, read-only memory ("ROM") for storing software, random access memory ("RAM"), and non-volatile storage.
Other hardware, conventional and/or custom, may also be included. Similarly, any switches shown in the figures are conceptual only. Their function may be carried out through the operation of program logic, through dedicated logic, through the interaction of program control and dedicated logic, or even manually, the particular technique being selectable by the implementer as more specifically understood from the context.
In the claims hereof, any element expressed as a means for performing a specified function is intended to encompass any way of performing that function including, for example, a) a combination of circuit elements that performs that function or b) software in any form, including, therefore, firmware, microcode or the like, combined with appropriate circuitry for executing that software to perform the function. The present principles as defined by such claims reside in the fact that the functionalities provided by the various recited means are combined and brought together in the manner which the claims call for. It is thus regarded that any means that can provide those functionalities are equivalent to those shown herein.
Reference in the specification to "one embodiment" or "an embodiment" of the present principles means that a particular feature, structure, characteristic, and so forth described in connection with the embodiment is included in at least one embodiment of the present principles. Thus, the appearances of the phrase "in one embodiment" or "in an embodiment" appearing in various places throughout the specification are not necessarily all referring to the same embodiment. It is to be appreciated that the use of the term "and/or", for example, in the case of "A and/or B", is intended to encompass the selection of the first listed option (A), the selection of the second listed option (B), or the selection of both options (A and B). As a further example, in the case of "A, B, and/or C", such phrasing is intended to encompass the selection of the first listed option (A), the selection of the second listed option (B), the selection of the third listed option (C), the selection of the first and the second listed options (A and B), the selection of the first and third listed options (A and C), the selection of the second and third listed options (B and C), or the selection of all three options (A and B and C). This may be extended, as readily apparent to one of ordinary skill in this and related arts, for as many items as are listed.
As used herein, "high level syntax" refers to syntax present in the bitstream that resides hierarchically above the macroblock layer. For example, high level syntax, as used herein, may refer to, but is not limited to, syntax at the slice header level, Supplemental Enhancement Information (SEI) level, picture parameter set level, sequence parameter set level and NAL unit header level.
Turning to FIG. 4, a video encoder capable of performing video encoding in accordance with the MPEG-4 AVC standard, modified and/or extended for use with the present principles, is indicated generally by the reference numeral 400. The video encoder 400 includes a frame ordering buffer 410 having an output in signal communication with a non-inverting input of a combiner 485. An output of the combiner 485 is connected in signal communication with a first input of a transformer and quantizer 425. An output of the transformer and quantizer 425 is connected in signal communication with a first input of an entropy coder 445 and a first input of an inverse transformer and inverse quantizer 450. An output of the entropy coder 445 is connected in signal communication with a first non-inverting input of a combiner 490. An output of the combiner 490 is connected in signal communication with a first input of an output buffer 435.
A first output of an encoder controller 405 is connected in signal communication with a second input of the frame ordering buffer 410, a second input of the inverse transformer and inverse quantizer 450, an input of a picture-type decision module 415, an input of a macroblock-type (MB-type) decision module 420, a second input of an intra prediction module 460, a second input of a deblocking filter 465, a first input of a motion compensator 470, a first input of a motion estimator 475, a second input of a reference picture buffer 480, a second input of a flip picture region module 477, and a second input of a flip picture region module 402.
A second output of the encoder controller 405 is connected in signal communication with a first input of a Supplemental Enhancement Information (SEI) inserter 430, a second input of the transformer and quantizer 425, a second input of the entropy coder 445, a second input of the output buffer 435, an input of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 440, and a second input of a flip picture region module 478. A first output of the picture-type decision module 415 is connected in signal communication with a third input of a frame ordering buffer 410. A second output of the picture-type decision module 415 is connected in signal communication with a second input of a macroblock-type decision module 420.
An output of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 440 is connected in signal communication with a third non-inverting input of the combiner 490.
An output of the inverse quantizer and inverse transformer 450 is connected in signal communication with a first non-inverting input of a combiner 425. An output of the combiner 425 is connected in signal communication with a first input of the intra prediction module 460 and a first input of the deblocking filter 465. An output of the deblocking filter 465 is connected in signal communication with a first input of a flip picture region module 478. An output of the flip picture region module 478 is connected in signal communication with a first input of a reference picture buffer 480. An output of the reference picture buffer 480 is connected in signal communication with a first input of the flip picture region module 477. An output of the flip picture region module 477 is connected in signal communication with a second input of the motion estimator 475. A first output of the motion estimator 475 is connected in signal communication with a second input of the motion compensator 470. A second output of the motion estimator 475 is connected in signal communication with a third input of the entropy coder 445.
An output of the motion compensator 470 is connected in signal communication with a first input of a switch 497. An output of the intra prediction module 460 is connected in signal communication with a second input of the switch 497. An output of the macroblock-type decision module 420 is connected in signal communication with a third input of the switch 497. The output of the switch 497 is further connected in signal communication with a second non-inverting input of the combiner 485. Inputs of the encoder controller 405 and the flip picture region module 402 are available as inputs of the encoder 400, for receiving an input picture. An output of the flip picture region module 402 is connected in signal communication with an input of the frame ordering buffer 410. Moreover, an input of the Supplemental Enhancement Information (SEI) inserter 430 is available as an input of the encoder 400, for receiving metadata. An output of the output buffer 435 is available as an output of the encoder 400, for outputting a bitstream.
Turning to FIG. 5, a video decoder capable of performing video decoding in accordance with the MPEG-4 AVC standard, modified and/or extended for use with the present principles, is indicated generally by the reference numeral 500. The video decoder 500 includes an input buffer 510 having an output connected in signal communication with an input of the entropy decoder 545. A first output of the entropy decoder 545 is connected in signal communication with a first input of an inverse transformer and inverse quantizer 550. An output of the inverse transformer and inverse quantizer 550 is connected in signal communication with a second non-inverting input of a combiner 525. An output of the combiner 525 is connected in signal communication with a second input of a deblocking filter 565 and a first input of an intra prediction module 560. A second output of the deblocking filter 565 is connected in signal communication with a first input of a flip picture region module 577. An output of the flip picture region module 577 is connected in signal communication with a first input of a reference picture buffer 580. An output of the reference picture buffer 580 is connected in signal communication with a first input of a flip picture region module 567. An output of the flip picture region module 567 is connected in signal communication with a second input of a motion compensator 570. A second output of the entropy decoder 545 is connected in signal communication with a third input of the motion compensator 570 and a first input of the deblocking filter 565. A third output of the entropy decoder 545 is connected in signal communication with an input of a decoder controller 505. A first output of the decoder controller 505 is connected in signal communication with a second input of the inverse transformer and inverse quantizer 550. A second output of the decoder controller 505 is connected in signal communication with a third input of the deblocking filter 565. A third output of the decoder controller 505 is connected in signal communication with a second input of the intra prediction module 560, with a first input of the motion compensator 570, with a second input of the reference picture buffer 580, with a second input of the flip picture region module 567, and with a second input of the flip picture region module 577.
An output of the motion compensator 570 is connected in signal communication with a first input of a switch 597. An output of the intra prediction module 560 is connected in signal communication with a second input of the switch 597. An output of the switch 597 is connected in signal communication with a first non-inverting input of the combiner 525.
An input of the input buffer 510 is available as an input of the decoder 500, for receiving an input bitstream. A second output of the flip picture region module 577 is available as an output of the decoder 500, for outputting an output picture.
As noted above, the present principles are directed to methods and apparatus for video encoding and decoding using adaptive region-based flipping. It has been noted that advantages may be realized when "regions" (or "portions" or "blocks", hereinafter interchangeably referred to as "regions") of a picture are flipped, instead of the entire picture (as per the prior art). Such advantages obtained by flipping regions of a picture include, but are not limited to, greater coding and decoding efficiencies. Thus, in accordance with the present principles, we have developed methods and apparatus for video encoding and decoding that involve adaptively flipping video image regions prior to the encoding of each such image region. In an embodiment, the present principles are implemented as an extension of the MPEG-4 AVC Standard using slice groups and modified syntax. In another embodiment, the present principles perform the flipping function by implementing a change in the data scanning order.
We notice that, within a picture, different regions may have different characteristics which favor different prediction directions. Turning to FIG. 6, the well-known Foreman image is indicated generally by the reference numeral 600. For example, if we look at the Foreman image in FIG. 6, the left part of the image includes edges with an upright-downleft orientation and the right part of the image includes edges with an upleft-downright orientation. Thus, if we flip the whole picture adaptively, at least one part of the image will suffer a loss in coding efficiency. Therefore, in accordance with the present principles, we propose adaptively flipping each image region or frame block prior to coding. For example, when using MPEG-4 AVC intra prediction, for the left part of the image in FIG. 6, edges are upright-downleft, hence modes 3, 7, and 8 are most likely to be used. However, for the right part of the image in FIG. 6, since edges are upleft-downright, modes 4, 5, and 6 are most likely to be used. Ideally, since modes 4, 5, and 6 are more accurate than modes 3, 7, and 8, if we horizontally flip only the left part and keep the right part untouched, we will obtain better coding efficiency than by flipping the whole image. The present principles can be used for intra and/or inter coding. The way in which an image is partitioned into different regions is coded in some high level syntax, at the sequence, picture, slice, or macroblock level. For each image region, the encoder specifies in which direction the region should be flipped. The direction is coded in some high level syntax at the sequence, picture, slice and/or macroblock level, so the decoder can find the direction and decode the picture in the right position, as in the original sequence. The direction can be coded absolutely or differentially, spatially or temporally, using uniform coding, variable length coding, or arithmetic coding, with differential coding drawing on spatially and temporally available information. If one reconstructed image region is allowed to be used for prediction of another region, it must be flipped to the same direction as that other region.
One exemplary embodiment can be applied to the MPEG-4 AVC Standard given some adaptation of the MPEG-4 AVC Standard. We use a slice group to specify each image region. Thus, in each slice group, in addition to specifying which macroblocks the slice group covers, we also specify in which direction the slice group should be flipped. For inter pictures, another alternative for indicating the direction is to associate the direction with a reference index. The management of the references is described hereinafter. An example of how to indicate the flip direction in one exemplary embodiment is illustrated in TABLE 1, where the element flip_direction in the slice header is used to indicate the flip direction.
Table 1: slice header syntax
[Table 1, the slice header syntax table carrying the flip_direction element, is not reproduced in this text version.]
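As a hedged illustration of how such a syntax element could be carried (the exact layout of Table 1 is not reproduced above, and the value assignment below is an assumption of this sketch, echoing the two-bit flip flag of the prior art reference):

    FLIP_NONE, FLIP_H, FLIP_V, FLIP_HV = 0, 1, 2, 3   # assumed 2-bit code values

    def write_flip_direction(bits: list, direction: int) -> None:
        """Append the 2-bit flip_direction element to a slice header bit list."""
        bits.extend([(direction >> 1) & 1, direction & 1])   # MSB first

    def read_flip_direction(bits: list, pos: int) -> tuple:
        """Parse flip_direction back out of the slice header bits."""
        direction = (bits[pos] << 1) | bits[pos + 1]
        return direction, pos + 2

    # Round trip: the decoder recovers the direction the encoder signalled.
    bits = []
    write_flip_direction(bits, FLIP_HV)
    assert read_flip_direction(bits, 0)[0] == FLIP_HV

A fixed-length code is only one option; the variable length, arithmetic, or differential coding of the direction contemplated above would change only these read/write primitives.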
The MPEG-4 AVC Standard does not allow intra or inter prediction across slice boundaries. However, in our case, we can relax that condition. If one slice in group A is used to predict another slice in group B, group A must be flipped in the same direction as group B. In particular, we first flip group A back into the original direction and put group A back into its original position. Then we flip groups A and B together according to group B's flipping direction.
In order to apply the present principles, the encoder has to decide how to split the image into different regions and in which direction each region should be flipped. In an embodiment, one exemplary method for doing this is to use a tree-based approach, such as a binary tree or a quadtree. We can use a "top-down" approach and/or a "bottom-up" approach for tree partitioning. For the sake of illustration, one example of using a top-to-bottom binary tree approach to decide how to split an image and to decide the flipping direction is described below (a code sketch of the procedure follows Step 5). Of course, it is to be appreciated that given the teachings of the present principles provided herein, one of ordinary skill in this and related arts will contemplate these and other approaches to decide how to split an image and to decide the flipping direction, while maintaining the spirit of the present principles.
Step 1: Flip the whole image I in all directions. Record the distortion measure J(I) and the flip direction with the least distortion. One example of a distortion measure is shown in Equation (1):

J = SSD + λ_MODE · R,   λ_MODE = 0.85 × 2^((QP−12)/3)   (1)

where SSD is the sum of squared differences of the coded picture, λ_MODE is the Lagrangian parameter used for the mode selection, R is the number of bits of the coded picture, and QP is the quantization parameter.
Step 2: Split image I into two equal regions A and B, either vertically (FIG. 14A) or horizontally (FIG. 14B). FIGs. 14A and 14B are diagrams of the well-known Foreman image split vertically and horizontally, respectively, into two equal regions A and B.
Step 3: For each region, exhaustively test all flipping directions. Record the distortion and the flipping direction with the least distortion for each of regions A and B. Add the distortions of the two regions together: J(I_n) = J(A) + J(B), where n=1 means splitting vertically and n=2 means splitting horizontally.
Step 4: Compare J(I_n) for n=1 and n=2, select the splitting method with the lower distortion, and record it as J(I').
Step 5: Compare J(I) with J(I'). If J(I') < J(I), continue to split the image regions as above in raster scan order. Otherwise, stop splitting.
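The following is a minimal Python sketch of the top-down binary-tree procedure above, reusing the flip_region helper from the earlier sketch. The rd_cost callable (standing in for the Equation (1) cost of actually coding a region), the NumPy region representation, and the MIN_SIZE stopping threshold are assumptions of this illustration, not details fixed by the application:

    DIRECTIONS = ('none', 'h', 'v', 'hv')
    MIN_SIZE = 16   # assumed smallest splittable region; not prescribed

    def best_flip(region, rd_cost):
        """Steps 1 and 3: exhaustively test all flip directions, keep the cheapest."""
        costs = {d: rd_cost(flip_region(region, d)) for d in DIRECTIONS}
        direction = min(costs, key=costs.get)
        return direction, costs[direction]

    def split_and_flip(region, rd_cost):
        """Top-down binary-tree partitioning with per-region flips (Steps 1-5)."""
        direction, cost_whole = best_flip(region, rd_cost)           # Step 1: J(I)
        h, w = region.shape[:2]
        if min(h, w) < 2 * MIN_SIZE:                                  # too small to split
            return (direction, cost_whole)
        splits = [(region[:, : w // 2], region[:, w // 2:]),          # Step 2, n=1: vertical
                  (region[: h // 2, :], region[h // 2:, :])]          # Step 2, n=2: horizontal
        candidates = []
        for n, (a, b) in enumerate(splits, start=1):
            _, cost_a = best_flip(a, rd_cost)                         # Step 3:
            _, cost_b = best_flip(b, rd_cost)                         # J(I_n) = J(A) + J(B)
            candidates.append((cost_a + cost_b, n, a, b))
        cost_split, n, a, b = min(candidates)                         # Step 4: J(I')
        if cost_split < cost_whole:                                   # Step 5: J(I') < J(I)
            return [split_and_flip(a, rd_cost), split_and_flip(b, rd_cost)]
        return (direction, cost_whole)                                # stop splitting

In practice rd_cost would involve trial encodes, so a real encoder would typically cache results or prune the search rather than exhaustively test every direction of every node.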
To adapt to the directional characteristics of the local signal, instead of flipping the image region, we can change the data scanning order and, thus, the block coding order. Subsequently, the original prediction modes will be re-defined according to the current coding order. This is an alternative embodiment to the region-flipping based methods disclosed above. More specifically, if we want to flip the region horizontally, we can simply change the scanning order to run from the top right to the bottom left. If we want to flip the region vertically, we can change the scanning order to run from the bottom left to the top right. If we want to flip the region both horizontally and vertically, we can change the scanning order to run from the bottom right to the top left. When we change the scanning order, the prediction directions have to be changed accordingly.
Turning to FIG. 7A, an example of intra 4x4 prediction with an alternate scanning order to that of the prior art is indicated generally by the reference numeral 700. Turning to FIG. 7B, the prediction directions for the example of intra 4x4 prediction of FIG. 7A are indicated generally by the reference numeral 750. Relating to the example of intra 4x4 prediction 700, FIG. 7A shows the corresponding blocks 710, the scanning order of 4x4 blocks in a macroblock 712, and the pixels used for prediction 714.
As shown in FIGs. 7A and 7B, the new prediction directions are all vertically and horizontally flipped in accordance with the current scanning order, which starts from the bottom right and moves to the top left.
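A minimal sketch of this scan-order equivalence, again in Python: reversing the order in which blocks are visited reproduces the effect of flipping without moving any pixels. The 4x4 block size and the raster traversal are assumptions of this illustration:

    def block_scan_order(height, width, block=4, flip='none'):
        """Yield block coordinates in an order equivalent to coding a flipped region.

        flip: 'none', 'h', 'v', or 'hv', matching the flip directions above.
        """
        rows = list(range(0, height, block))
        cols = list(range(0, width, block))
        if flip in ('v', 'hv'):
            rows.reverse()              # start from the bottom: vertical flip
        if flip in ('h', 'hv'):
            cols.reverse()              # start from the right: horizontal flip
        for r in rows:
            for c in cols:
                yield r, c

    # 'hv' visits blocks from the bottom right toward the top left, as in FIG. 7A.
    print(list(block_scan_order(8, 8, flip='hv')))   # [(4, 4), (4, 0), (0, 4), (0, 0)]

As the text notes, changing the traversal also requires remapping the intra prediction directions, since the causal (already-coded) neighbors of each block change with the scan.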
A description will now be given regarding the management of reference pictures for use with one or more embodiments of the present principles described herein. Turning to FIG. 8, an exemplary video encoder supporting virtual reference pictures to which the present principles may be applied is indicated generally by the reference numeral 800.
The video encoder 800 includes a frame ordering buffer 810 having an output in signal communication with a non-inverting input of a combiner 885. An output of the combiner 885 is connected in signal communication with a first input of a transformer and quantizer 825. An output of the transformer and quantizer 825 is connected in signal communication with a first input of an entropy coder 845 and a first input of an inverse transformer and inverse quantizer 850. An output of the entropy coder 845 is connected in signal communication with a first non-inverting input of a combiner 890. An output of the combiner 890 is connected in signal communication with a first input of an output buffer 835.
A first output of an encoder controller 805 is connected in signal communication with a first input of the flip picture region module 802, a second input of the frame ordering buffer 810, a second input of the inverse transformer and inverse quantizer 850, an input of a picture-type decision module 815, an input of a macroblock-type (MB-type) decision module 820, a second input of an intra prediction module 860, a second input of a deblocking filter 865, a first input of a motion compensator 870, a first input of a motion estimator 875, and a second input of a reference picture buffer 880. A second output of the encoder controller 805 is connected in signal communication with a first input of a Supplemental Enhancement Information (SEI) inserter 830, a second input of the transformer and quantizer 825, a second input of the entropy coder 845, a second input of the output buffer 835, an input of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 840, a second input of a virtual reference picture buffer 889, and a first input of a flip picture region module 878.
A first output of the picture-type decision module 815 is connected in signal communication with a third input of a frame ordering buffer 810. A second output of the picture-type decision module 815 is connected in signal communication with a second input of a macroblock-type decision module 820.
An output of the Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) inserter 840 is connected in signal communication with a third non-inverting input of the combiner 890.
An output of the inverse quantizer and inverse transformer 850 is connected in signal communication with a first non-inverting input of a combiner 825. An output of the combiner 825 is connected in signal communication with a first input of the intra prediction module 860 and a first input of the deblocking filter 865. An output of the deblocking filter 865 is connected in signal communication with a second input of a flip picture region module 878. An output of the flip picture region module 878 is connected in signal communication with a first input of a reference picture buffer 880. An output of a switch 877 is connected in signal communication with a second input of the motion estimator 875 and a third input of the motion compensator 870. A first input of the switch 877 is connected in signal communication with a first output of the reference picture buffer 880. A second input of the switch 877 is connected in signal communication with an output of the virtual reference picture buffer 889. A second output of the reference picture buffer 880 is connected in signal communication with a first input of the virtual reference picture buffer 889. A first output of the motion estimator 875 is connected in signal communication with a second input of the motion compensator 870. A second output of the motion estimator 875 is connected in signal communication with a third input of the entropy coder 845.
An output of the motion compensator 870 is connected in signal communication with a first input of a switch 897. An output of the intra prediction module 860 is connected in signal communication with a second input of the switch 897. An output of the macroblock-type decision module 820 is connected in signal communication with a third input of the switch 897. The third input of the switch 897 determines whether or not the "data" input of the switch (as compared to the control input, i.e., the third input) is to be provided by the motion compensator 870 or the intra prediction module 860. The output of the switch 897 is further connected in signal communication with a second non-inverting input of the combiner 825. An input of the encoder controller 805 and a second input of a flip picture region module 802 are available as inputs of the encoder 800, for receiving an input picture 801. Moreover, an input of the Supplemental Enhancement Information (SEI) inserter 830 is available as an input of the encoder 800, for receiving metadata. An output of the output buffer 835 is available as an output of the encoder 800, for outputting a bitstream. An output of the flip picture region module 802 is connected in signal communication with a first input of the frame ordering buffer 810.
Turning to FIG. 9, an exemplary video decoder supporting virtual reference pictures to which the present principles may be applied is indicated generally by the reference numeral 900.
The video decoder 900 includes an input buffer 910 having an output connected in signal communication with a first input of the entropy decoder 945. A first output of the entropy decoder 945 is connected in signal communication with a first input of an inverse transformer and inverse quantizer 950. An output of the inverse transformer and inverse quantizer 950 is connected in signal communication with a second non-inverting input of a combiner 925. An output of the combiner 925 is connected in signal communication with a second input of a deblocking filter 965 and a first input of an intra prediction module 960. A second output of the deblocking filter 965 is connected in signal communication with a first input of a flip picture region module 978. A first output of the flip picture region module 978 is connected in signal communication with an input of the reference picture buffer 989. An output of a switch 977 is connected in signal communication with a second input of a motion compensator 970. A first input of the switch 977 is connected in signal communication with an output of a virtual reference picture buffer 980. A second input of the switch 977 is connected in signal communication with a first output of the reference picture buffer 989. A second output of the reference picture buffer 989 is connected in signal communication with a first input of the virtual reference picture buffer 980. A second output of the entropy decoder 945 is connected in signal communication with a third input of the motion compensator 970 and a first input of the deblocking filter 965. A third output of the entropy decoder 945 is connected in signal communication with an input of a decoder controller 905. A first output of the decoder controller 905 is connected in signal communication with a second input of the entropy decoder 945. A second output of the decoder controller 905 is connected in signal communication with a second input of the inverse transformer and inverse quantizer 950. A third output of the decoder controller 905 is connected in signal communication with a third input of the deblocking filter 965. A fourth output of the decoder controller 905 is connected in signal communication with a second input of the intra prediction module 960, with a first input of the motion compensator 970, with a second input of the reference picture buffer 989, with a second input of the virtual reference picture buffer 980, and with a second input of the flip picture region module 978.
An output of the motion compensator 970 is connected in signal communication with a first input of a switch 997. An output of the intra prediction module 960 is connected in signal communication with a second input of the switch 997. An output of the switch 997 is connected in signal communication with a first non-inverting input of the combiner 925.
An input of the input buffer 910 is available as an input of the decoder 900, for receiving an input bitstream. A second output of the flip picture region module 978 is available as an output of the decoder 900, for outputting an output picture.
Turning to FIG. 17, an exemplary method for encoding video data using picture region flipping is indicated generally by the reference numeral 1700.
The method 1700 includes a start block 1705 that passes control to a function block 1710. The function block 1710 searches for the best image partition with the least distortion measure, records the flipping direction for each image region, and passes control to a function block 1715. The function block 1715 uses one or more slice groups to describe each image region in the picture parameter set, sets flip_direction in the slice header, and passes control to a loop limit block 1720. The loop limit block 1720 begins a loop over each slice group, and passes control to a function block 1725. The function block 1725 encodes each slice group i with its flip direction, and passes control to a loop limit block 1730. The loop limit block 1730 ends the loop over each slice group, and passes control to an end block 1735.
Turning to FIG. 18, an exemplary method for decoding video data using picture region flipping is indicated generally by the reference numeral 1800.
The method 1800 includes a start block 1805 that passes control to a function block 1810. The function block 1810 parses the bitstream, and passes control to a function block 1815. The function block 1815 extracts slice group syntax to describe each image region, gets flip_direction for each slice group, and passes control to a loop limit block 1820. The loop limit block 1820 begins a loop over each slice group, and passes control to a function block 1825. The function block 1825 decodes each slice group i with its flip direction, and passes control to a loop limit block 1830. The loop limit block 1830 ends the loop over each slice group, and passes control to an end block 1835.
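As a rough sketch of the decode loop of method 1800, under the assumption of hypothetical parse_headers and decode_slice_group helpers (and reusing flip_region from the first sketch):

    def decode_picture(bitstream, parse_headers, decode_slice_group):
        """Sketch of method 1800: decode each slice group, then un-flip it."""
        slice_groups = parse_headers(bitstream)   # image regions + flip_direction each
        picture = {}
        for group in slice_groups:
            region = decode_slice_group(bitstream, group)
            # Flipping is its own inverse, so applying the signalled direction
            # again places the region back in its original orientation.
            picture[group.group_id] = flip_region(region, group.flip_direction)
        return picture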
Memory management for virtual reference pictures
Since the virtual reference pictures (VRPs) need to be generated and stored at both the encoder and decoder, the associated storage memory should be considered. There are several approaches to providing a memory management model for virtual reference pictures: (1) in a first approach, store generated virtual reference pictures in the decoded picture buffer; and (2) in a second approach, store virtually generated frames in a temporary generated picture buffer which is only valid during the encoding/decoding of the current frame. With respect to providing a memory management model in accordance with the first approach mentioned above, since virtual reference pictures are only needed for the encoding/decoding of the current picture, decoded picture buffer insertion and deletion processes should be properly defined. In one possible implementation, generated reference pictures will be inserted in the decoded picture buffer before reference lists are constructed, and will be removed right after the encoding/decoding of the current frame is finished.
When virtual reference pictures are stored in the decoded picture buffer, they will need to be differentiated from non-virtual decoded pictures. There are several options for how this can be done in an MPEG-4 AVC-based implementation. Some exemplary options for differentiating virtual reference pictures stored in the decoded picture buffer from non-virtual reference pictures include, for example: (1) store virtual reference pictures as short-term reference pictures and use unused frame_num/picture_order_count values; (2) store virtual reference pictures as long-term reference pictures and use unused long_term_id values in the long term memory; and (3) since a virtual reference picture is different from previously decoded pictures in nature, dedicated memory slots can be allocated in the decoded picture buffer for the storage of virtual reference pictures. In that VRP memory, virtual reference pictures will be identified by their vrp_id, which is unique for each virtual reference picture. With respect to providing a memory management model in accordance with the second approach mentioned above, storing virtually generated frames in a temporary generated picture buffer that is only valid during the encoding/decoding of the current frame allows this buffer to hold all virtually generated pictures. Virtual reference pictures will again be identified by their vrp_id, which is unique for each virtual reference picture.
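A minimal sketch of the first approach, assuming a deliberately simplified decoded picture buffer, is given below: VRPs are inserted before the reference lists are built, tagged so they cannot be mistaken for real decoded pictures, and removed once the current frame is finished. The Picture and DecodedPictureBuffer types are assumptions made for this example.

```cpp
// Simplified DPB with dedicated tagging of virtual reference pictures.
#include <algorithm>
#include <vector>

struct Picture {
    bool isVirtual = false;
    int vrpId = -1;   // unique per virtual reference picture
    int frameNum = 0; // ordinary decoded pictures only
};

struct DecodedPictureBuffer {
    std::vector<Picture> pics;

    void insertVirtual(int vrpId) {          // before reference lists are built
        Picture p; p.isVirtual = true; p.vrpId = vrpId;
        pics.push_back(p);                   // conceptually, a dedicated VRP slot
    }
    void removeVirtuals() {                  // right after the current frame finishes
        pics.erase(std::remove_if(pics.begin(), pics.end(),
                                  [](const Picture& p) { return p.isVirtual; }),
                   pics.end());
    }
};
```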
Turning to FIG. 10, an exemplary method for encoding video content using Virtual Reference Picture (VRP) management in a Decoded Picture Buffer (DPB) is indicated generally by the reference numeral 1000. The method 1000 includes a start block 1005 that passes control to a function block 1010. The function block 1010 sets vrp_present_flag equal to zero, and passes control to a decision block 1015. The decision block 1015 determines whether or not VRP is enabled. If so, then control is passed to a function block 1020. Otherwise, control is passed to a function block 1070.
The function block 1020 sets vrp_present_flag equal to one, and passes control to a function block 1025. The function block 1025 sets num_vrps and VRP parameter syntaxes, and passes control to a function block 1030. The function block 1030 performs VRP generation to generate one or more VRPs (hereinafter "VRP"), and passes control to a function block 1035. The function block 1035 inserts the VRP in the decoded picture buffer (DPB), sets frame_num/Picture Order Count or long_term_frame_idx or vrp_id, and passes control to a function block 1040. The function block 1040 includes the VRP in reference list construction, and passes control to a function block 1045. The function block 1045 includes the VRP in reference list reordering, and passes control to a function block 1050. The function block 1050 writes high-level syntaxes into the bitstream, and passes control to a function block 1055. The function block 1055 encodes the current picture, refers to the VRP by ref_idx if the VRP is present, and passes control to a function block 1060. The function block 1060 removes the VRP from the DPB, and passes control to a function block 1065. The function block 1065 writes the low-level syntaxes into the bitstream, and passes control to an end block 1099.
The function block 1070 performs reference list construction without the VRP, and passes control to the function block 1050.
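The flow of method 1000 can be condensed into the following C++ sketch, under the same simplified-DPB assumption; every helper below is a stub named after the corresponding function block and is an assumption made for this example.

```cpp
// Condensed trace of the FIG. 10 flow with stubbed helpers.
struct Dpb { void insertVirtual(int) {} void removeVirtuals() {} };

void buildReferenceLists(Dpb&, bool /*includeVrps*/) {} // blocks 1040-1045 / 1070
void writeHighLevelSyntax(int /*vrp_present_flag*/) {}  // block 1050
void encodeCurrentPicture(const Dpb&) {}                // block 1055 (VRP via ref_idx)
void writeLowLevelSyntax() {}                           // block 1065

void encodeWithVrp(bool vrpEnabled, Dpb& dpb) {
    int vrp_present_flag = 0;
    if (vrpEnabled) {
        vrp_present_flag = 1;          // block 1020
        const int num_vrps = 1;        // block 1025 (plus VRP parameter syntaxes)
        for (int id = 0; id < num_vrps; ++id)
            dpb.insertVirtual(id);     // blocks 1030-1035
    }
    buildReferenceLists(dpb, vrp_present_flag == 1);
    writeHighLevelSyntax(vrp_present_flag);
    encodeCurrentPicture(dpb);
    dpb.removeVirtuals();              // block 1060
    writeLowLevelSyntax();
}
```

The decoder-side method 1100 mirrors this flow, with the high-level syntaxes read rather than written.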
Turning to FIG. 11 , an exemplary method for decoding video content using Virtual Reference Picture (VPR) management in a Decoded Picture Buffer (DPB) is indicated generally by the reference numeral 1100. The method 1100 includes a start block 1105 that passes control to a function block 1110. The function block 1110 reads high-level syntaxes from the bitstream including, e.g., vrp_present_flag, num_vrps, and other VRP parameter syntaxes, and passes control to a decision block 1115. The decision block 1115 determines whether or not vrp_present_flag is equal to one. If so, then control is passed to a function block 1120. Otherwise, control is passed to a function block 1160. The function block 1120 decodes the VRP parameters, and passes control to a function block 1125. The function block 1125 performs VRP generation to generate one or more VRPs (hereinafter "VRP), and passes control to a function block 1130. The function block 1130 inserts the VRP into the DPB, sets frame_num/Picture Order Count or long_term_frame_idx or vrp_id, and passes control to a function block 1135. The function block 1135 includes the VRP in the default reference list construction, and passes control to a function block 1140. The function block 1140 includes the VRP in reference list reordering, and passes control to a function block 1145. The function block 1145 reads the low-level syntaxes from the bitstream, and passes control to a function block 1150. The function block 1150 decodes the current picture, refers to the VRP by refjdx if the VRP is present, and passes control to a function block 1155. The function block 1155 removes the VRP from the DPB, and passes control to an end block 1199.
The function block 1160 performs reference list construction without the VRP, and passes control to the function block 1145.
Turning to FIG. 12, an exemplary method for encoding video content using Virtual Reference Picture (VRP) management in local memory is indicated generally by the reference numeral 1200. The method 1200 includes a start block 1205 that passes control to a function block 1210. The function block 1210 sets vrp_present_flag equal to zero, and passes control to a decision block 1215. The decision block 1215 determines whether or not VRP is enabled. If so, then control is passed to a function block 1220. Otherwise, control is passed to a function block 1240.
The function block 1220 sets vrp_present_flag equal to one, and passes control to a function block 1225. The function block 1225 sets num_vrps and VRP parameter syntaxes, and passes control to a function block 1230. The function block 1230 performs VRP generation to generate one or more VRPs (hereinafter "VRP"), and passes control to a function block 1235. The function block 1235 stores the VRP in local memory, sets vrp_id, and passes control to a function block 1240. The function block 1240 performs reference list construction without the VRP, and passes control to a function block 1245. The function block 1245 writes high-level syntaxes into the bitstream, and passes control to a function block 1250. The function block 1250 encodes the current picture, refers to the VRP by vrp_id if the VRP is present, and passes control to a function block 1255. The function block 1255 releases the memory allocated for the VRP, and passes control to a function block 1260. The function block 1260 writes low-level syntaxes into the bitstream, and passes control to an end block 1299.
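The second approach can be sketched as a small buffer outside the DPB, addressed by vrp_id and valid only while the current frame is coded. The VrpLocalMemory type and the Frame alias below are assumptions introduced for this illustration.

```cpp
// Temporary local-memory store for virtually generated pictures.
#include <map>
#include <utility>
#include <vector>

using Frame = std::vector<unsigned char>; // assumed pixel-storage type

struct VrpLocalMemory {
    std::map<int, Frame> byId; // vrp_id -> generated picture

    void store(int vrpId, Frame f) { byId[vrpId] = std::move(f); } // block 1235
    const Frame* lookup(int vrpId) const {    // used while coding, via vrp_id
        auto it = byId.find(vrpId);
        return it == byId.end() ? nullptr : &it->second;
    }
    void releaseAll() { byId.clear(); }       // block 1255 / block 1365
};
```

Note that under this model the reference list construction of blocks 1240/1345 proceeds without the VRP, since the VRP is addressed by vrp_id rather than by a reference list entry.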
Turning to FIG. 13, an exemplary method for decoding video content using Virtual Reference Picture (VRP) management in local memory is indicated generally by the reference numeral 1300. The method 1300 includes a start block 1305 that passes control to a function block 1310. The function block 1310 reads high-level syntaxes from the bitstream including, e.g., vrp_present_flag, num_vrps, and other VRP parameter syntaxes, and passes control to a decision block 1320. The decision block 1320 determines whether or not vrp_present_flag is equal to one. If so, then control is passed to a function block 1325. Otherwise, control is passed to a function block 1345.
The function block 1325 decodes the VRP parameters, and passes control to a function block 1330. The function block 1330 performs VRP generation to generate one or more VRPs (hereinafter "VRP"), and passes control to a function block 1340. The function block 1340 stores the VRP in the local memory, sets vrp_id, and passes control to a function block 1345. The function block 1345 performs reference list construction without the VRP, and passes control to a function block 1350. The function block 1350 reads the low-level syntaxes from the bitstream, and passes control to a function block 1360. The function block 1360 decodes the current picture, refers to the VRP by vrp_id if the VRP is present, and passes control to a function block 1365. The function block 1365 releases the memory allocated for the VRP, and passes control to a function block 1370. The function block 1370 removes the VRP from the DPB, and passes control to an end block 1399.
Turning to FIG. 15, a method for managing virtual reference pictures for image region flipping is indicated generally by the reference numeral 1500. The method 1500 includes a start block 1505 that passes control to a function block 1510. The function block 1510 stores the flipped picture in the VRP buffer, associates ref_index with flip direction, and passes control to a loop limit block 1515. The loop limit block 1515 begins a loop over each macroblock, and passes control to a function block 1520. The function block 1520 searches for the best ref_index for the ith macroblock, and passes control to a function block 1525. The function block 1525 encodes the ith macroblock with the best ref_index, and passes control to a loop limit block 1530. The loop limit block 1530 ends the loop over each macroblock, and passes control to an end block 1535. Turning to FIG. 16, a method for managing virtual reference pictures for image region flipping is indicated generally by the reference numeral 1600.
The method 1600 includes a start block 1605 that passes control to a function block 1610. The function block 1610 parses the bitstream, and passes control to a function block 1615. The function block 1615 stores the flipped picture in the VRP buffer, associates ref_index with flip direction, and passes control to a loop limit block 1620. The loop limit block 1620 begins a loop over each macroblock, and passes control to a function block 1625. The function block 1625 decodes the ith macroblock with the decoded ref_index, and passes control to a function block 1630. The function block 1630 flips the reconstructed macroblock back to its original direction, and passes control to a loop limit block 1635. The loop limit block 1635 ends the loop over each macroblock, and passes control to an end block 1640.
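The central idea of methods 1500 and 1600, that the reference index implicitly carries the flip direction, can be shown in a short sketch. The ref_index-to-flip mapping below is an assumed convention for this example only.

```cpp
// Sketch: each flipped copy of the reference picture is a VRP, and the
// reference index implies the flip direction; no separate flip syntax is
// coded per macroblock.
#include <array>
#include <cstddef>

enum class Flip { None, Horizontal, Vertical, Both };

// Assumed mapping: ref_index 0 = unflipped reference, 1..3 = flipped VRPs.
constexpr std::array<Flip, 4> kFlipForRefIndex = {
    Flip::None, Flip::Horizontal, Flip::Vertical, Flip::Both};

// Stub for the block-1520 search: a real encoder would try each ref_index
// and keep the one with the lowest coding cost.
int bestRefIndex(std::size_t mb) { return static_cast<int>(mb % 4); }

void encodeMacroblocks(std::size_t numMbs) {
    for (std::size_t i = 0; i < numMbs; ++i) {
        int refIdx = bestRefIndex(i);              // block 1520
        Flip implied = kFlipForRefIndex[refIdx];   // flip recovered from refIdx alone
        (void)implied;
        // ... encode macroblock i predicting from the picture that refIdx
        // selects (block 1525); the decoder applies the same mapping.
    }
}
```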
A description will now be given of some of the many attendant advantages/features of the present invention, some of which have been mentioned above. For example, one advantage/feature is an apparatus that includes a video encoder for encoding image regions of a picture into a resultant bitstream, wherein at least one of the image regions is adaptively flipped before encoding.
Another advantage/feature is the apparatus having the video encoder as described above, wherein the at least one of the image regions is adaptively flipped for at least one of intra and inter coding. Yet another advantage/feature is the apparatus having the video encoder as described above, wherein the encoder codes a flipping direction in a high level syntax of the resultant bitstream. Still another advantage/feature is the apparatus having the video encoder that codes the flipping direction as described above, wherein the flipping direction is coded in at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
Moreover, another advantage/feature is the apparatus having the video encoder that codes the flipping direction as described above, wherein the flipping direction is absolutely coded, or is differentially coded using spatially or temporally available information, or is implicitly coded from coded data. Further, another advantage/feature is the apparatus having the video encoder as described above, wherein the encoder signals a partition of the picture from which the image regions are determined in a high level syntax of the resultant bitstream.
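One differential-coding option can be sketched as follows. The predictor choice and the bit accounting are assumptions made purely for illustration, not a normative scheme: the current region's flip direction is predicted from an already-coded neighbor (spatially available information), and extra bits are spent only on a mismatch.

```cpp
// Illustrative differential coding of flip_direction against a predictor.
#include <optional>

enum class Flip { None, Horizontal, Vertical, Both };

struct FlipDirectionCoder {
    std::optional<Flip> leftNeighbor; // already-coded region to the left

    // Returns the number of bits conceptually spent: one "same as
    // predictor" flag, plus two bits naming the flip on a mismatch.
    int bitsToEncode(Flip current) const {
        Flip predicted = leftNeighbor.value_or(Flip::None);
        return current == predicted ? 1 : 1 + 2;
    }
};
```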
Also, another advantage/feature is the apparatus having the video encoder that signals the partition as described above, wherein the partition is signaled in at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
Additionally, another advantage/feature is the apparatus having the video encoder that signals the partition as described above, wherein the partition is absolutely coded, or is differentially coded using spatially or temporally available information, or is implicitly coded from coded data.
Moreover, another advantage/feature is the apparatus having the video encoder as described above, wherein a flipping direction of a reference image region used for a prediction of a current one of the image regions is restricted to be the same as that of the current one of the image regions.
Further, another advantage/feature is the apparatus having the video encoder as described above, wherein the encoder is configured to perform the adaptive flipping as an extension of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 recommendation. Also, another advantage/feature is the apparatus having the video encoder that is configured to perform the adaptive flipping as an extension of the MPEG-4 AVC Standard as described above, wherein the encoder uses slice groups to code the image regions. Additionally, another advantage/feature is the apparatus having the video encoder that is configured to perform the adaptive flipping as an extension of the MPEG-4 AVC Standard as described above, wherein the encoder codes a flipping direction in at least one of the slice groups.
Moreover, another advantage/feature is the apparatus having the video encoder that is configured to perform the adaptive flipping as an extension of the MPEG-4 AVC Standard as described above, wherein the encoder selectively allows or disallows a prediction to cross a slice boundary.
Further, another advantage/feature is the apparatus having the video encoder that is configured to perform the adaptive flipping as an extension of the MPEG-4 AVC Standard as described above, wherein the encoder associates a flipping direction with a reference index.
Also, another advantage/feature is the apparatus having the video encoder as described above, wherein the encoder uses a tree-based partitioning to split the image regions. Additionally, another advantage/feature is the apparatus having the video encoder that uses the tree-based partitioning as described above, wherein at least one of a binary-tree and a quad-tree method are used to split the image regions.
Moreover, another advantage/feature is the apparatus having the video encoder that uses a tree-based partitioning as described above, wherein at least one of a bottom-up and a top-down method are used to split the image regions.
Further, another advantage/feature is the apparatus having the video encoder as described above, wherein the encoder uses at least one of a distortion measure and a coding cost measure to decide how to split the image regions.
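A top-down split decision of the kind just described can be sketched as a quad-tree recursion: a region is split only when its four children together cost less than the region coded whole. The cost functional is supplied by the caller and stands in, as an assumption, for the encoder's distortion or rate-distortion measure.

```cpp
// Top-down quad-tree partitioning driven by a caller-supplied cost measure.
#include <array>
#include <functional>

struct Region { int x, y, w, h; };

using CostFn = std::function<double(const Region&)>;

double bestPartitionCost(const Region& r, const CostFn& cost, int minSize) {
    double whole = cost(r);
    if (r.w <= minSize || r.h <= minSize) return whole; // leaf: cannot split further
    const int hw = r.w / 2, hh = r.h / 2;
    const std::array<Region, 4> kids = {{
        {r.x,      r.y,      hw,       hh},
        {r.x + hw, r.y,      r.w - hw, hh},
        {r.x,      r.y + hh, hw,       r.h - hh},
        {r.x + hw, r.y + hh, r.w - hw, r.h - hh}}};
    double split = 0.0;
    for (const auto& k : kids) split += bestPartitionCost(k, cost, minSize);
    return split < whole ? split : whole; // keep whichever alternative is cheaper
}
```

A binary-tree or bottom-up merging variant follows the same pattern, differing only in how candidate regions are generated.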
Also, another advantage/feature is the apparatus having the video encoder as described above, wherein the encoder performs a flipping operation as a change of data scanning order.
Additionally, another advantage/feature is the apparatus having the video encoder that performs a flipping operation as described above, wherein the encoder changes an encoding order of at least one of the image regions according to the data scanning order.
Moreover, another advantage/feature is the apparatus having the video encoder that performs a flipping operation as described above, wherein the encoder changes a prediction direction of at least one of the image regions according to the data scanning order.
Further, another advantage/feature is the apparatus having the video encoder that changes a prediction direction as described above, wherein the prediction direction related to at least one of flipping and change of scanning order affects at least one of an inter motion prediction for the picture when the picture is an inter predicted picture and an intra coding direction.
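A flip realized purely as a change of data scanning order can be made concrete with a small sketch: a horizontal flip of a region is simply a raster scan that walks each row right-to-left, so no pixel has to be physically relocated before coding. The function below is an illustrative assumption, not encoder code.

```cpp
// A horizontal flip expressed as a reversed raster scan of the region.
#include <vector>

std::vector<int> scanHorizontallyFlipped(const std::vector<int>& pixels,
                                         int width, int height) {
    std::vector<int> out;
    out.reserve(pixels.size());
    for (int y = 0; y < height; ++y)
        for (int x = width - 1; x >= 0; --x)  // reversed column order
            out.push_back(pixels[y * width + x]);
    return out;
}
```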
These and other features and advantages of the present principles may be readily ascertained by one of ordinary skill in the pertinent art based on the teachings herein. It is to be understood that the teachings of the present principles may be implemented in various forms of hardware, software, firmware, special purpose processors, or combinations thereof.
Most preferably, the teachings of the present principles are implemented as a combination of hardware and software. Moreover, the software may be implemented as an application program tangibly embodied on a program storage unit. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units ("CPU"), a random access memory ("RAM"), and input/output ("I/O") interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit. It is to be further understood that, because some of the constituent system components and methods depicted in the accompanying drawings are preferably implemented in software, the actual connections between the system components or the process function blocks may differ depending upon the manner in which the present principles are programmed. Given the teachings herein, one of ordinary skill in the pertinent art will be able to contemplate these and similar implementations or configurations of the present principles.
Although the illustrative embodiments have been described herein with reference to the accompanying drawings, it is to be understood that the present principles are not limited to those precise embodiments, and that various changes and modifications may be effected therein by one of ordinary skill in the pertinent art without departing from the scope or spirit of the present principles. All such changes and modifications are intended to be included within the scope of the present principles as set forth in the appended claims.

Claims

CLAIMS:
1. An apparatus, comprising: a video encoder (400) for encoding image regions of a picture into a resultant bitstream, wherein at least one of the image regions is flipped before encoding.
2. The apparatus of claim 1, wherein the at least one of the image regions is flipped for at least one of intra and inter coding.
3. The apparatus of claim 1, wherein said encoder (400) codes a flipping direction in a high level syntax of the resultant bitstream.
4. The apparatus of claim 3, wherein the flipping direction is coded in at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
5. The apparatus of claim 3, wherein the flipping direction is absolutely coded, or is differentially coded using spatially or temporally available information, or is implicitly coded from coded data.
6. The apparatus of claim 1, wherein said encoder (400) signals a partition of the picture from which the image regions are determined in a high level syntax of the resultant bitstream.
7. The apparatus of claim 6, wherein the partition is signaled in at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
8. The apparatus of claim 6, wherein the partition is absolutely coded, or is differentially coded using spatially or temporally available information, or is implicitly coded from coded data.
9. The apparatus of claim 1, wherein a flipping direction of a reference image region used for a prediction of a current one of the image regions is restricted to be the same as that of the current one of the image regions.
10. The apparatus of claim 1, wherein said encoder (400) is configured to perform the flipping as an extension of the International Organization for
Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 recommendation.
11. The apparatus of claim 10, wherein said encoder (400) uses slice groups to code the image regions.
12. The apparatus of claim 10, wherein said encoder (400) codes a flipping direction in at least one of the slice groups.
13. The apparatus of claim 10, wherein said encoder (400) selectively allows or disallows a prediction to cross a slice boundary.
14. The apparatus of claim 10, wherein said encoder (400) associates a flipping direction with a reference index.
15. The apparatus of claim 1, wherein said encoder (400) uses a tree-based partitioning to split the image regions.
16. The apparatus of claim 15, wherein at least one of a binary-tree and a quad-tree method are used to split the image regions.
17. The apparatus of claim 15, wherein at least one of a bottom-up and a top-down method are used to split the image regions.
18. The apparatus of claim 1, wherein said encoder (400) uses at least one of a distortion measure and a coding cost measure to decide how to split the image regions.
19. The apparatus of claim 1, wherein said encoder (400) performs a flipping operation as a change of data scanning order.
20. The apparatus of claim 19, wherein said encoder (400) changes an encoding order of at least two of the image regions according to the data scanning order.
21. The apparatus of claim 19, wherein said encoder (400) changes a prediction direction of at least one of the image regions according to the data scanning order.
22. The apparatus of claim 21, wherein the prediction direction related to at least one of flipping and change of scanning order affects at least one of an inter motion prediction for the picture when the picture is an inter predicted picture and an intra coding direction.
23. A method, comprising: encoding image regions of a picture into a resultant bitstream, wherein at least one of the image regions is flipped before encoding (1700).
24. The method of claim 23, wherein the at least one of the image regions is flipped for at least one of intra and inter coding (1725).
25. The method of claim 23, wherein said encoding step codes a flipping direction in a high level syntax of the resultant bitstream (1715).
26. The method of claim 25, wherein the flipping direction is coded in at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
27. The method of claim 25, wherein the flipping direction is absolutely coded, or is differentially coded using spatially or temporally available information, or is implicitly coded from coded data.
28. The method of claim 23, wherein said encoding step signals a partition of the picture from which the image regions are determined in a high level syntax of the resultant bitstream (1715).
29. The method of claim 28, wherein the partition is signaled in at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
30. The method of claim 28, wherein the partition is absolutely coded, or is differentially coded using spatially or temporally available information, or is implicitly coded from coded data.
31. The method of claim 23, wherein a flipping direction of a reference image region used for a prediction of a current one of the image regions is restricted to be the same as that of the current one of the image regions.
32. The method of claim 23, wherein said encoding step is configured to perform the flipping as an extension of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 recommendation.
33. The method of claim 32, wherein said encoding step uses slice groups to code the image regions (1725).
34. The method of claim 32, wherein said encoding step codes a flipping direction in at least one of the slice groups.
35. The method of claim 32, wherein said encoding step selectively allows or disallows a prediction to cross a slice boundary.
36. The method of claim 32, wherein said encoding step associates a flipping direction with a reference index (1510).
37. The method of claim 23, wherein said encoding step uses a tree-based partitioning to split the image regions (1710).
38. The method of claim 37, wherein at least one of a binary-tree and a quad-tree method are used to split the image regions.
39. The method of claim 37, wherein at least one of a bottom-up and a top-down method are used to split the image regions.
40. The method of claim 23, wherein said encoding step uses at least one of a distortion measure and a coding cost measure to decide how to split the image regions.
41. The method of claim 23, wherein said encoding step performs a flipping operation as a change of data scanning order (712).
42. The method of claim 41, wherein said encoding step changes an encoding order of at least one of the image regions according to the data scanning order.
43. The method of claim 41, wherein said encoding step changes a prediction direction of at least one of the image regions according to the data scanning order (750).
44. The method of claim 43, wherein the prediction direction related to at least one of flipping and change of scanning order affects at least one of an inter motion prediction for the picture when the picture is an inter predicted picture and an intra coding direction.
45. An apparatus, comprising: a video decoder (500) for decoding image regions of a picture from a bitstream, wherein at least one of the image regions is flipped after decoding.
46. The apparatus of claim 45, wherein the at least one of the image regions is flipped for at least one of intra and inter coding.
47. The apparatus of claim 45, wherein said decoder (500) decodes a flipping direction from a high level syntax of the bitstream.
48. The apparatus of claim 47, wherein the flipping direction is decoded from at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
49. The apparatus of claim 47, wherein the flipping direction is absolutely decoded, or is differentially decoded using spatially or temporally available information, or is implicitly decoded from coded data.
50. The apparatus of claim 45, wherein said decoder (500) determines a partition of the picture, from which the image regions are determined, from a high level syntax of the bitstream.
51. The apparatus of claim 50, wherein the partition is decoded from at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
52. The apparatus of claim 50, wherein the partition is absolutely decoded, or is differentially decoded using spatially or temporally available information, or is implicitly decoded from coded data.
53. The apparatus of claim 45, wherein a flipping direction of a reference image region used for a prediction of a current one of the image regions is restricted to be the same as that of the current one of the image regions.
54. The apparatus of claim 45, wherein said decoder (500) is configured to perform the flipping as an extension of the International Organization for
Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 recommendation.
55. The apparatus of claim 54, wherein said decoder (500) uses slice groups to decode the image regions.
56. The apparatus of claim 54, wherein said decoder (500) decodes a flipping direction in at least one of the slice groups.
57. The apparatus of claim 54, wherein said decoder (500) selectively allows or disallows a prediction to cross a slice boundary.
58. The apparatus of claim 54, wherein said decoder (500) associates a flipping direction with a reference index.
59. The apparatus of claim 45, wherein said decoder (500) performs a flipping operation as a change of data scanning order.
60. The apparatus of claim 59, wherein said decoder (500) changes a decoding order of at least two of the image regions according to the data scanning order.
61. The apparatus of claim 59, wherein said decoder (500) changes a prediction direction of at least one of the image regions according to the data scanning order.
62. The apparatus of claim 61, wherein the prediction direction related to at least one of flipping and change of scanning order affects at least one of an inter motion prediction for the picture when the picture is an inter predicted picture and an intra coding direction.
63. A method, comprising: decoding image regions of a picture from a bitstream, wherein at least one of the image regions is flipped after decoding (1800).
64. The method of claim 63, wherein the at least one of the image regions is flipped for at least one of intra and inter coding (1825).
65. The method of claim 63, wherein said decoding step decodes a flipping direction from a high level syntax of the bitstream (1815).
66. The method of claim 65, wherein the flipping direction is decoded from at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
67. The method of claim 65, wherein the flipping direction is absolutely decoded, or is differentially decoded using spatially or temporally available information, or is implicitly decoded from coded data.
68. The method of claim 63, wherein said decoding step decodes a partition of the picture, from which the image regions are determined, from a high level syntax of the bitstream.
69. The method of claim 68, wherein the partition is decoded from at least one of a slice header level, a Supplemental Enhancement Information (SEI) level, a picture parameter set level, a sequence parameter set level and a network abstraction layer unit header level.
70. The method of claim 68, wherein the partition is absolutely decoded, or is differentially decoded using spatially or temporally available information, or is implicitly decoded from coded data.
71. The method of claim 63, wherein a flipping direction of a reference image region used for a prediction of a current one of the image regions is restricted to be the same as that of the current one of the image regions.
72. The method of claim 63, wherein said decoding step is configured to perform the flipping as an extension of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 recommendation.
73. The method of claim 72, wherein said decoding step uses slice groups to decode the image regions (1825).
74. The method of claim 72, wherein said decoding step decodes a flipping direction in at least one of the slice groups (1815).
75. The method of claim 72, wherein said decoding step selectively allows or disallows a prediction to cross a slice boundary.
76. The method of claim 72, wherein said decoding step associates a flipping direction with a reference index (1615).
77. The method of claim 63, wherein said decoding step performs a flipping operation as a change of data scanning order (700).
78. The method of claim 77, wherein said decoding step changes a decoding order of at least two of the image regions according to the data scanning order.
79. The method of claim 77, wherein said decoding step changes a prediction direction of at least one of the image regions according to the data scanning order (750).
80. The method of claim 79, wherein the prediction direction related to at least one of flipping and change of scanning order affects at least one of an inter motion prediction for the picture when the picture is an inter predicted picture and an intra coding direction.
81. A video signal structure for video encoding, comprising: image regions of a picture encoded into a resultant bitstream, wherein at least one of the image regions is flipped before encoding.
82. A storage medium having video signal data encoded thereupon, comprising: image regions of a picture encoded into a resultant bitstream, wherein at least one of the image regions is flipped before encoding.
PCT/US2007/018483 2006-08-24 2007-08-21 Adaptive region-based flipping video coding WO2008024345A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US82340006P 2006-08-24 2006-08-24
US60/823,400 2006-08-24

Publications (1)

Publication Number Publication Date
WO2008024345A1 (en) 2008-02-28

Family

ID=38754539

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/018483 WO2008024345A1 (en) 2006-08-24 2007-08-21 Adaptive region-based flipping video coding

Country Status (1)

Country Link
WO (1) WO2008024345A1 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2211552A1 (en) * 2009-01-22 2010-07-28 Thomson Licensing Method and device for video coding using macroblock groups
US9036714B2 (en) 2009-01-26 2015-05-19 Thomson Licensing Frame packing for video coding
US9185384B2 (en) 2007-04-12 2015-11-10 Thomson Licensing Tiling in video encoding and decoding
US9215445B2 (en) 2010-01-29 2015-12-15 Thomson Licensing Block-based interleaving
CN106416253A (en) * 2014-05-22 2017-02-15 联发科技股份有限公司 Method of intra block copy with flipping for image and video coding
EP3185553A1 (en) * 2015-12-21 2017-06-28 Thomson Licensing Apparatus, system and method of video compression using smart coding tree unit scanning and corresponding computer program and medium
WO2018171890A1 (en) * 2017-03-23 2018-09-27 Huawei Technologies Co., Ltd. Devices and methods for video coding
CN108989825A (en) * 2018-07-18 2018-12-11 北京奇艺世纪科技有限公司 A kind of arithmetic coding method, device and electronic equipment
US10547838B2 (en) 2014-09-30 2020-01-28 Telefonaktiebolaget Lm Ericsson (Publ) Encoding and decoding a video frame in separate processing units
EP4254947A1 (en) 2022-03-28 2023-10-04 Beijing Xiaomi Mobile Software Co., Ltd. Encoding/decoding video picture data

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
MURAKAMI, T ET AL.: "Adaptive Picture Flipping Coding for enhancing H.264/AVC", PICTURE CODING SYMPOSIUM PCS 2007, 7 November 2007 (2007-11-07) - 9 November 2007 (2007-11-09), LISBOA, PT, XP002461400 *
MURAKAMI, T ET AL.: "Adaptive Picture Flipping Coding on KTA Software", ITU-T Q.6/SG16, VCEG-AE17, 15 January 2007 (2007-01-15) - 16 January 2007 (2007-01-16), Marrakech, MA, pages 1 - 4, XP002461402 *
MURAKAMI, T: "Adaptive Picture Flipping Coding", ITU-T DRAFT STUDY PERIOD 2005-2008, INTERNATIONAL TELECOMMUNICATION UNION, GENEVA, CH, 26 July 2005 (2005-07-26) - 5 August 2005 (2005-08-05), pages 1 - 6, XP017407884 *
MURAKAMI, T: "Adaptive Picture Flipping Coding", ITU-T Q.6/SG16 VCEG, 26 July 2005 (2005-07-26) - 5 August 2005 (2005-08-05), Geneva, CH, pages 1 - 5, XP002461398 *
PALFNER, T ET AL.: "Advanced Scanning for intra frame coding", IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING ICIP 2006, 8 October 2006 (2006-10-08) - 11 October 2006 (2006-10-11), Atlanta, US, pages 3121 - 3124, XP002461401 *
SHIODERA, T ET AL.: "Bidirectional Intra Prediction", ITU-T Q.6/SG16 VCEG-AE14, 15 January 2007 (2007-01-15) - 16 January 2007 (2007-01-16), Marrakech, MA, pages 1 - 6, XP002461403 *
SULLIVAN, G: "Some Potential Enhancements of H.264/AVC", ITU-T Q.6/SG16 VCEG-X03, 18 October 2003 (2003-10-18) - 22 October 2003 (2003-10-22), Palma de Mallorca, ES, pages 1 - 10, XP030003420 *
XIONG, L ET AL.: "Adaptive INTRA slice coding", JVT AND ITU-T JVT-Q052, 14 October 2005 (2005-10-14) - 21 October 2005 (2005-10-21), Nice, FR, pages 1 - 7, XP030006213 *

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10298948B2 (en) 2007-04-12 2019-05-21 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
US10432958B2 (en) 2007-04-12 2019-10-01 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
US9838705B2 (en) 2007-04-12 2017-12-05 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
US9185384B2 (en) 2007-04-12 2015-11-10 Thomson Licensing Tiling in video encoding and decoding
US9973771B2 (en) 2007-04-12 2018-05-15 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
US9219923B2 (en) 2007-04-12 2015-12-22 Thomson Licensing Tiling in video encoding and decoding
US9232235B2 (en) 2007-04-12 2016-01-05 Thomson Licensing Tiling in video encoding and decoding
US10764596B2 (en) 2007-04-12 2020-09-01 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
US9445116B2 (en) 2007-04-12 2016-09-13 Thomson Licensing Tiling in video encoding and decoding
US10129557B2 (en) 2007-04-12 2018-11-13 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
US9706217B2 (en) 2007-04-12 2017-07-11 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
US9986254B1 (en) 2007-04-12 2018-05-29 Dolby Laboratories Licensing Corporation Tiling in video encoding and decoding
EP2211552A1 (en) * 2009-01-22 2010-07-28 Thomson Licensing Method and device for video coding using macroblock groups
WO2010084037A1 (en) * 2009-01-22 2010-07-29 Thomson Licensing Method and device for video coding using macroblock groups
US9420310B2 (en) 2009-01-26 2016-08-16 Thomson Licensing Frame packing for video coding
US9036714B2 (en) 2009-01-26 2015-05-19 Thomson Licensing Frame packing for video coding
US9215445B2 (en) 2010-01-29 2015-12-15 Thomson Licensing Block-based interleaving
EP3140989A4 (en) * 2014-05-22 2017-07-05 MediaTek Inc. Method of intra block copy with flipping for image and video coding
CN106416253A (en) * 2014-05-22 2017-02-15 联发科技股份有限公司 Method of intra block copy with flipping for image and video coding
US10547838B2 (en) 2014-09-30 2020-01-28 Telefonaktiebolaget Lm Ericsson (Publ) Encoding and decoding a video frame in separate processing units
EP3185553A1 (en) * 2015-12-21 2017-06-28 Thomson Licensing Apparatus, system and method of video compression using smart coding tree unit scanning and corresponding computer program and medium
CN108464004A (en) * 2015-12-21 2018-08-28 汤姆逊许可公司 Use the devices, systems, and methods of the video compress of intelligently encoding tree unit scan and corresponding computer program and medium
US10659785B2 (en) 2015-12-21 2020-05-19 Interdigital Vc Holdings, Inc. Apparatus, system and method of video compression using smart coding tree unit scanning and corresponding computer program and medium
WO2017108638A1 (en) * 2015-12-21 2017-06-29 Thomson Licensing Apparatus, system and method of video compression using smart coding tree unit scanning and corresponding computer program and medium
CN108464004B (en) * 2015-12-21 2022-10-28 交互数字Vc控股公司 Apparatus, system and method for video compression using intelligent coding tree unit scanning and corresponding computer program and medium
WO2018171890A1 (en) * 2017-03-23 2018-09-27 Huawei Technologies Co., Ltd. Devices and methods for video coding
CN108989825A (en) * 2018-07-18 2018-12-11 北京奇艺世纪科技有限公司 A kind of arithmetic coding method, device and electronic equipment
EP4254946A1 (en) 2022-03-28 2023-10-04 Beijing Xiaomi Mobile Software Co., Ltd. Encoding/decoding video picture data
EP4254947A1 (en) 2022-03-28 2023-10-04 Beijing Xiaomi Mobile Software Co., Ltd. Encoding/decoding video picture data

Similar Documents

Publication Publication Date Title
US11025905B2 (en) Method and device for decoding with palette mode
KR101558627B1 (en) Methods and Apparatus for Incorporating Video Usability Information within a Multi-view Video Coding System
US9100659B2 (en) Multi-view video coding method and device using a base view
WO2008024345A1 (en) Adaptive region-based flipping video coding
US8363724B2 (en) Methods and apparatus using virtual reference pictures
US9288502B2 (en) Methods and apparatus for the use of slice groups in decoding multi-view video coding (MVC) information
US20090323824A1 (en) Methods and Apparatus for Use in Multi-View Video Coding
US20100026882A1 (en) Method and Apparatus for Decoding/Encoding a Video Signal
US20090147860A1 (en) Method and apparatus for signaling view scalability in multi-view video coding
KR20150024942A (en) Method and apparatus for video coding
CN115428449A (en) Cross-component adaptive loop filter
CN114846791A (en) Video coding supporting sub-pictures, slices, and blocks
US20240064329A1 (en) Encoder for Coded Pictures Having Regions With Common Motion Models
US20240048757A1 (en) Adaptive motion vector prediction candidate in frames with global motion
KR101366289B1 (en) A method and apparatus for decoding/encoding a video signal
WO2021203039A1 (en) Methods and devices for high-level syntax in video coding
WO2022051011A1 (en) Content adaptive segmented prediction
WO2021222813A1 (en) High-level syntax for video coding
RU2782435C1 (en) Method and apparatus for encoding or decoding video

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07837140

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

NENP Non-entry into the national phase

Ref country code: RU

122 Ep: pct application non-entry in european phase

Ref document number: 07837140

Country of ref document: EP

Kind code of ref document: A1