WO2006083107A1 - Method and apparatus for compressing multi-layered motion vector - Google Patents
Method and apparatus for compressing multi-layered motion vector
- Publication number
- WO2006083107A1 (PCT/KR2006/000352)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- macroblock
- lower layer
- set forth
- sub
- region
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/59—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial sub-sampling or interpolation, e.g. alteration of picture size or resolution
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/513—Processing of motion vectors
- H04N19/517—Processing of motion vectors by encoding
- H04N19/52—Processing of motion vectors by encoding by predictive encoding
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- Methods and apparatuses consistent with the present invention relate generally to video compression and, more particularly, to efficiently predicting the motion vector of a current layer by using the motion vector of a lower layer in a video coder using a multi-layered structure.
- Data can be compressed by eliminating spatial redundancy such as a case where an identical color or object is repeated in an image, temporal redundancy such as a case where there is little change between neighboring frames or an identical audio sound is repeated, or psychovisual redundancy in which the fact that humans' visual and perceptual abilities are insensitive to high frequencies is taken into account.
- temporal redundancy is eliminated using temporal filtering based on motion compensation and spatial redundancy is eliminated using spatial transform.
- In order to transmit such multimedia data, transmission media are necessary, and performance differs according to the transmission medium.
- Currently used transmission media have various transmission speeds ranging from the speed of an ultra high-speed communication network, which can transmit data at a transmission rate of several tens of megabits per second, to the speed of a mobile communication network, which can transmit data at a transmission rate of 384 Kbits per second.
- a scalable video encoding method which can support transmission media having a variety of speeds or transmit multimedia at a transmission speed suitable for each transmission environment, is required.
- Such a scalable video coding method refers to a coding method that allows a video resolution, a frame rate, a Signal-to-Noise Ratio (SNR), etc. to be adjusted by truncating part of an already compressed bitstream in conformity with surrounding conditions, such as a transmission bit rate, a transmission error rate, a system source, etc.
- standardization is in progress in Moving Picture Experts Group-21 (MPEG-21) Part 10. In particular, extensive research into multi-layer based scalability has been carried out.
- scalability can be implemented in such a way that multiple layers, including a base layer, a first enhanced layer and a second enhanced layer, are provided, and respective layers are constructed to have different resolutions, such as a Quarter Common Intermediate Format (QCIF), a Common Intermediate Format (CIF) or 2CIF, or different frame rates.
- In such a multi-layered structure, it is necessary to obtain motion vectors (MVs) on a layer basis in order to eliminate temporal redundancy.
- In one case, MVs are individually searched for in connection with the respective layers.
- In the other case, an MV is searched for in connection with one layer and is then used for the other layers, either without change or through up/down-sampling.
- the first case is advantageous in that accurate MVs can be acquired, but is disadvantageous in that MVs generated for respective layers act as overhead, compared to the latter. Since the accuracy of the MVs significantly affects the reduction in the temporal redundancy of texture data, a method of searching for accurate MVs for respective layers, as in the first case, is generally used. Further, in the first case, it is very important to efficiently eliminate redundancy between MVs for respective layers.
- FIG. 1 is a diagram showing an example of a conventional scalable video codec using a multi-layer structure.
- a base layer is defined as a layer having a QCIF and a frame rate of 15 Hz
- a first enhanced layer is defined as a layer having a CIF and a frame rate of 30 Hz
- a second enhanced layer is defined as a layer having Standard Definition (SD) and a frame rate of 60 Hz.
- a bitstream may be truncated and transmitted to reach a bit rate of 0.5 Mbps based on a first enhanced layer having a CIF, a frame rate of 30 Hz and a bit rate of 0.7 Mbps. In this manner, spatial scalability, temporal scalability and SNR scalability can be implemented.
- FIG. 2 is a view illustrating a method of performing such motion prediction.
- the MV of a lower layer having the same temporal position is used as a predicted MV for the MV of a current layer without change.
- An encoder obtains the MVs of the respective layers with a predetermined accuracy, and performs an inter prediction process of eliminating temporal redundancy from the respective layers using the obtained MVs.
- the encoder transmits only the MV of the base layer, the MV difference D1 of the first enhanced layer and the MV difference D2 of the second enhanced layer to a pre-decoder (a video stream server).
- the pre-decoder may transmit only the MV of the base layer to the decoder; the MV of the base layer and the MV difference D1 of the first enhanced layer; or the MV of the base layer, the MV difference D1 of the first enhanced layer and the MV difference D2 of the second enhanced layer, in conformity with the network condition.
- the decoder can restore the MVs of the corresponding layers based on the received data. For example, when the decoder receives the MV of the base layer and the MV difference D1 of the first enhanced layer, the decoder can restore the MV of the first enhanced layer by adding the MV of the base layer and the MV difference D1, and can restore the texture data of the first enhanced layer using the restored MV.
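- The layered differential coding described above can be sketched as follows (a minimal illustration with hypothetical function names; the actual codec operates on per-block motion fields and may up-sample a lower-layer MV before subtraction):

```python
# Sketch of differential MV coding across layers (hypothetical names).
# The encoder sends the base-layer MV plus per-layer differences; the
# decoder restores each layer's MV by accumulating the differences.

def encode_mv_layers(mvs):
    """mvs: list of (x, y) MVs, one per layer, base layer first.
    Returns the base MV followed by the per-layer differences."""
    base = mvs[0]
    diffs = [(cur[0] - prev[0], cur[1] - prev[1])
             for prev, cur in zip(mvs, mvs[1:])]
    return [base] + diffs

def decode_mv_layers(coded):
    """Inverse of encode_mv_layers: accumulate differences onto the base MV."""
    mvs = [coded[0]]
    for dx, dy in coded[1:]:
        px, py = mvs[-1]
        mvs.append((px + dx, py + dy))
    return mvs
```

- For example, layer MVs (4, 2), (5, 1), (7, 3) are coded as the base MV (4, 2) plus the differences (1, -1) and (2, 2), and the decoder recovers the original MVs by accumulation. The pre-decoder can truncate the trailing differences to scale down the stream, exactly as described above.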
- FIG. 3 is a schematic view illustrating the three above-described prediction methods.
- FIG. 3 shows a case (1) where an intra prediction is made for a specific macroblock 4 of a current frame 1, a case (2) where an inter prediction is made using a frame 2 located at a temporal position different from that of the current frame 1, and a case (3) where an intra BL prediction is made using texture data for a region 6 of a base layer frame 3 that corresponds to the macroblock 4.
- macroblocks that are encoded by the three prediction methods are referred to as the intra macroblock, the inter macroblock and the intra BL macroblock, respectively.
Disclosure of Invention
- the scalable video coding standard uses a method of selecting the advantageous one of the three above-described prediction methods and encoding a corresponding macroblock. Therefore, even one frame may be composed of an inter macroblock, an intra macroblock and an intra BL macroblock.
- the macroblock of a lower layer corresponding to a specific inter macroblock of the current frame may not be an inter macroblock, so that it is impossible to obtain the MV of the lower layer that is used to predict the MV of the inter macroblock. If the inter macroblock is independently encoded because the MV of the corresponding lower layer does not exist, this may lead to reduced coding efficiency.
- the present invention provides a method and apparatus for generating the missing motion field of a lower layer frame corresponding to a current frame so as to predict the MV of the current frame.
- a method of compressing the MV of a first macroblock of a current layer frame when the region of a first lower layer corresponding to the first macroblock does not have an MV, the method including: interpolating the MV of a second macroblock to which the region belongs, based on the MV of at least one neighboring macroblock; acquiring the predicted MV of the first macroblock using the interpolated MV; and subtracting the acquired predicted MV from the MV of the first macroblock.
- an apparatus for compressing the MV of a first macroblock of a current layer frame when the region of a first lower layer corresponding to the first macroblock does not have an MV, the apparatus including: a means for interpolating the MV of a second macroblock to which the region belongs, based on the MV of at least one neighboring macroblock; a means for acquiring the predicted MV of the first macroblock using the interpolated MV; and a means for subtracting the acquired predicted MV from the MV of the first macroblock.
- a method of restoring the MV of a first macroblock of a current layer frame from a motion difference for the first macroblock when the region of a first lower layer corresponding to the first macroblock does not have an MV, the method including: interpolating the MV of a second macroblock to which the region belongs, based on the MV of at least one neighboring macroblock; acquiring the predicted MV of the first macroblock using the interpolated MV; and adding the motion difference for the first macroblock and the acquired predicted MV.
- an apparatus for restoring the MV of a first macroblock of a current layer frame from a motion difference for the first macroblock when the region of a first lower layer corresponding to the first macroblock does not have an MV, the apparatus including: a means for interpolating the MV of a second macroblock to which the region belongs, based on the MV of at least one neighboring macroblock; a means for acquiring the predicted MV of the first macroblock using the interpolated MV; and a means for adding the motion difference for the first macroblock and the acquired predicted MV.
- FIG. 1 is a view showing an example of a scalable video codec using a multi-layered structure;
- FIG. 2 is a view illustrating a method of efficiently representing an MV through motion prediction;
- FIG. 3 is a schematic view illustrating three types of conventional prediction methods;
- FIG. 4 is a schematic view illustrating the basic concept of the present invention;
- FIG. 5 is a schematic view illustrating a method of predicting an MV when the resolutions of layers are the same according to a first exemplary embodiment of the present invention;
- FIG. 6 is a schematic view illustrating a method of predicting an MV when the resolutions of layers are different according to the first exemplary embodiment;
- FIG. 7 is a view illustrating a method of interpolating motion fields according to a second exemplary embodiment of the present invention;
- FIG. 8 is a view illustrating a case where four side macroblocks around the macroblock of a first lower layer are taken as neighboring macroblocks according to the second exemplary embodiment of the present invention;
- FIG. 9 is a view illustrating a case where eight macroblocks surrounding the macroblock of a first lower layer are taken as neighboring macroblocks according to the second exemplary embodiment of the present invention;
- FIG. 10 is a view illustrating a method of allocating MVs to neighboring sub-blocks;
- FIG. 11 is a view illustrating a process of performing motion prediction for a current macroblock using an interpolated MV when the resolutions of layers are different;
- FIG. 12 is a block diagram showing the construction of a video encoder according to an exemplary embodiment of the present invention;
- FIG. 13 is a block diagram showing the construction of a video decoder according to an exemplary embodiment of the present invention;
- FIG. 14 is a configuration diagram illustrating the construction of a system environment in which the video encoder of FIG. 12 or the video decoder of FIG. 13 operates; and
- FIG. 15 is a flowchart illustrating a motion prediction method according to an exemplary embodiment of the present invention.
- FIG. 4 is a schematic view illustrating the basic concept of the present invention.
- the MV of the inter macroblock 11 of a current layer frame 10, on which inter prediction will be performed, is efficiently predicted using the MV of a lower layer.
- the block 21 of a first lower layer corresponding to the inter macroblock 11 may or may not correspond to an inter macroblock.
- the term 'block' refers to a macroblock or a region smaller than a macroblock. If the resolutions of the layers are the same, the size of the block 21 of the first lower layer may be the same as the size of the macroblock. In contrast, if the resolutions of the layers are different, the block 21 of the first lower layer may have a size smaller than that of the macroblock.
- the MV of the block 21 does not exist. Therefore, motion prediction for the inter macroblock 11 cannot be performed using a general method.
- the first exemplary embodiment is a method of predicting the current layer inter macroblock 11 using the MV of a second lower layer block 31 corresponding to the first lower layer block 21 if the MV of the first lower layer block 21 corresponding to the current inter macroblock 11 does not exist.
- the second lower layer block 31 may also not have an MV.
- the following second exemplary embodiment can be employed.
- the missing motion field of the macroblock that includes the first lower layer block 21 is interpolated using neighboring inter macroblocks 22, 23, etc. Furthermore, motion prediction for the current inter macroblock 11 can be performed using the interpolated motion field.
- the second exemplary embodiment may be applied only to a case where the first exemplary embodiment cannot be used, but can be independently used regardless of the first exemplary embodiment. That is, the second exemplary embodiment may be used regardless of whether the corresponding block 31 of the second lower layer has an MV, and may also be used even when the second lower layer itself does not exist.
- the term 'prediction' refers to a process of reducing the amount of data by generating predicted data for specific data using information that can be used in both a video encoder and a video decoder, and obtaining the difference between the specific and the predicted data.
- A process of predicting an original MV using a predicted MV generated by a predetermined method is referred to as 'motion prediction'.
- FIG. 5 is a schematic view illustrating a method of predicting an MV according to the first exemplary embodiment when the resolutions of layers are the same. Since a second lower layer and a current layer independently perform motion estimation, they may have different macroblock patterns and MVs.
- MVs for the macroblock 11 of the current layer are predicted from MVs for the corresponding macroblock 31 of the second lower layer. Since the macroblock 11 and the macroblock 31 do not have the same macroblock pattern, the question arises of which MV to use as the predicted MV.
- an MV at the location corresponding to an MV 11a is the MV 31a and the MV 31b.
- a result obtained by averaging, for example, the MV 31a and the MV 31b can be used as a predicted MV for the MV 11a.
- an MV at the location corresponding to an MV 11b is the MV 31e, and
- the MV 31e may be used as a predicted MV for the MV 11b.
- since the size of the region to which the MV 11b is allocated and the size of the region to which the MV 31e is allocated are different from each other, it can be considered that the region to which the MV 31e is allocated is divided into eight regions and the MV 31e is allocated to each of the eight regions.
- an MV at the location corresponding to the MV 11c is also the MV 31e.
- MVs at the locations corresponding to the MV 11d are the MV 31c, the MV 31d and the MV 31e.
- in this case, the predicted MV mv is, for example, the MV 31e.
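- The FIG. 5 scheme of averaging the lower-layer MVs that overlap a current-layer partition can be sketched as follows (the (row, col)-cell representation and the function name are assumptions for illustration, not part of the patent):

```python
# Sketch: predicted MV for a current-layer partition as the average of the
# lower-layer MVs of the 4x4 cells the partition covers. The lower-layer
# motion field is modeled (hypothetically) as a dict mapping
# (row, col) cells to (x, y) MVs.

def predict_mv(cells, lower_field):
    """Average the lower-layer MVs over the cells covered by the partition."""
    mvs = [lower_field[c] for c in cells]
    n = len(mvs)
    return (sum(x for x, _ in mvs) / n,
            sum(y for _, y in mvs) / n)
```

- For instance, a partition that overlaps the regions of the MV 31a and the MV 31b receives their average as its predicted MV, exactly as described for the MV 11a.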
- FlG. 6 is a schematic view illustrating a method of predicting an MV according to the first exemplary embodiment when the resolutions of layers are different from each other.
- the block 40 of a second lower layer corresponding to the macroblock 11 of a current layer is a part of the macroblock 31 of a predetermined second lower layer.
- an up-sampling process is necessary. Therefore, MVs allocated to the block 40 of the second lower layer are up-sampled by the resolution magnification (m) of the current layer to that of the second lower layer. MVs for the macroblock 11 of the current layer are then predicted using the up-sampled MVs.
- a partition pattern of the macroblock 11 of the current layer and a partition pattern of a region to which the up-sampled MV is allocated can be different from each other.
- a method of generating a corresponding predicted MV in this case is the same as that described with reference to FIG. 5.
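- Up-sampling a lower-layer motion field by the resolution magnification m, as described for FIG. 6, might be sketched as follows (the cell-based representation is an assumption for illustration):

```python
# Sketch: up-sample a lower-layer motion field by resolution ratio m.
# Each lower-layer cell maps to an m x m block of current-layer cells, and
# each MV component is scaled by m, since displacements grow with resolution.

def upsample_motion_field(field, m):
    """field: dict (row, col) -> (x, y). Returns the up-sampled field."""
    up = {}
    for (r, c), (x, y) in field.items():
        for dr in range(m):
            for dc in range(m):
                up[(r * m + dr, c * m + dc)] = (x * m, y * m)
    return up
```

- For example, with m = 2 (QCIF to CIF), a single lower-layer cell with MV (1, 2) expands to a 2x2 block of cells, each carrying the scaled MV (2, 4).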
- FIG. 7 is a view illustrating a method of interpolating a motion field according to a second exemplary embodiment of the present invention.
- since the macroblock 21 of a first lower layer corresponding to the macroblock of a current layer is an intra macroblock (or an intra BL macroblock), it does not have a motion field.
- the missing motion field of the macroblock 21 can be interpolated using MVs allocated to neighboring inter macroblocks 22, 23 and 24.
- the MV or the motion field of the macroblock 21 is interpolated using the MVs of sub-blocks (e.g., 4x4 blocks) neighboring the macroblock 21 within the neighboring inter macroblocks 22, 23 and 24, as in FIG. 7.
- the following Equation 2 indicates an example of this interpolation method.
- mv is the interpolated MV.
- the interpolated MV mv can be acquired, as in the following Equation 3.
- FIG. 8 shows a case where four side macroblocks 22, 23, 26 and 28 are taken as neighboring macroblocks and
- FIG. 9 shows a case where eight macroblocks 22 to 29 surrounding the macroblock 21 of a first lower layer are taken as neighboring macroblocks.
- In FIG. 8, the left macroblock 23 of the four neighboring macroblocks is an intra macroblock (or an intra BL macroblock) and the remaining three macroblocks 22, 26 and 28 are inter macroblocks.
- In this case, the MV of the intra macroblock 21 can be interpolated by averaging the MVs that are respectively allocated to the twelve sub-blocks.
- In FIG. 9, the left, lower left and lower right macroblocks 23, 27 and 29 of the eight neighboring macroblocks are intra macroblocks (or intra BL macroblocks) and the remaining five macroblocks 22, 24, 25, 26 and 28 are inter macroblocks.
- In this case, the MV of the intra macroblock 21 can be interpolated by averaging the MVs that are respectively allocated to the adjacent sub-blocks of the five inter macroblocks.
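- The averaging in FIGS. 8 and 9 can be sketched as follows (a simplification of the patent's Equations 2 and 3, under the assumption that every adjacent inter sub-block contributes equally and intra neighbors contribute nothing):

```python
# Sketch: interpolate the missing MV of an intra macroblock as the
# component-wise average of the MVs of adjacent 4x4 sub-blocks belonging to
# neighboring inter macroblocks.

def interpolate_missing_mv(neighbor_mvs):
    """neighbor_mvs: list of (x, y) MVs from adjacent inter sub-blocks."""
    if not neighbor_mvs:
        return (0, 0)  # no inter neighbor at all: fall back to a zero MV
    n = len(neighbor_mvs)
    return (sum(x for x, _ in neighbor_mvs) / n,
            sum(y for _, y in neighbor_mvs) / n)
```

- The zero-MV fallback for the case of no inter neighbors is an assumption of this sketch; the patent leaves that case to the encoder's mode decision.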
- the MV mv that is calculated as described above represents the entire macroblock 21 of the first lower layer.
- a specific inter macroblock 50 has a predetermined partition pattern, and an MV is allocated to each partition.
- the partitions include partitions 52, 53, 54 and 55 having a 4x4 sub-block size, and partitions 51, 56 and 57 having a size larger than the 4x4 sub-block size. If MVs are allocated on a 4x4 sub-block basis, the result is the right-hand drawing of FIG. 10. In this process, each partition larger than the 4x4 sub-block size is divided into several sub-blocks, and the motion vector of the partition is allocated to each of its sub-blocks in the same manner.
- mv_l1 is the same as the MV 55, and
- mv_l2 and mv_l3 are the same as the MV 57. It is thus possible to determine the MVs of all neighboring sub-blocks through the allocation of MVs on a sub-block basis.
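- The sub-block allocation of FIG. 10 can be sketched as follows (the partition representation in sub-block units is an assumption for illustration):

```python
# Sketch: expand a macroblock's partition MVs onto a 4x4-sub-block grid so
# that every sub-block carries the MV of the partition containing it.
# A partition is represented as ((row, col, height, width), mv), with all
# geometry given in sub-block units (a 16x16 macroblock is a 4x4 grid).

def allocate_to_subblocks(partitions):
    """partitions: list of ((row, col, h, w), (x, y)) pairs covering the grid.
    Returns a dict (row, col) -> (x, y) over all sub-blocks."""
    grid = {}
    for (r0, c0, h, w), mv in partitions:
        for r in range(r0, r0 + h):
            for c in range(c0, c0 + w):
                grid[(r, c)] = mv
    return grid
```

- After this allocation, every sub-block adjacent to the intra macroblock has a well-defined MV, which is what the interpolation of FIGS. 7 to 9 consumes.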
- FIG. 11 is a view illustrating a process of performing motion prediction for a current macroblock using an interpolated MV (mv) when the resolutions of layers are different.
- the interpolated MV (mv) is up-sampled by the ratio of the resolution of the current layer to the resolution of the first lower layer, and is then used as the predicted MV of the current macroblock 11. Since the region 29 of the first lower layer macroblock 21 corresponding to the current macroblock 11 is a part of the first lower layer macroblock 21, the MV of the region 29 is the same as the MV (mv) of the first lower layer macroblock 21.
- FIG. 12 is a block diagram showing the construction of a video encoder 100 according to an exemplary embodiment of the present invention.
- a down-sampler 110 down-samples input video to a resolution and frame rate appropriate for each layer.
- the down-sampler 110 may perform down-sampling with respect only to the resolutions of layers, or with respect only to the frame rate. Alternatively, the down-sampler 110 may also perform down-sampling with respect to both the resolution and the frame rate.
- the down-sampling associated with the resolution may be performed using an MPEG down-sampler or a wavelet down-sampler.
- the down sampling associated with the frame rate may be performed using a method such as frame skip or frame interpolation.
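- As a minimal illustration of the frame-skip option just mentioned, frame-rate down-sampling simply drops frames at a regular stride:

```python
# Sketch: frame skip keeps every `factor`-th frame,
# e.g. factor = 2 halves a 60 Hz sequence to 30 Hz.

def frame_skip(frames, factor):
    return frames[::factor]
```

- Frame interpolation, the other option named above, would instead synthesize intermediate frames; it is not shown here.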
- In this way, a current layer frame, a first lower layer frame and a second lower layer frame can be produced. It is assumed that the three frames exist at temporally corresponding locations.
- a motion estimation unit 120 acquires the MV of the current layer frame by performing motion estimation on the current layer frame using another frame of the current layer as a reference frame. Such motion estimation is a process of finding the block in the reference frame that is most similar to a block of the current frame, that is, the block having the lowest error.
- a variety of methods such as a fixed-size block matching method or a Hierarchical Variable Size Block Matching (HVSBM), can be used for the motion estimation.
- a motion estimation unit 121 acquires the MV (MV1) of the first lower layer frame, and
- a motion estimation unit 122 acquires the MV (MV2) of the second lower layer frame.
- the MV (MV1) acquired by the motion estimation unit 121 is provided to a motion field interpolation unit 150 and an entropy encoder 160.
- the MV (MV2) acquired by the motion estimation unit 122 is provided to a second up-sampler 112 and the entropy encoder 160.
- the motion field interpolation unit 150 interpolates the MV of the macroblock of the first lower layer frame corresponding to a specific macroblock (hereinafter referred to as the 'current macroblock') of the current layer frame using the MVs of neighboring macroblocks. Since the interpolation method has been described with reference to FIGS. 7 to 11, a description thereof is omitted to avoid redundancy. The interpolated MV is provided to a first up-sampler 111.
- the first up-sampler 111 up-samples the interpolated MV by the ratio of the resolution of the current layer to the resolution of the first lower layer. If the resolutions of the first lower layer and the current layer are the same, the up-sampling in the first up-sampler 111 can be omitted.
- the up-sampled interpolated MV is provided to the motion prediction unit 140.
- the second up-sampler 112 up-samples the MV (MV2), which is received from the motion estimation unit 122, by the ratio of the resolution of the current layer to the resolution of the second lower layer, and provides the result to the motion prediction unit 140.
- the motion prediction unit 140 employs the motion prediction method (the first exemplary embodiment or the second exemplary embodiment) according to the present invention when the region of the first lower layer corresponding to the current macroblock does not have an MV. In this case, the motion prediction unit 140 determines whether the region of the second lower layer corresponding to the region of the first lower layer has an MV. If, as a result of the determination, the region of the second lower layer is determined to have an MV, the motion prediction unit 140 employs the first exemplary embodiment. Otherwise the motion prediction unit 140 employs the second exemplary embodiment. Of course, the motion prediction unit 140 can directly employ the second exemplary embodiment without performing such determination.
- When the first exemplary embodiment is employed, the motion prediction unit 140 subtracts the MV of the region corresponding to the current macroblock, among the MVs up-sampled by the second up-sampler 112, from the MV of the current macroblock, among the MVs of the current frame.
- When the second exemplary embodiment is employed, the motion prediction unit 140 subtracts the up-sampled interpolated MV provided by the first up-sampler 111 from the MV of the current macroblock.
- the motion difference ΔMV, which is generated as a result of the subtraction in the motion prediction unit 140, is provided to the entropy encoder 160.
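- The selection logic of the motion prediction unit 140 described above might be sketched as follows (function and argument names are hypothetical; `None` stands for a region that was intra-coded and thus has no MV):

```python
# Sketch of the motion prediction unit's fallback chain: use the first lower
# layer's MV if it exists; otherwise the second lower layer's MV (first
# exemplary embodiment); otherwise the interpolated MV (second embodiment).
# All MVs are assumed already up-sampled to the current layer's resolution.

def predicted_mv(first_lower_mv, second_lower_mv, interpolated_mv):
    if first_lower_mv is not None:
        return first_lower_mv
    if second_lower_mv is not None:
        return second_lower_mv          # first exemplary embodiment
    return interpolated_mv              # second exemplary embodiment

def mv_difference(current_mv, pred):
    """The motion difference that is entropy-coded instead of the raw MV."""
    return (current_mv[0] - pred[0], current_mv[1] - pred[1])
```

- As the text notes, an encoder may also skip the second-lower-layer check and apply the second exemplary embodiment directly.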
- a prediction unit 131 constructs the predicted frame of the current frame using the MV of the current frame obtained by the motion estimation unit 120 and the reference frame used by the motion estimation unit 120, and subtracts the constructed predicted frame from the current frame. As a result, a residual frame R is produced.
- a transform unit 132 performs spatial transform on the residual frame R and generates a transform coefficient C.
- This spatial transform method includes Discrete Cosine Transform (DCT), wavelet transform, etc.
- a quantization unit 133 quantizes the transform coefficient C.
- 'quantization' refers to a process of representing a transform coefficient, which is a real number, as discrete values by dividing its range into predetermined sections, and matching the resulting values to indices based on a predetermined quantization table.
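- A minimal sketch of the uniform scalar quantization described above (real codecs use quantization tables and dead zones; this only illustrates the index mapping):

```python
# Sketch: map a real-valued transform coefficient to an integer index
# (quantization) and back to an approximate value (inverse quantization).
# The restored value differs from the original by at most half a step.

def quantize(coeff, step):
    return round(coeff / step)

def dequantize(index, step):
    return index * step
```

- For example, with a step of 2, a coefficient of 7.9 maps to index 4, and inverse quantization restores the approximation 8.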
- the entropy encoder 160 losslessly encodes the result T quantized by the quantization unit 133, the motion difference ΔMV, the MV (MV1) of the first lower layer and the MV (MV2) of the second lower layer, and produces a bitstream.
- When only the second exemplary embodiment is used, the MV (MV2) of the second lower layer can be omitted.
- FIG. 13 is a block diagram showing the construction of a video decoder 200 according to an exemplary embodiment of the present invention.
- An entropy decoder 210 performs lossless decoding, and extracts the texture data T of a current layer frame, the motion difference ΔMV for the current layer, the MV (MV1) of a first lower layer and the MV (MV2) of a second lower layer from an input bitstream.
- a motion field interpolation unit 240 interpolates the MV of the macroblock of the first lower layer corresponding to the current macroblock of the current layer frame, based on the MVs (included in MV1) of neighboring macroblocks. Since this interpolation method has been described with reference to FIGS. 7 to 11, a description thereof is omitted to avoid redundancy.
- the interpolated MV is provided to a first up-sampler 211.
- the first up-sampler 211 up-samples the interpolated MV by the ratio of the resolution of the current layer to the resolution of the first lower layer.
- the up-sampled MV is provided to a motion restoration unit 230.
- a second up-sampler 212 up-samples the MV (MV2) of the second lower layer by the ratio of the resolution of the current layer to the resolution of the second lower layer, and
- the result is provided to the motion restoration unit 230.
- a motion restoration unit 230 uses the motion prediction method (the first exemplary embodiment or the second exemplary embodiment) according to the present invention when the region of the first lower layer corresponding to the current macroblock does not have an MV. In this case, the motion restoration unit 230 determines whether the region of the second lower layer corresponding to the region of the first lower layer has an MV. If, as a result of the determination, the region of the second lower layer is determined to have an MV, the motion restoration unit 230 employs the first exemplary embodiment. If the region of the second lower layer does not have an MV, the motion restoration unit 230 employs the second exemplary embodiment. Of course, the motion restoration unit 230 can directly employ the second exemplary embodiment without the determination.
- The motion restoration unit 230 adds the motion difference ΔMV for the current macroblock to the MV of the region corresponding to the current macroblock, among the MVs up-sampled by the second up-sampler 212.
- The MV for the current macroblock is thus restored and is provided to an inverse prediction unit 223.
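The restoration is the component-wise inverse of the encoder's subtraction at prediction time. A minimal sketch (function names are hypothetical):

```python
def motion_difference(mv, predicted_mv):
    """Encoder side: ΔMV = MV - predicted MV, component-wise."""
    return (mv[0] - predicted_mv[0], mv[1] - predicted_mv[1])

def restore_mv(delta_mv, predicted_mv):
    """Decoder side: MV = ΔMV + predicted MV, inverting the
    encoder's subtraction exactly."""
    return (delta_mv[0] + predicted_mv[0], delta_mv[1] + predicted_mv[1])
```

Because both sides derive the same predicted MV from the same lower-layer data, the round trip recovers the original MV losslessly.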
- an inverse quantization unit 221 inversely quantizes texture data T output from the entropy decoder 210.
- Inverse quantization is the process of restoring, from the indices generated in the quantization process, the values that match them, using the same quantization table that was used in the quantization process.
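A toy sketch of this index-to-value mapping, using a single uniform step size as a stand-in for the quantization table (names and step size are illustrative):

```python
def quantize(values, step):
    """Forward quantization: map each value to its nearest index."""
    return [round(v / step) for v in values]

def inverse_quantize(indices, step):
    """Inverse quantization: map each index back to a reconstructed
    value using the same step size the encoder used."""
    return [i * step for i in indices]
```

The reconstruction is only approximate — quantization discards precision — which is why this is the lossy stage of the codec.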
- An inverse transform unit 222 performs an inverse spatial transform process on the inversely quantized result. This inverse spatial transform is performed in a way corresponding to the transform unit 132 of the video encoder 100. In particular, an inverse DCT, an inverse wavelet transform, or the like may be used.
- the inverse prediction unit 223 inversely performs the process, which is performed in the temporal transform unit 131, on the inversely transformed result and, thus, restores a video frame. That is, the inverse prediction unit 223 restores the video frame by producing a predicted frame using an MV restored in the motion restoration unit 230, and adding the inversely transformed result and the generated predicted frame.
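The frame restoration step can be sketched as a sample-wise addition with clipping to the 8-bit range (a simplification; the actual unit works per block and per prediction mode):

```python
def inverse_predict(residual, predicted):
    """Restore frame samples by adding the inversely transformed
    residual to the motion-compensated predicted frame, clipping
    each sample to the 8-bit range [0, 255]."""
    return [[min(255, max(0, r + p)) for r, p in zip(r_row, p_row)]
            for r_row, p_row in zip(residual, predicted)]
```

The residual may be negative, so the addition is done before clipping rather than in unsigned arithmetic.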
- FIG. 14 is a configuration diagram illustrating the construction of a system environment in which the video encoder 100 of FIG. 12 or the video decoder 200 of FIG. 13 operates, according to an exemplary embodiment of the present invention.
- the system may be a television (TV), a set-top box, a desktop computer, a laptop computer, a palmtop computer, a Personal Digital Assistant (PDA), or a video or image storage device (e.g., a Video Cassette Recorder (VCR), a Digital Video Recorder (DVR), etc.).
- the system may be a combination of the above-described devices, or one of the above-described devices that is included in another.
- the system may include at least one video source 910, at least one Input/Output (I/O) device 920, a processor 940, a memory 950 and a display apparatus 930.
- I/O Input/Output
- the video source 910 may be a TV receiver, a VCR or some other video storage device. Furthermore, the video source 910 may be at least one network connection for receiving video from a server via the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), a terrestrial broadcasting system, a cable network, a satellite communication network, a wireless network, a telephone network or the like. In addition, the video source can be a combination of the above-described networks, or one of the above-described networks that is included in another.
- WAN Wide Area Network
- LAN Local Area Network
- the I/O device 920, the processor 940 and the memory 950 communicate with each other via a communication medium 960.
- the communication medium 960 may be a communication bus, a communication network, or at least one internal connection circuit.
- Input video data received from the video source 910 may be processed by the processor 940 in accordance with at least one software program stored in the memory 950, which is executed by the processor 940 so as to generate output video that is provided to the display apparatus 930.
- the software program stored in the memory 950 may include a multi- layered video codec that performs the method according to the present invention.
- the codec may be stored in the memory 950, may be read from a storage medium such as a CD-ROM or a floppy disk, or may be downloaded from a predetermined server via one of various networks.
- the codec may be implemented in software, in a hardware circuit, or in a combination of software and a hardware circuit.
- FIG. 15 is a flowchart illustrating a motion prediction method according to an exemplary embodiment of the present invention.
- the motion prediction unit 140 determines whether the region of a first lower layer corresponding to the first macroblock of a current layer frame has an MV at operation S10. If, as a result of the determination, the region of the first lower layer is determined to have the MV (YES at operation S10), the first up-sampler 112 up-samples the MV of the region of the first lower layer and provides the up-sampled MV to the motion prediction unit 140 at operation S70. The motion prediction unit 140 predicts the MV of the first macroblock using the up-sampled MV as a predicted MV at operation S80. Since operation S70 is the same as that of the prior art, a detailed description thereof has been omitted in the description of FIG. 15.
- the motion prediction unit 140 determines whether the region of a second lower layer corresponding to the first macroblock has an MV at operation S20. If, as a result of the determination, the region of the second lower layer is determined to have an MV (YES at operation S20), the second up-sampler 111 up-samples the MV of the region of the second lower layer by the ratio of the resolution of the current layer to the resolution of the second lower layer at operation S60. In this case, the up-sampling may be omitted when the resolutions of layers are the same.
- the motion prediction unit 140 predicts the MV of the first macroblock using the up-sampled MV as a predicted MV at operation S80.
- the motion field interpolation unit 150 interpolates the MV of the second macroblock, which corresponds to the current macroblock, based on neighboring macroblocks at operation S30.
- the second macroblock is an intra macroblock or an intra BL macroblock.
- the interpolation may be performed by averaging the MVs of neighboring sub-blocks within the inter macroblock of the neighboring macroblocks (refer to Equation 2). More particularly, the sub-blocks may include four 4x4 sub-blocks (mv_l0, mv_l1, mv_l2 and mv_l3 in FIG. 7) that are within a macroblock on the left side of the first macroblock and neighbor the first macroblock, and four 4x4 sub-blocks (mv_a0, mv_a1, mv_a2 and mv_a3 in FIG. 7) that are within a macroblock above the first macroblock and neighbor the first macroblock.
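The averaging of Equation 2 can be sketched as a plain mean over the bordering sub-block MVs (the function name and the example MVs are illustrative, not taken from the patent):

```python
def interpolate_mv(neighbor_mvs):
    """Interpolate the missing MV of an intra (or intra-BL) macroblock
    by averaging, component-wise, the MVs of the neighboring 4x4
    sub-blocks that border it."""
    n = len(neighbor_mvs)
    x = sum(mv[0] for mv in neighbor_mvs) / n
    y = sum(mv[1] for mv in neighbor_mvs) / n
    return (x, y)

# e.g. four sub-block MVs from the left macroblock and four from above
mvs = [(2, 0), (2, 2), (0, 2), (0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
```

Only sub-blocks belonging to inter macroblocks contribute; intra neighbors have no MV and would simply be left out of the list.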
- the up-sampler 111 up-samples the interpolated MV by the ratio of the resolution of the current layer to the resolution of the first lower layer at operation S40.
- the up- sampling may be omitted if the resolutions of layers are the same.
- the motion prediction unit 140 predicts the MV of the first macroblock using the up-sampled MV as a predicted MV at operation S80.
- Operation S80 includes acquiring a predicted MV using the interpolated MV and subtracting the acquired predicted MV from the MV of the first macroblock.
- the entropy encoder 160 losslessly encodes the motion difference ΔMV, which is acquired through the prediction at operation S80.
- Operations S30 and S40 may be performed if the result of the determination at operation S20 is NO, or may be performed regardless of the determination at operation S20, as described above.
- the present invention can improve video compression performance by efficiently predicting multi-layered MVs.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Claims
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP06715805A EP1847129A1 (en) | 2005-02-07 | 2006-02-01 | Method and apparatus for compressing multi-layered motion vector |
BRPI0606786-7A BRPI0606786A2 (en) | 2005-02-07 | 2006-02-01 | method of restoring a motion vector, and apparatus for restoring a motion vector |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US65017305P | 2005-02-07 | 2005-02-07 | |
US60/650,173 | 2005-02-07 | ||
KR1020050028683A KR100704626B1 (en) | 2005-02-07 | 2005-04-06 | Method and apparatus for compressing multi-layered motion vectors |
KR10-2005-0028683 | 2005-04-06 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006083107A1 true WO2006083107A1 (en) | 2006-08-10 |
Family
ID=36777453
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2006/000352 WO2006083107A1 (en) | 2005-02-07 | 2006-02-01 | Method and apparatus for compressing multi-layered motion vector |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP1847129A1 (en) |
WO (1) | WO2006083107A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2119241A4 (en) * | 2007-01-29 | 2015-12-30 | Samsung Electronics Co Ltd | Method and apparatus for encoding video and method and apparatus for decoding video |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0730899A (en) * | 1993-07-12 | 1995-01-31 | Kyocera Corp | Hierarchical motion vector detection system |
US6128342A (en) * | 1995-03-10 | 2000-10-03 | Kabushiki Kaisha Toshiba | Video coding apparatus which outputs a code string having a plurality of components which are arranged in a descending order of importance |
US6621865B1 (en) * | 2000-09-18 | 2003-09-16 | Powerlayer Microsystems, Inc. | Method and system for encoding and decoding moving and still pictures |
2006
- 2006-02-01 EP EP06715805A patent/EP1847129A1/en not_active Withdrawn
- 2006-02-01 WO PCT/KR2006/000352 patent/WO2006083107A1/en active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0730899A (en) * | 1993-07-12 | 1995-01-31 | Kyocera Corp | Hierarchical motion vector detection system |
US6128342A (en) * | 1995-03-10 | 2000-10-03 | Kabushiki Kaisha Toshiba | Video coding apparatus which outputs a code string having a plurality of components which are arranged in a descending order of importance |
US6148028A (en) * | 1995-03-10 | 2000-11-14 | Kabushiki Kaisha Toshiba | Video coding apparatus and method which codes information indicating whether an intraframe or interframe predictive coding mode is used |
US6621865B1 (en) * | 2000-09-18 | 2003-09-16 | Powerlayer Microsystems, Inc. | Method and system for encoding and decoding moving and still pictures |
Also Published As
Publication number | Publication date |
---|---|
EP1847129A1 (en) | 2007-10-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20060176957A1 (en) | Method and apparatus for compressing multi-layered motion vector | |
KR100664929B1 (en) | Method and apparatus for effectively compressing motion vectors in video coder based on multi-layer | |
KR100714696B1 (en) | Method and apparatus for coding video using weighted prediction based on multi-layer | |
US8559520B2 (en) | Method and apparatus for effectively compressing motion vectors in multi-layer structure | |
RU2341035C1 (en) | Video signal coding and decoding procedure based on weighted prediction and related device for implementation thereof | |
JP4891234B2 (en) | Scalable video coding using grid motion estimation / compensation | |
KR100763182B1 (en) | Method and apparatus for coding video using weighted prediction based on multi-layer | |
US20060120448A1 (en) | Method and apparatus for encoding/decoding multi-layer video using DCT upsampling | |
KR100703745B1 (en) | Video coding method and apparatus for predicting effectively unsynchronized frame | |
KR100703746B1 (en) | Video coding method and apparatus for predicting effectively unsynchronized frame | |
KR20060135992A (en) | Method and apparatus for coding video using weighted prediction based on multi-layer | |
KR20060128596A (en) | Video encoding method, video decoding method, video encoder, and video decoder, which use smoothing prediction | |
WO2006078115A1 (en) | Video coding method and apparatus for efficiently predicting unsynchronized frame | |
EP1659797A2 (en) | Method and apparatus for compressing motion vectors in video coder based on multi-layer | |
US20060250520A1 (en) | Video coding method and apparatus for reducing mismatch between encoder and decoder | |
EP1730967B1 (en) | Method and apparatus for effectively compressing motion vectors in multi-layer structure | |
KR100703751B1 (en) | Method and apparatus for encoding and decoding referencing virtual area image | |
EP1847129A1 (en) | Method and apparatus for compressing multi-layered motion vector | |
WO2006078125A1 (en) | Video coding method and apparatus for efficiently predicting unsynchronized frame | |
WO2006104357A1 (en) | Method for compressing/decompressing motion vectors of unsynchronized picture and apparatus using the same | |
WO2006098586A1 (en) | Video encoding/decoding method and apparatus using motion prediction between temporal levels | |
WO2006109989A1 (en) | Video coding method and apparatus for reducing mismatch between encoder and decoder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
WWE | Wipo information: entry into national phase |
Ref document number: 200680011069.8 Country of ref document: CN |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2006715805 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
WWE | Wipo information: entry into national phase |
Ref document number: 1255/MUMNP/2007 Country of ref document: IN |
|
WWP | Wipo information: published in national office |
Ref document number: 2006715805 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: PI0606786 Country of ref document: BR Kind code of ref document: A2 |