EP1790166A2 - Verfahren und vorrichtung zur bewegungsschätzung - Google Patents
Verfahren und Vorrichtung zur Bewegungsschätzung (Method and apparatus for motion estimation)
- Publication number
- EP1790166A2 (application EP05780826A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- motion vector
- video stream
- frame
- base layer
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/53—Multi-resolution motion estimation; Hierarchical motion estimation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/30—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability
- H04N19/33—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
- H04N19/56—Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/60—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding
- H04N19/61—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding in combination with predictive coding
Definitions
- This invention relates to a method and apparatus for compressing a video stream, and particularly to a method and apparatus for compressing a video stream using a spatially layered compression scheme.
- Each frame of a digital video is a still picture (also called an image) made up of a group of picture elements (also called pixels).
- The number of pixels depends upon the display resolution of a particular system.
- Compression standards such as MPEG-2, MPEG-4 and H.263 have been developed to reduce the quantity of data that must be transmitted.
- For layered encoding, the bit stream is divided into two or more bit streams, or layers. During decoding, the layers may be combined as desired to form a high resolution signal.
- The base layer may provide a low resolution video stream,
- while the enhancement layer may provide additional information to enhance the base layer image.
- Motion prediction is used to obtain a predicted image by exploiting the correlation between preceding and succeeding frames.
- The input video stream is processed to form I, P and B frames.
- An I frame is encoded using only its own information;
- a P frame is predictively encoded from the nearest preceding I or P frame;
- a B frame is predictively encoded from its own information or from the frames before and after it.
- Fig. 1 is a block diagram of a video encoder 100 supporting the spatially layered compression of MPEG-2/MPEG-4.
- The video encoder 100 comprises a base encoder 112 and an enhancement encoder 114.
- The base encoder comprises a downsampler 120, a motion estimation (ME) means 122, a motion compensator (MC) 124, an orthogonal transform (for example, discrete cosine transform (DCT)) circuit 130, a quantizer (Q) 132, a variable length encoder (VLC) 134, a bitrate control circuit 135, an inverse quantizer (IQ) 138, an inverse transform circuit (IDCT) 140, switches 128 and 144, as well as an upsampler 150.
- The enhancement encoder 114 comprises a motion estimation means 154, a motion compensator 155, an orthogonal transform (for example, DCT) circuit 158, a quantizer 160, a variable length encoder 162, a bitrate control circuit 164, an inverse quantizer 166, an inverse transform circuit (IDCT) 168 and switches 170 and 172. All functions of the means mentioned above are well known in the art and will not be described in detail herein.
- Motion estimation is one of the most time-consuming parts of a video compression system: the larger the amount of computation required for motion estimation, the lower the encoding efficiency of the video compression system.
- In this scheme, motion estimation is performed for the base layer and the enhancement layer separately, and no association exists between them.
- Because the motion estimations for the base layer and the enhancement layer are made separately even though prediction is made for the same frame of image, a relatively large portion of the searching process is repeated, which increases the amount of motion estimation computation and lowers the encoding efficiency of the compression scheme. There is therefore a need for a spatially layered video compression scheme with better encoding efficiency.
- The present invention is directed to a more efficient spatially layered compression method that overcomes the disadvantages of the spatially layered compression scheme described above. By introducing a reference motion vector, the present invention associates the motion estimation of the base layer with that of the enhancement layer, so that the searching that would otherwise be repeated is performed only once and only a small amount of further searching remains; on this basis, the computational complexity of motion estimation is reduced and the efficiency of compressed encoding is improved.
- An embodiment in accordance with the invention discloses a method and apparatus for spatially layered compression of a video stream.
- First, the original video stream is processed to obtain a reference motion vector for each frame of image of the video stream; the reference motion vector and the video stream are then down-sampled. Next, a motion vector of the corresponding frame of image of the down-sampled video stream is acquired according to the down-sampled reference motion vector, and the down-sampled video stream is processed using that motion vector to generate a base layer. Finally, during generation of the enhancement layer, a motion vector of the corresponding frame of image of the video stream is acquired according to the reference motion vector, and the video stream is processed using that motion vector and the base layer to generate an enhancement layer.
- An alternative embodiment in accordance with the invention discloses another method and apparatus for spatially layered compression of a video stream. First, the video stream is down-sampled and a reference motion vector is acquired for each frame of image of the down-sampled video stream. Secondly, a motion vector of the corresponding frame of image of the down-sampled video stream is acquired according to said reference motion vector, and the down-sampled video stream is processed using that motion vector to generate a base layer. Finally, during generation of the enhancement layer, said reference motion vector is up-sampled, a motion vector of the corresponding frame of image of the video stream is acquired according to the up-sampled reference motion vector, and the video stream is processed using that motion vector and the base layer to generate an enhancement layer.
- Another embodiment in accordance with the invention discloses a further method and apparatus for spatially layered compression of a video stream.
- The video stream is first processed to generate a base layer; the motion vector of each frame of image of the base layer is then up-sampled to obtain a reference motion vector of the corresponding frame of image; finally, a motion vector of the corresponding frame of image of the video stream is acquired according to the reference motion vector, and the video stream is processed using that motion vector and the base layer to generate an enhancement layer.
- Fig. 1 is a block diagram of a spatially layered compression video encoder in accordance with the prior art;
- Fig. 2 is a schematic diagram of an encoding system using a reference motion vector in accordance with an embodiment of the invention;
- Fig. 3 is a flowchart of encoding using the reference motion vector in accordance with one embodiment of the invention;
- Fig. 4 is a schematic diagram of an encoding system using a reference motion vector in accordance with another embodiment of the invention;
- Fig. 5 is a schematic diagram of an encoding system using a reference motion vector in accordance with a further embodiment of the invention.
- Fig. 2 is a schematic diagram of an encoding system using a reference motion vector in accordance with one embodiment of the invention.
- The encoding system 200 is used for layered compression: the base layer portion provides the low resolution base information of the video stream and the enhancement layer is used to transfer edge enhancement information; both kinds of information may be recombined at the receiving terminal to form the high-resolution picture information.
- the encoding system 200 comprises an acquiring means 216, a base layer acquiring means 212 and an enhancement layer acquiring means 214.
- the acquiring means 216 is used for processing the original video stream, thereby to obtain the reference motion vector for each frame of image of the video stream.
- Acquiring means 216 comprises a motion estimation means 276 and a frame memory 282.
- the frame memory 282 is used to store the original video sequence.
- The motion estimation means 276 is used to acquire the reference frames (for example, I or P frames) from the frame memory 282 and to perform motion estimation on the current frame (for example, a P frame) with respect to those reference frames, thereby computing the reference motion vector of the current frame.
- the base layer acquiring means 212 processes the video stream using the reference motion vector, thereby to generate a base layer.
- Means 212 comprises down-samplers 120, 286.
- the down-sampler 120 is used to down-sample the original video stream.
- Down-sampler 286 is used to down-sample the reference motion vector.
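- The patent does not specify the exact rule by which the down-sampler 286 scales the reference motion vector; a simple proportional scaling with the resolution ratio is a natural reading. The following Python sketch (illustrative only; the function name, resolutions and example vector are assumptions, not details from the patent) shows such a scaling from a full-resolution vector to the base layer resolution.

```python
def downscale_motion_vector(mv, src_res, dst_res):
    """Scale a motion vector (mv_x, mv_y) from src_res to dst_res.

    src_res and dst_res are (width, height) tuples; the result is rounded
    to the nearest integer pixel displacement.
    """
    sx = dst_res[0] / src_res[0]
    sy = dst_res[1] / src_res[1]
    return (round(mv[0] * sx), round(mv[1] * sy))

# Example: a reference motion vector of (12, -8) pixels at 1920x1080
# maps to roughly (4, -4) at a 720x480 base layer resolution.
print(downscale_motion_vector((12, -8), (1920, 1080), (720, 480)))
```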
- the base layer acquiring means 212 further comprises a motion vector acquiring means 222.
- the motion vector acquiring means 222 is used to acquire the motion vector of the corresponding frame of image of the down-sampled video stream based on the down-sampled reference motion vector. The process by which the motion vector acquiring means 222 acquires the motion vector will be described as follows.
- The base layer acquiring means 212 further comprises a base layer generation means 213, which uses the motion vector to process the down-sampled video stream and thereby generate the base layer. Apart from the down-samplers 120 and 286 and the motion vector acquiring means 222, all the other means within the base layer acquiring means 212 are basically the same as in the base layer encoder of Fig. 1 and belong to the base layer generation means 213, including the motion compensator 124, DCT transform circuit 130, quantizer 132, variable length encoder 134, bitrate control circuit 135, inverse quantizer 138, inverse transform circuit 140, arithmetic units 125 and 148, switches 128 and 144 and up-sampler 150.
- The process by which the base layer generation means 213 generates the base layer from the motion vector output by the motion vector acquiring means 222 is substantially the same as in the prior art and is discussed in detail below.
- the same reference number designates the components having identical or similar features and functions.
- The only difference between the motion estimation means 122 and the motion vector acquiring means 222 is the way in which they acquire the motion vector.
- The motion estimation means 122 of Fig. 1 directly uses the reference frames in the frame memory (not shown) and searches within a larger searching window to acquire the motion vector of the corresponding frame of image of the video stream, while the motion vector acquiring means 222 of Fig. 2 only searches further within a smaller searching window, based on said reference motion vector, to acquire the motion vector of the corresponding frame of image of the video stream.
- the enhancement layer acquiring means 214 processes the video stream by using the reference motion vector and the base layer, thereby to generate an enhancement layer.
- the means 214 comprises a motion vector acquiring means 254 and an enhancement layer generation means 215.
- the motion vector acquiring means 254 is used to acquire the motion vector of the corresponding frame of image of the video stream based on the reference motion vector.
- the enhancement layer generation means 215 processes the video stream by using the motion vector and the base layer, thereby to generate the enhancement layer.
- the components are substantially the same as those in the enhancement layer encoder 114 of the Fig. 1 except for the motion vector acquiring means 254, and all of them belong to the enhancement layer generation means 215 which includes motion compensator 155, DCT circuit 158, quantizer 160, variable length encoder 162, bitrate control circuit 164, inverse quantizer 166, inverse DCT circuit 168, and switches 170, 172.
- These components are similar to the corresponding components of the base layer acquiring means 212 in function.
- The process by which the enhancement layer generation means 215 generates the enhancement layer from the motion vector output by the motion vector acquiring means 254 is essentially the same as in the prior art, and a detailed description is given below.
- the same reference number designates the components having identical or similar features and functions.
- The only difference between the motion estimation means 154 and the motion vector acquiring means 254 is the way in which they acquire the motion vector.
- The motion estimation means 154 of Fig. 1 directly uses the reference frames in the frame memory (not shown) and searches within a larger searching window to acquire the motion vector of the corresponding frame of image of the video stream, while the motion vector acquiring means 254 of Fig. 2 only searches further within a smaller searching window, based on said reference motion vector, to acquire the motion vector of the corresponding frame of image of the video stream.
- The process by which the base layer acquiring means 212 and the enhancement layer acquiring means 214 acquire their respective motion vectors using the reference motion vector output by the acquiring means 216, and thereby generate the base layer and the enhancement layer, will now be described in detail.
- An original video stream is inputted to the acquiring means 216 and then fed to motion estimation means 276 and frame memory 282, respectively.
- The video stream has been processed to form I, P and B frames, arranged in a sequence such as I, B, P, B, P, ..., B, P in accordance with the parameter setting.
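- As an illustration of such a parameter-driven frame-type sequence, the short sketch below uses the customary GOP parameters N (frames per group) and M (distance between anchor frames); the parameter values are examples only and are not taken from the patent.

```python
def gop_frame_types(n=9, m=2):
    # n: number of frames in the group of pictures; m: distance between
    # anchor frames (I or P). Frame 0 is the I frame, every m-th frame
    # after it is a P frame, and the frames in between are B frames.
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return types

print("".join(gop_frame_types()))  # -> IBPBPBPBP
```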
- the input video sequence is stored in the frame memory 282.
- The motion estimation means 276 is used to acquire the reference frames (for example, I frames) from the frame memory 282 and to perform motion estimation on the current frame (for example, a P frame) with respect to those reference frames, thereby computing the reference motion vector of each macro block of the current frame.
- A macro block is a 16x16-pixel sub-block of the frame currently being encoded; block matching between the current macro block and the reference frame is used to calculate the reference motion vector of the current macro block, and thereby obtain the reference motion vectors of the current frame.
- An I frame is an intra-frame encoded image;
- a P frame is an intra-frame encoded or forward predictively encoded image;
- a B frame is an intra-frame encoded, forward predictively encoded, backward predictively encoded or bi-directionally predictively encoded image.
- The motion estimation means 276 performs forward prediction for P frames and calculates their reference motion vectors; it also performs forward or bi-directional prediction for B frames and calculates their reference motion vectors. No motion prediction is needed for intra-frame encoding.
- The motion estimation means 276 reads out the previous reference frame from the frame memory 282 and searches within the searching window of the previous reference frame for the macro block that best matches the pixel block of the current frame.
- The quality of the match is judged by the mean absolute difference (MAD) or the mean squared error (MSE) between the pixels of the currently input block and the pixels of the corresponding block of the reference frame.
- The block of the reference frame having the minimum MAD or MSE is the optimum matching block, and the position of said optimum matching block relative to the position of the current block is the reference motion vector.
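- A minimal numpy sketch of this exhaustive block-matching step is given below. It uses the MAD criterion and a ±15-pixel window as described above; the block size, frame size, synthetic test data and function names are illustrative assumptions, not details taken from the patent.

```python
import numpy as np

def mad(block_a, block_b):
    # Mean absolute difference between two equally sized pixel blocks.
    return np.mean(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)))

def full_search(cur, ref, top, left, block=16, radius=15):
    # Exhaustive search in the reference frame for the macro block of the
    # current frame at (top, left); returns the (row, column) displacement
    # with minimum MAD, i.e. the reference motion vector of that block.
    cur_blk = cur[top:top + block, left:left + block]
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            cost = mad(cur_blk, ref[y:y + block, x:x + block])
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv

# Synthetic test: the current frame is the reference frame shifted by (5, -3).
gen = np.random.default_rng(1)
ref = gen.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(-5, 3), axis=(0, 1))
print(full_search(cur, ref, top=32, left=32))  # -> (5, -3)
```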
- In this way the motion estimation means 276 in the acquiring means 216 acquires the reference motion vector of a frame of image of the video stream. After being down-sampled by the down-sampler 286, the reference motion vector is fed to the motion estimation means 222 of the base layer acquiring means 212, so that the motion estimation means 222 can perform further motion estimation on the same frame of image at the base layer. The reference motion vector may also be fed to the motion estimation means 254 of the enhancement layer acquiring means 214, so that the motion estimation means 254 can perform further motion estimation on the same frame of image at the enhancement layer.
- The base layer acquiring means 212 and the enhancement layer acquiring means 214 also predictively encode the input video stream; however, this predictive encoding is slightly delayed in time, because the base layer and the enhancement layer must perform further motion estimation based on the reference motion vector.
- The original input video stream is divided by the separator and supplied to the base layer acquiring means 212 and the enhancement layer acquiring means 214, respectively.
- For the base layer, the input video stream is fed into the down-sampler 120.
- The down-sampler may be a low-pass filter used to reduce the resolution of the input video stream.
- The down-sampled video stream is fed into the motion vector acquiring means 222.
- The motion vector acquiring means 222 reads the previous reference frame of the video sequence stored in the frame memory and, based on the down-sampled reference motion vector of the current frame output by the down-sampler 286, searches a smaller searching window of the previous reference frame for the macro block that best matches the current block, thereby acquiring the motion vector of the corresponding frame of image of the down-sampled video stream.
- The motion compensator 124 may read out the image data of the previous reference frame stored in the frame memory (not shown), which has already been encoded and locally decoded, on the basis of the prediction mode, the reference motion vector and the motion vector; it shifts the previous frame of image in accordance with the reference motion vector and then shifts it once more in accordance with the motion vector, thereby predicting the current frame of image.
- Alternatively, the previous frame of image can be shifted only once, by the sum of the reference motion vector and the motion vector; in this case, the sum of the reference motion vector and the motion vector serves as the motion vector of said frame of image.
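- The following sketch illustrates this single-shift form of motion compensation: the prediction for a macro block is fetched from the previously decoded reference frame at an offset equal to the sum of the reference motion vector and the refinement motion vector. All names and numbers are illustrative assumptions.

```python
import numpy as np

def predict_block(ref, top, left, ref_mv, mv, block=16):
    # Fetch the prediction for the macro block at (top, left) with a single
    # shift by the sum of the reference motion vector and the motion vector.
    dy = ref_mv[0] + mv[0]
    dx = ref_mv[1] + mv[1]
    return ref[top + dy:top + dy + block, left + dx:left + dx + block]

gen = np.random.default_rng(2)
ref = gen.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(-5, 2), axis=(0, 1))   # synthetic current frame

# Wide search gave a reference motion vector of (4, -2); the small-window
# refinement added (1, 0); the prediction is fetched with one shift of (5, -2).
pred = predict_block(ref, top=16, left=16, ref_mv=(4, -2), mv=(1, 0))
residual = cur[16:32, 16:32].astype(np.int16) - pred.astype(np.int16)
print(int(np.abs(residual).sum()))  # 0: the prediction is exact for this pure shift
```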
- the motion compensator 124 provides the predicted image to arithmetic unit 125 and switch 144.
- The arithmetic unit 125 also receives the input video stream and calculates the difference between the image of the input video stream and the predicted image coming from the motion compensator 124. The difference is supplied to the DCT circuit 130. If the prediction mode received from the motion vector acquiring means 222 is intra-frame prediction, the motion compensator 124 does not output any predicted image; in that case the arithmetic unit 125 does not perform the above processing but passes the video stream directly to the DCT circuit 130.
- The DCT circuit 130 performs DCT processing on the signal output from the arithmetic unit 125 to acquire DCT coefficients, which are supplied to the quantizer 132.
- The quantizer 132 sets the quantization step size (quantization level) based on the amount of data stored in the buffer and quantizes the DCT coefficients supplied from the DCT circuit 130 using that quantization level.
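- The patent does not state the rule by which the bitrate control circuit 135 derives the quantization level from buffer occupancy; the sketch below only illustrates the general idea described here, with invented thresholds: the fuller the buffer, the coarser the quantization step, pulling the bitrate back toward its target.

```python
def pick_quant_step(buffer_fullness, min_step=2, max_step=62):
    # buffer_fullness: fraction of the output buffer occupied (0.0 to 1.0).
    # A fuller buffer yields a coarser (larger) quantization step.
    step = min_step + buffer_fullness * (max_step - min_step)
    return int(round(step))

def quantize(coeff, step):
    # Uniform quantization of a single DCT coefficient.
    return int(coeff / step)

for fullness in (0.1, 0.5, 0.9):
    step = pick_quant_step(fullness)
    print(f"buffer {fullness:.0%} -> step {step}, q(200) = {quantize(200, step)}")
```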
- The quantized DCT coefficients and the quantization step size used are supplied together to the VLC unit 134.
- Using the quantization step size supplied by the quantizer 132, the VLC unit 134 converts the quantized coefficients into a variable length code, e.g. a Huffman code, thereby generating a base layer.
- The converted coefficients are output to a buffer (not shown).
- The quantized coefficients and the quantization step size are also supplied to the inverse quantizer 138, which inverse-quantizes the coefficients according to the step size so as to convert them back into DCT coefficients.
- The DCT coefficients are supplied to the inverse DCT unit 140, which performs an inverse DCT on them.
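- The coefficient path through blocks 130, 132, 138 and 140 can be sketched as the following DCT / quantize / inverse-quantize / inverse-DCT round trip on one 8x8 block. It uses a generic orthonormal DCT-II and a single uniform step rather than the MPEG quantization matrices, so the numbers are illustrative only.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix (row k, column i).
    m = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            scale = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
            m[k, i] = scale * np.cos((2 * i + 1) * k * np.pi / (2 * n))
    return m

gen = np.random.default_rng(3)
block = gen.integers(0, 256, size=(8, 8)).astype(np.float64)
C = dct_matrix()

coeffs = C @ block @ C.T              # forward DCT (circuit 130)
step = 16
quantized = np.round(coeffs / step)   # quantizer 132
dequant = quantized * step            # inverse quantizer 138
recon = C.T @ dequant @ C             # inverse DCT (circuit 140)

# The reconstruction differs from the original only by quantization error.
print(float(np.max(np.abs(recon - block))))
```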
- the acquired inverse DCT coefficients are supplied to arithmetic unit 148.
- the arithmetic unit 148 receives the inverse DCT coefficients from the inverse DCT unit 140, and receives data from motion compensator 124 according to the position of the switch 144.
- the arithmetic unit 148 calculates the sum of the signal supplied by inverse DCT unit 140 and the predictive image supplied by motion compensator 124 to partly decode the original image.
- the output of inverse DCT unit 140 may be directly sent to the frame memory.
- The decoded image obtained by the arithmetic unit 148 is fed to and stored in the frame memory, to be used later as a reference frame for intra-frame encoding, forward predictive encoding, backward predictive encoding or bi-directional predictive encoding.
- The output of the arithmetic unit 148 is also supplied to the up-sampler 150 to generate a reconstructed stream whose resolution is substantially the same as that of the high resolution input video stream.
- The reconstructed stream contains a certain degree of error. The difference is therefore determined by subtracting the reconstructed high resolution video stream from the original, unmodified high resolution video stream, and this difference is input to the enhancement layer for encoding. The enhancement layer thus encodes and compresses frames carrying this difference information.
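- The enhancement layer input can be sketched as follows: the base layer reconstruction is up-sampled back to the full resolution and subtracted from the original frame. The down/up-sampling filters and the stand-in for the base layer coding loss below are crude placeholders, not the patent's actual filters.

```python
import numpy as np

def upsample2x(img):
    # Nearest-neighbour up-sampling stands in for the up-sampler 150;
    # a real encoder would use a proper interpolation filter.
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

gen = np.random.default_rng(4)
hd = gen.integers(0, 256, size=(32, 32)).astype(np.float64)   # "high resolution" frame
sd = hd[::2, ::2]                                             # down-sampled base layer input
base_recon = np.round(sd / 8) * 8                             # placeholder for coding losses

# Enhancement layer input: original frame minus up-sampled base reconstruction.
residual = hd - upsample2x(base_recon)
print(residual.shape, float(np.abs(residual).mean()))
```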
- the process of predictive encoding for the enhancement layer is very similar to that for the base layer.
- the reference motion vector is fed to the motion estimation means 254 of the enhancement layer acquiring means 214.
- The motion estimation means 254 performs further motion estimation on the same frame of image at the enhancement layer based on the reference motion vector, thereby acquiring the motion vector of the corresponding frame of image of the video stream.
- the motion compensator 155 shifts the reference frames correspondingly, thereby to predict the current frame. Because this process of motion prediction is similar to that for the base layer, it will not be discussed in detail herein.
- Fig. 3 is a flowchart of encoding using the reference motion vector in accordance with one embodiment of the invention. This flow describes the operation of the encoding system 200.
- Step S305: receiving a high resolution video stream, e.g. a video stream having a resolution of 1920x1080i.
- Step S310: acquiring the reference motion vector for each frame of image of the video stream.
- The macro block that best matches the current block is searched for within the searching window of the reference frame I; for example, the search is conducted in a searching window of ±15 pixels, the value recommended for the motion estimation.
- The displacement between the current block and the matching block is the reference motion vector. Because this reference motion vector is acquired by prediction on the reference frame within the original video stream, which contains no error, it better reflects the actual motion in the video.
- The acquisition of the reference motion vector can be expressed by the following formulae, in which (Bx, By) is the motion vector:

  $(B_x, B_y) = \arg\min_{(m, n)} SAD(m, n)$ (1)

  $SAD(m, n) = \sum_{i}\sum_{j} \left| P_c(i, j) - R_p(i + m, j + n) \right|$ (2)

- arg min gives the displacement (m, n) for which the SAD is minimal, i.e. the motion vector corresponding to the current macro block.
- SAD, which indicates the resemblance of two macro blocks, is the sum of the absolute values of the differences between the respective pixels; m and n are the displacements of the matching block in the horizontal and vertical directions, respectively; Pc(i, j) and Rp(i, j) are the pixels of the current frame and the previous reference frame, respectively. The subscripts "c" and "p" indicate "current frame" and "previous frame", respectively.
- The reference motion vector may then be used for re-estimating the motion in both the base layer and the enhancement layer of the video stream, so that the base layer and the enhancement layer only need to estimate motion within a small range around this reference motion vector, thereby reducing the computational complexity and increasing the compressed encoding efficiency of the encoding system.
- Step S316: down-sampling the video stream to reduce its resolution, for example to 720x480i.
- Step S322: acquiring the motion vector of the corresponding frame of image of the down-sampled video stream.
- The corresponding frame of image mentioned here is the same frame as the current frame for which the reference motion vector was acquired. Because the prediction is made on the same frame, the motion vector (Dx1, Dy1) can be obtained, based on the down-sampled reference motion vector (Bx', By'), by further searching for the macro block that best matches the current block within a smaller searching window of the reference frame. Experiments have shown that a new searching window of ±2 pixels is sufficient. The searching process may be understood more clearly by referring to formulae (3) and (4):

  $(D_{x1}, D_{y1}) = \arg\min_{(m, n),\ |m| \le 2,\ |n| \le 2} SAD(m, n)$ (3)

  $SAD(m, n) = \sum_{i}\sum_{j} \left| P_c(i, j) - R_p(i + B_x' + m, j + B_y' + n) \right|$ (4)
- The motion estimation here searches on the basis of the reference motion vector (Bx', By'). Because most of the searching has already been done when calculating the reference motion vector, only a very limited search is needed to find the optimum matching block in this step. The amount of searching in a window of ±2 pixels is obviously much smaller than in a window of ±15 pixels.
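- A sketch of this small-window refinement is given below: only the ±2-pixel neighbourhood of the position indicated by the reference motion vector is searched, i.e. 25 candidates instead of the 961 of a full ±15 search. The frame sizes, test data and function names are illustrative assumptions.

```python
import numpy as np

def sad(a, b):
    # Sum of absolute differences between two equally sized blocks.
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def refine(cur, ref, top, left, ref_mv, block=16, radius=2):
    # Search only +/-radius pixels around the position pointed to by the
    # reference motion vector; returns the refinement offset relative to that
    # vector, in (row, column) order - the counterpart of (Dx1, Dy1) above.
    cur_blk = cur[top:top + block, left:left + block]
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y = top + ref_mv[0] + dy
            x = left + ref_mv[1] + dx
            if y < 0 or x < 0 or y + block > ref.shape[0] or x + block > ref.shape[1]:
                continue
            cost = sad(cur_blk, ref[y:y + block, x:x + block])
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dy, dx)
    return best_mv

print((2 * 2 + 1) ** 2, (2 * 15 + 1) ** 2)   # 25 vs 961 candidate positions

gen = np.random.default_rng(5)
ref = gen.integers(0, 256, size=(64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(-6, 2), axis=(0, 1))       # true displacement (6, -2)
# Suppose the reference motion vector is slightly off after down-sampling:
print(refine(cur, ref, top=32, left=32, ref_mv=(5, -2)))   # -> (1, 0)
```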
- Step S326: processing the down-sampled video stream using the motion vector to generate a base layer.
- The prediction for the current frame can be obtained simply by shifting the reference frame in accordance with the reference motion vector and the motion vector described above; well known processing then suffices to generate the base layer.
- The corresponding frame of image here is the same frame as the current frame for which the reference motion vector was acquired. Because the prediction is made on the same frame, the motion vector (Dx2, Dy2) can be obtained, based on the reference motion vector (Bx, By), by further searching for the macro block that best matches the current block within a relatively small searching window of the reference frame.
- The method of obtaining this motion vector is similar to that used by the base layer, so the detailed description is omitted.
- Step S336: processing the video stream using the motion vector and the base layer to generate an enhancement layer.
- In this way the reference motion vector can be used by the base layer and the enhancement layer at the same time to predict motion, reducing the search complexity in both layers and increasing the efficiency of the compressed encoding.
- The resolutions for the high definition (HD) frame and the standard definition (SD) frame are 1920x1088i and 720x480i, respectively, and the searching window is ±15 pixels.
- Let the computational cost of the error measure SAD between two macro blocks for the Y component be T_SAD.
- The total numbers of macro blocks in an HD frame and an SD frame are 8160 and 1350, respectively. If motion estimation is performed for each macro block within a searching window of ±15 pixels, the largest amount of calculation for obtaining the preferred motion vector of a macro block is (961*T_SAD), since the window contains 31x31 = 961 candidate positions.
- The amount of calculation for the reference motion vectors of a frame is therefore (7,841,760*T_SAD).
- The total largest amount of calculation for the motion vectors of each frame is the sum of the calculation for the reference motion vectors, the searching amount for the SD frame within the relatively smaller searching window, and the searching amount for the HD frame within the relatively smaller searching window, i.e. (7,875,510*T_SAD).
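- The figures quoted above can be checked with a few lines of arithmetic, keeping T_SAD symbolic and assuming 16x16 macro blocks; the comparison with two independent ±15 searches at the end is added for illustration and is not a figure stated in the patent.

```python
# Macro blocks per frame (16x16 blocks).
mb_hd = (1920 // 16) * (1088 // 16)   # 8160
mb_sd = (720 // 16) * (480 // 16)     # 1350

full_window = (2 * 15 + 1) ** 2       # 961 candidates in a +/-15 window
small_window = (2 * 2 + 1) ** 2       # 25 candidates in a +/-2 window

ref_cost = mb_hd * full_window        # 7,841,760 * T_SAD for the reference vectors
sd_refine = mb_sd * small_window      # 33,750 * T_SAD for the base (SD) layer
hd_refine = mb_hd * small_window      # 204,000 * T_SAD for the enhancement (HD) layer

print(mb_hd, mb_sd)                   # 8160 1350
print(ref_cost, sd_refine, hd_refine) # 7841760 33750 204000
print(ref_cost + sd_refine)           # 7875510, the per-frame total quoted above

# For comparison: estimating both layers independently with +/-15 windows
# would cost (8160 + 1350) * 961 = 9,139,110 * T_SAD per frame.
print((mb_hd + mb_sd) * full_window)
```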
- Fig. 4 is a schematic diagram of an encoding system using a reference motion vector in accordance with another embodiment of the invention.
- The encoding system of Fig. 4 differs from that of Fig. 2 mainly in its acquiring means 410.
- The acquiring means 410 comprises a down-sampler 120 and a reference motion vector acquiring means 416.
- The original video stream is first down-sampled by the down-sampler 120. The down-sampled video stream is then fed to the reference motion vector acquiring means 416, i.e. to the motion estimation means 476 and the frame memory 282, respectively, thereby acquiring the reference motion vector for each frame of image of the down-sampled video stream.
- The reference motion vector is fed directly to the motion estimation means 422 of the base layer acquiring means 412; based on the reference motion vector, the means 422 re-estimates the motion within a relatively small searching window to acquire the motion vector of the corresponding frame of image of the down-sampled video stream. The base layer generation means 413 then processes the down-sampled video stream using that motion vector to generate the base layer.
- For the enhancement layer, the reference motion vector described above is first up-sampled by the up-sampler 486; a motion vector acquiring means, i.e. the motion estimation means 454, then re-estimates the motion based on the up-sampled reference motion vector to acquire the motion vector of the corresponding frame of image of the video stream. The video stream is then processed by the enhancement layer generation means 415 using that motion vector and the base layer, thereby generating an enhancement layer.
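- The scaling rule used by the up-sampler 486 is likewise not spelled out; a proportional scaling with the resolution ratio, the inverse of the down-scaling shown earlier, is a natural assumption and is sketched below with illustrative numbers.

```python
def upscale_motion_vector(mv, base_res, full_res):
    # Scale a base layer motion vector (mv_x, mv_y) up to the full resolution.
    sx = full_res[0] / base_res[0]
    sy = full_res[1] / base_res[1]
    return (round(mv[0] * sx), round(mv[1] * sy))

# A base layer vector of (4, -4) at 720x480 maps to about (11, -9) at 1920x1080;
# the enhancement layer then only refines it within a small window.
print(upscale_motion_vector((4, -4), (720, 480), (1920, 1080)))
```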
- In this way the motion estimations in the base layer and the enhancement layer are associated, so that the repetitive searching they would otherwise perform when predicting the same frame of image is done only once; the base layer and the enhancement layer then re-estimate within a relatively small searching window based on the same reference motion vector. Because a great deal of searching is saved, the amount of calculation of the whole encoding system is reduced.
- Fig. 5 is a schematic diagram of an encoding system using a reference motion vector in accordance with a further embodiment of the invention.
- The encoding system 500 of this embodiment is similar to that shown in Fig. 2, so the description here concentrates only on the differences between them and omits the common parts.
- The difference is that the motion estimation means 522 of the base layer acquiring means 512 outputs the motion vector of each frame of image of the base layer; this motion vector is up-sampled by a reference motion vector acquiring means, i.e. the up-sampler 586, to serve as the reference motion vector of the corresponding frame of image, and the reference motion vector is fed to the motion estimation means 554 of the enhancement layer acquiring means 514.
- Motion estimation is then performed once more within a relatively small searching window, thereby acquiring the motion vector of the corresponding frame of image of the video stream. According to the reference motion vector, the motion vector and the output of the base layer, the enhancement layer generation means 515 then generates an enhancement layer in a way similar to that of the embodiment shown in Fig. 2.
- Because the enhancement layer repeats its search only within a relatively small range, it omits the part of the search that is identical to that of the base layer, thereby reducing the total amount of calculation in the encoding system.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200410076990 | 2004-08-31 | ||
PCT/IB2005/052756 WO2006024988A2 (en) | 2004-08-31 | 2005-08-23 | A method and apparatus for motion estimation |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1790166A2 true EP1790166A2 (de) | 2007-05-30 |
Family
ID=35586994
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP05780826A Withdrawn EP1790166A2 (de) | 2004-08-31 | 2005-08-23 | Verfahren und vorrichtung zur bewegungsschätzung |
Country Status (4)
Country | Link |
---|---|
EP (1) | EP1790166A2 (de) |
JP (1) | JP2008512023A (de) |
KR (1) | KR20070051294A (de) |
WO (1) | WO2006024988A2 (de) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102224731A (zh) * | 2009-09-22 | 2011-10-19 | 松下电器产业株式会社 | 图像编码装置、图像解码装置、图像编码方法及图像解码方法 |
US10764603B2 (en) * | 2018-12-31 | 2020-09-01 | Alibaba Group Holding Limited | Resolution-adaptive video coding |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6510177B1 (en) * | 2000-03-24 | 2003-01-21 | Microsoft Corporation | System and method for layered video coding enhancement |
WO2003036978A1 (en) * | 2001-10-26 | 2003-05-01 | Koninklijke Philips Electronics N.V. | Method and apparatus for spatial scalable compression |
-
2005
- 2005-08-23 JP JP2007529081A patent/JP2008512023A/ja active Pending
- 2005-08-23 WO PCT/IB2005/052756 patent/WO2006024988A2/en not_active Application Discontinuation
- 2005-08-23 KR KR1020077004940A patent/KR20070051294A/ko not_active Application Discontinuation
- 2005-08-23 EP EP05780826A patent/EP1790166A2/de not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
See references of WO2006024988A2 * |
Also Published As
Publication number | Publication date |
---|---|
WO2006024988A2 (en) | 2006-03-09 |
JP2008512023A (ja) | 2008-04-17 |
WO2006024988A3 (en) | 2006-05-11 |
KR20070051294A (ko) | 2007-05-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7146056B2 (en) | Efficient spatial scalable compression schemes | |
US9420279B2 (en) | Rate control method for multi-layered video coding, and video encoding apparatus and video signal processing apparatus using the rate control method | |
JP2897763B2 (ja) | 動き補償符号化装置、復号化装置、符号化方法及び復号化方法 | |
US6393059B1 (en) | Conversion of video data bit stream | |
US20060133475A1 (en) | Video coding | |
JP2005506815A5 (de) | ||
JP2005507589A5 (de) | ||
JP2006279573A (ja) | 符号化装置と方法、ならびに復号装置と方法 | |
US7372906B2 (en) | Compression circuitry for generating an encoded bitstream from a plurality of video frames | |
WO2004057866A2 (en) | Elastic storage | |
EP1790166A2 (de) | Verfahren und vorrichtung zur bewegungsschätzung | |
US20020054338A1 (en) | Image processing apparatus employing hierarchical encoding | |
US7085321B2 (en) | Compression | |
JP2010288096A (ja) | 動画像符号化方法、動画像符号化装置及び動画像符号化プログラム | |
JP2001268578A (ja) | 符号化方式変換装置、画像通信システム、および符号化方式変換方法 | |
JP2008252931A (ja) | 復号装置及び方法、符号化装置及び方法、画像処理システム、画像処理方法 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20070402 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR |
|
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20080301 |