CN101010962A - Method and device for motion estimation - Google Patents


Info

Publication number
CN101010962A
CN101010962A (application CN200580029013A)
Authority
CN
China
Prior art keywords
motion vector
video stream
reference motion
sampled
base layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 200580029013
Other languages
Chinese (zh)
Inventor
王进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics NV filed Critical Koninklijke Philips Electronics NV
Priority to CN 200580029013 priority Critical patent/CN101010962A/en
Publication of CN101010962A publication Critical patent/CN101010962A/en
Pending legal-status Critical Current

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A method and apparatus for spatially layered compression of a video stream are disclosed. A reference motion vector is introduced into the compression scheme of the present invention. From this reference motion vector, the base layer and the enhancement layer each derive the motion vectors of the corresponding frames of the video stream, and thereby generate a base layer and an enhancement layer respectively. The introduced reference motion vector associates the motion estimation of the base layer with that of the enhancement layer, thereby reducing the total amount of motion-estimation computation across both layers. Furthermore, because the reference frame used to obtain the reference motion vector is taken from the original video sequence, and no additional lossy operations are applied to the original video sequence, the reference motion vector better reflects the actual motion within the video sequence.

Description

Method and apparatus for motion estimation
Background of the invention
The present invention relates to a method and apparatus for compressing a video stream, and in particular to a video stream compression method and apparatus that uses a spatially layered compression scheme.
Because digital video contains a large amount of data, transmitting high-resolution video signals is a serious problem when producing high-definition (HD) television programs. More specifically, each frame of digital video is a still image composed of a group of pixels, whose number depends on the display resolution of a particular system; the amount of raw digital information in high-resolution video is therefore enormous. To reduce the quantity of data that must be transmitted, many video compression standards have been proposed, such as MPEG-2, MPEG-4 and H.263.
All of these standards support layering techniques, including spatial layering, temporal layering and SNR layering. In layered coding, the bit stream is divided into two or more bit streams, or layers, for encoding. At decoding time, the layers can then be combined as needed to form a high-resolution signal. For example, the base layer may provide a low-resolution video stream, while the enhancement layer provides additional information that enhances the base-layer image.
Existing spatially layered compression schemes not only apply the spatial-scalability techniques described above, but also use motion prediction, based on the correlation between successive frames, to obtain predicted images. Before being compression-coded, the input video stream is processed into I frames, P frames and B frames, arranged in a particular order according to configured parameters. An I frame is coded using only the information of that frame itself; a P frame is predictively coded from the nearest preceding I or P frame; and a B frame is predictively coded from itself or from the preceding and following frames.
Fig. 1 is a block diagram of a video encoder 100 that supports MPEG-2/MPEG-4 spatially layered compression. Video encoder 100 comprises a base encoder 112 and an enhancement encoder 114. The base encoder comprises a downsampler 120, a motion estimator (ME) 122, a motion compensator (MC) 124, an orthogonal transform (for example, discrete cosine transform (DCT)) circuit 130, a quantizer (Q) 132, a variable-length coder (VLC) 134, a bitrate control circuit 135, an inverse quantizer (IQ) 138, an inverse transform circuit (IDCT) 140, switches 128 and 144, and an upsampler 150. The enhancement encoder 114 comprises a motion estimator 154, a motion compensator 155, an orthogonal transform (for example, DCT) circuit 158, a quantizer 160, a variable-length coder 162, a bitrate control circuit 164, an inverse quantizer 166, an inverse transform circuit (IDCT) 168, and switches 170 and 172. The function of each component is known in the prior art and is therefore not described in detail here.
As is well known, motion estimation is one of the most time-consuming parts of a video compression system: the greater the amount of motion-estimation computation, the poorer the coding efficiency of the system. In the layered coding scheme described above, when the same video frame is predicted, the base layer and the enhancement layer each perform motion estimation separately, with no association between the two. Since both are predicting the same frame, a considerable part of the search process is repeated, which makes the amount of motion-estimation computation large and degrades the coding efficiency of the scheme. A spatially layered video compression scheme with better coding efficiency is therefore badly needed.
Summary of the invention
The object of the present invention is to overcome the above shortcomings of existing spatially layered compression schemes. A more efficient spatially layered compression method is proposed: by introducing a reference motion vector, the motion estimations of the base layer and of the enhancement layer are associated, so that the search work they would otherwise both repeat is performed only once, with only a small additional search carried out on that basis. This reduces the computational complexity of motion estimation and improves compression coding efficiency.
According to one embodiment of the present invention, a spatially layered compression method and apparatus for a video stream are disclosed. First, the original video stream is processed to obtain the reference motion vector of each frame of the video stream. The reference motion vector is then down-sampled, and the video stream is down-sampled as well. Next, from the down-sampled reference motion vector, the motion vector of the corresponding frame of the down-sampled video stream is obtained. The down-sampled video stream is then processed using this motion vector, thereby generating a base layer. Finally, to generate the enhancement layer, the motion vector of the corresponding frame of the full-resolution video stream is obtained from the reference motion vector, and the video stream is processed using this motion vector together with the base layer, thereby generating an enhancement layer.
According to another embodiment of the present invention, another spatially layered compression method and apparatus for a video stream are disclosed. First, the video stream is down-sampled, and the reference motion vector of each frame of the down-sampled video stream is obtained. Then, from this reference motion vector, the motion vector of the corresponding frame of the down-sampled video stream is obtained. The down-sampled video stream is then processed using this motion vector, thereby generating a base layer. Finally, to generate the enhancement layer, the above reference motion vector is up-sampled; from the up-sampled reference motion vector, the motion vector of the corresponding frame of the video stream is obtained, and the video stream is processed using this motion vector together with the base layer, thereby generating an enhancement layer.
According to still another embodiment of the present invention, yet another spatially layered compression method and apparatus for a video stream are disclosed. First, the video stream is processed to generate a base layer. Then, the motion vector of each frame of the base layer is up-sampled to obtain the reference motion vector of the corresponding frame. Finally, from this reference motion vector, the motion vector of the corresponding frame of the video stream is obtained, so that the video stream can be processed using this motion vector together with the base layer, thereby generating an enhancement layer.
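In this third embodiment, the base-layer motion vectors are up-sampled to serve as reference motion vectors at enhancement-layer resolution. A minimal sketch, assuming a factor-of-2 spatial ratio between the layers (the text does not fix the ratio) and a hypothetical function name:

```python
def upsample_mv(mv, factor=2):
    """Scale a base-layer motion vector up to enhancement-layer
    resolution; the result serves as the reference motion vector.
    The factor-of-2 ratio is an illustrative assumption."""
    return (mv[0] * factor, mv[1] * factor)
```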
Other objects and achievements of the present invention will become apparent, and a more complete understanding of the invention will be obtained, from the following description and claims taken in conjunction with the accompanying drawings.
Description of the drawings
The present invention is explained in detail, by way of example, with reference to the accompanying drawings, in which:
Fig. 1 is a block diagram of a video encoder for existing spatially layered compression;
Fig. 2 is a schematic diagram of a coding system using a reference motion vector according to an embodiment of the invention;
Fig. 3 is a flow chart of coding using a reference motion vector according to an embodiment of the invention;
Fig. 4 is a schematic diagram of a coding system using a reference motion vector according to another embodiment of the invention; and
Fig. 5 is a schematic diagram of a coding system using a reference motion vector according to still another embodiment of the invention.
Throughout the drawings, the same reference numbers denote similar or identical features and functions.
Embodiments
Fig. 2 is a schematic diagram of a coding system using a reference motion vector according to an embodiment of the invention. This coding system 200 performs layered compression, in which the base-layer part provides the low-resolution essential information of the video stream and the enhancement layer carries edge-enhancement information; the two kinds of information can be recombined at the receiving end to form high-resolution image information.
Coding system 200 comprises an acquisition device 216, a base-layer acquisition device 212 and an enhancement-layer acquisition device 214.
The acquisition device 216 processes the original video stream to obtain the reference motion vector of each frame of the video stream. Acquisition device 216 comprises a motion estimator 276 and a frame memory 282. Frame memory 282 stores the original video sequence. Motion estimator 276 fetches a reference frame (for example, an I frame or a P frame) from frame memory 282 and performs motion estimation on the current frame (for example, a P frame) with respect to the reference frame, thereby computing the reference motion vector of the current frame.
The base-layer acquisition device 212 processes the video stream using the reference motion vector, thereby generating a base layer. Device 212 comprises downsamplers 120 and 286. Downsampler 120 down-samples the original video stream; downsampler 286 down-samples the reference motion vector. Of course, as those skilled in the art will appreciate, a single downsampler could also perform both down-sampling operations.
Base-layer acquisition device 212 also comprises a motion vector acquisition device 222, which obtains the motion vector of the corresponding frame of the down-sampled video stream from the down-sampled reference motion vector. The process by which motion vector acquisition device 222 obtains the motion vector is detailed below.
Base-layer acquisition device 212 further comprises a base-layer generator 213, which processes the down-sampled video stream using this motion vector to produce the base layer. Apart from downsamplers 120 and 286 and motion vector acquisition device 222, the remaining components of base-layer acquisition device 212 are essentially identical to those of base encoder 112 in Fig. 1 and all belong to base-layer generator 213, comprising motion compensator 124, DCT circuit 130, quantizer 132, variable-length coder 134, bitrate control circuit 135, inverse quantizer 138, inverse transform circuit 140, arithmetic units 125 and 148, switches 128 and 144, and upsampler 150. The process by which base-layer generator 213 produces the base layer from the motion vector output by motion vector acquisition device 222 is essentially the same as in the prior art, as detailed below.
In the above base-layer acquisition device 212, compared with Fig. 1, the same reference numbers denote similar or identical features and functions. The only difference lies between motion estimator 122 and motion vector acquisition device 222: they obtain the motion vector in different ways. Motion estimator 122 in Fig. 1 searches directly within a larger search window, using the reference frame in the frame memory (not shown), to obtain the motion vector of the corresponding frame of the video stream, whereas motion vector acquisition device 222 in Fig. 2 searches again within a smaller search window, starting from the above reference motion vector, to obtain the motion vector of the corresponding frame of the video stream.
The enhancement-layer acquisition device 214 processes the video stream using the reference motion vector and the base layer, thereby generating an enhancement layer. Device 214 comprises a motion vector acquisition device 254 and an enhancement-layer generator 215.
Motion vector acquisition device 254 obtains the motion vector of the corresponding frame of the video stream from the reference motion vector.
Enhancement-layer generator 215 processes the video stream using this motion vector and the base layer, thereby generating the enhancement layer. Apart from motion vector acquisition device 254, the remaining components of enhancement-layer acquisition device 214 are essentially identical to those of enhancement encoder 114 in Fig. 1 and all belong to enhancement-layer generator 215, comprising motion compensator 155, DCT circuit 158, quantizer 160, variable-length coder 162, bitrate control circuit 164, inverse quantizer 166, inverse DCT circuit 168, and switches 170 and 172. The functions of these components are similar to those of the corresponding components of base-layer acquisition device 212. The process by which enhancement-layer generator 215 produces the enhancement layer from the motion vector output by motion vector acquisition device 254 is essentially the same as in the prior art, as detailed below.
In the above enhancement-layer acquisition device 214, compared with Fig. 1, the same reference numbers denote similar or identical features and functions. The only difference lies between motion estimator 154 and motion vector acquisition device 254: they obtain the motion vector in different ways. Motion estimator 154 in Fig. 1 searches directly within a larger search window, using the reference frame in the frame memory (not shown), to obtain the motion vector of the corresponding frame of the video stream, whereas motion vector acquisition device 254 in Fig. 2 searches again within a smaller search window, starting from the above reference motion vector, to obtain the motion vector of the corresponding frame of the video stream.
The processes by which base-layer acquisition device 212 and enhancement-layer acquisition device 214 each obtain their own motion vectors from the reference motion vector supplied by acquisition device 216, and thereby produce the base layer and the enhancement layer, are described in detail below in conjunction with Fig. 2.
An original video stream is input to acquisition device 216 and fed to motion estimator 276 and frame memory 282 respectively. It is worth mentioning that, before entering device 216, the video stream has already been processed into I frames, P frames and B frames, arranged according to configured parameters into a sequence such as I, B, P, B, P, ..., B, P. This input video sequence is stored in frame memory 282. Motion estimator 276 fetches a reference frame (for example, an I frame) from frame memory 282 and performs motion estimation on the current frame (for example, a P frame) with respect to the reference frame, thereby computing the reference motion vectors of the macroblocks of the current frame. A macroblock is a small block of 16*16 pixels in the frame currently being encoded; block matching is performed between the current macroblock and the reference frame to compute the reference motion vector of the current macroblock, and thereby obtain the reference motion vectors of the current frame.
MPEG provides four picture prediction modes: intra coding, forward predictive coding, backward predictive coding and bidirectional predictive coding. An I frame is an intra-coded picture; a P frame is an intra-coded, forward predictively coded or backward predictively coded picture; and a B frame is an intra-coded, forward predictively coded or bidirectionally predictively coded picture.
Motion estimator 276 performs forward prediction on P frames and computes their reference motion vectors. In addition, the motion estimator performs forward or bidirectional prediction on B frames and computes their reference motion vectors. For intra coding, no motion prediction is needed.
Taking forward prediction of a P frame as an example, the reference motion vector is computed as follows: motion estimator 276 reads the previous reference frame from frame memory 282 and searches, within a search window of the previous reference frame, for the block of pixels that best matches the macroblock of the current frame. Many matching search algorithms exist in the prior art; matching quality is generally measured by the mean absolute difference (MAD) or the mean square error (MSE) between the pixels of the current input block and the pixels of the corresponding block of the reference frame. The reference-frame block with the smallest MAD or smallest MSE is the best matching block, and the displacement of its position relative to the current block is the reference motion vector.
Through the above processing, motion estimator 276 in acquisition device 216 obtains the reference motion vector of a frame of the video stream. After being down-sampled by downsampler 286, this reference motion vector is sent to motion vector acquisition device 222 of base-layer acquisition device 212, so that device 222 can perform further motion estimation on the same frame for the base layer. The reference motion vector is also sent to motion vector acquisition device 254 of enhancement-layer acquisition device 214, so that device 254 can perform further motion estimation on the same frame for the enhancement layer.
While acquisition device 216 performs motion estimation on the input video stream, base-layer acquisition device 212 and enhancement-layer acquisition device 214 are also predictively coding the input video stream, but with a slight delay in time, because the base layer and the enhancement layer must perform their further motion estimation on the basis of the above reference motion vector.
The process by which the base layer performs further motion estimation, using the above reference motion vector as a baseline, is as follows:
The original input video stream is split by a splitter and fed into base-layer acquisition device 212 and enhancement-layer acquisition device 214 respectively. In base-layer acquisition device 212, the input video stream enters downsampler 120. This downsampler may be a low-pass filter and reduces the resolution of the input video stream. The down-sampled video stream is then fed into motion vector acquisition device 222. Device 222 fetches the previous reference frame image of the video sequence stored in the frame memory and, taking the down-sampled reference motion vector of the current frame output by downsampler 286 as a baseline, searches within a smaller search window of the previous reference frame for the block that best matches the macroblock of the current frame, thereby obtaining the motion vector of the corresponding frame of the down-sampled video stream.
After motion compensator 124 receives the prediction mode, the above reference motion vector and the above motion vector from motion vector acquisition device 222, it can, according to the prediction mode, reference motion vector and motion vector, read the image data of the previously encoded and locally decoded reference frame stored in the frame memory (not shown), displace this previous frame image once according to the reference motion vector and then once more according to the motion vector, and thereby predict the current frame image. Of course, the previous frame image may instead be displaced only once, by the sum of the reference motion vector and the motion vector; in that case, the sum of the reference motion vector and the motion vector can be taken as the motion vector of this frame.
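The two-step displacement described above, first by the reference motion vector and then by the refinement motion vector (or equivalently once by their sum), can be sketched as follows. `predict_block` is a hypothetical helper over frames stored as 2-D lists of luma values; sub-pixel interpolation and boundary clipping, which a real motion compensator 124 would need, are omitted:

```python
# Motion compensation for one block: displace the locally decoded
# reference frame by the reference motion vector plus the refinement
# motion vector to predict the current block.

def predict_block(ref, bx, by, ref_mv, mv, bs=4):
    """Return the bs*bs predicted block for the block at (bx, by);
    the total displacement is ref_mv + mv, as noted in the text."""
    dx = ref_mv[0] + mv[0]
    dy = ref_mv[1] + mv[1]
    return [[ref[by + dy + i][bx + dx + j] for j in range(bs)]
            for i in range(bs)]
```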
Motion compensator 124 then supplies the predicted image to arithmetic unit 125 and switch 144. Arithmetic unit 125 also receives the input video stream and computes the difference between the image of the input video stream and the predicted image from motion compensator 124. This difference is supplied to DCT circuit 130.
If the prediction mode received from motion vector acquisition device 222 is intra prediction, motion compensator 124 outputs no predicted image. In that case, arithmetic unit 125 performs no such processing, but passes the input video stream directly to DCT circuit 130.
DCT circuit 130 applies DCT processing to the signal output by the arithmetic unit, obtaining DCT coefficients that are supplied to quantizer 132. Quantizer 132 sets a quantization scale according to the amount of data stored in a buffer and quantizes the DCT coefficients from DCT circuit 130 with that quantization scale. The quantized DCT coefficients, together with the quantization scale used, are supplied to VLC unit 134.
According to the quantization scale supplied by quantizer 132, VLC unit 134 converts the quantized coefficients from the quantizer into variable-length codes, for example Huffman codes, thereby generating a base layer.
In addition, the converted quantized coefficients are output to a buffer (not shown). The quantized coefficients and the quantization scale are also supplied to inverse quantizer 138, which inverse-quantizes the quantized coefficients according to the quantization scale to convert them back into DCT coefficients. These DCT coefficients are supplied to inverse DCT unit 140, which applies the inverse DCT transform to them. The resulting inverse-DCT output is supplied to arithmetic unit 148.
Arithmetic unit 148 receives the inverse-DCT output from inverse DCT unit 140 and, depending on the position of switch 144, receives data from motion compensator 124. Arithmetic unit 148 adds the signal from inverse DCT unit 140 to the predicted image from motion compensator 124, locally decoding the original image. If the prediction mode indicates intra coding, however, the output of inverse DCT unit 140 may be fed directly to the frame memory. The decoded image obtained by arithmetic unit 148 is stored in the frame memory, where it can later serve as a reference frame for intra-coded, forward-coded, backward-coded or bidirectionally coded pictures.
The output of arithmetic unit 148 is also supplied to upsampler 150, which produces a reconstructed stream having essentially the same resolution as the high-resolution input video stream. However, because of the filtering and losses introduced by compression and decompression, the reconstructed stream contains certain errors. This difference is determined by subtracting the reconstructed high-resolution video stream from the original, unmodified high-resolution video stream, and is fed to the enhancement layer for encoding. The enhancement layer thus compression-encodes the frames carrying this difference information.
The predictive coding process of the enhancement layer is very similar to that of the base layer. After acquisition device 216 has obtained the reference motion vector, the reference motion vector is sent to motion vector acquisition device 254 of enhancement-layer acquisition device 214. Device 254 then, taking this reference motion vector as a baseline, performs further motion estimation on the same frame for the enhancement layer, thereby obtaining the motion vector of the corresponding frame of the video stream. Motion compensator 155 then displaces the reference frame according to the prediction mode, the reference motion vector and the above motion vector, thereby predicting the current frame. Since this motion prediction process is similar to that of the base layer, it is not described in detail here.
Fig. 3 is a flow chart of coding using a reference motion vector according to an embodiment of the invention. This flow is one operating process of system 200.
First, a particular high-resolution video stream is received (step S305), for example a video stream with a resolution of 1920*1080i.
Next, the reference motion vector of each frame of the video stream is obtained (step S310). Suppose the current frame is a P frame; the search for the block best matching a macroblock of the current frame is carried out within a search window of the reference I frame, for example a window of ±15 pixels, ±15 pixels being a recommended value for motion estimation. Once the best matching block is found, the displacement between the current block and the corresponding points of the matching block is the reference motion vector. Because this reference motion vector is obtained by prediction from a reference frame of the original video stream and carries no error, it better reflects the real motion of the video.
The following formulas further illustrate how this reference motion vector is obtained. The reference motion vector is (Bx, By):

(Bx, By) = argmin_{(m, n) ∈ S} SAD(m, n)          (1)

In formula (1), argmin denotes the displacement (m, n) within the search window S for which the SAD of the current macroblock is smallest;

SAD(m, n) = Σ_{i=1}^{M} Σ_{j=1}^{N} |P_c(i, j) − R_p(i + m, j + n)|          (2)

In formula (2), the SAD value is the sum of absolute differences of the corresponding pixels and expresses the degree of similarity of the two macroblocks; m and n are the horizontal and vertical components of the displacement of the matching block; and P_c(i, j) and R_p(i, j) denote the pixels of the current frame and of the previous reference frame respectively, the subscripts "c" and "p" standing for the "current" and "previous" frame.
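Formulas (1) and (2) amount to a full-search block matcher. The sketch below is a minimal illustration, assuming frames are stored as 2-D lists of luma values; the function names, the small default block size and search range are assumptions chosen for readability (the text uses 16*16 macroblocks and a ±15-pixel window), and boundary handling is simplified:

```python
# Minimal full-search block matching per formulas (1)-(2).
# Hypothetical helper names; frames are 2-D lists of luma values.

def sad(cur, ref, bx, by, m, n, bs):
    """SAD(m, n): sum of absolute differences between the bs*bs block
    at (bx, by) in the current frame and the block displaced by (m, n)
    in the previous reference frame."""
    total = 0
    for i in range(bs):
        for j in range(bs):
            total += abs(cur[by + i][bx + j] - ref[by + n + i][bx + m + j])
    return total

def full_search(cur, ref, bx, by, bs=4, rng=2):
    """Formula (1): the displacement (Bx, By) minimising SAD over a
    search window of +/-rng pixels (the text recommends +/-15)."""
    h, w = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for n in range(-rng, rng + 1):
        for m in range(-rng, rng + 1):
            # skip displacements that push the block outside the frame
            if bx + m < 0 or by + n < 0 or bx + m + bs > w or by + n + bs > h:
                continue
            cost = sad(cur, ref, bx, by, m, n, bs)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (m, n)
    return best_mv
```

Under this sign convention, the returned vector is the displacement that maps the current block back onto its best match in the reference frame.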
This reference motion vector can then be used for the further motion estimation in the base layer and in the enhancement layer of the video stream respectively, so that the base layer and the enhancement layer only need to carry out an additional small-range motion estimation on this basis, reducing the computational complexity of the coding system and improving the efficiency of compression coding.
Then this reference motion vector (Bx, By) is down-sampled, yielding (Bx', By') (step S312).
The video stream is down-sampled (step S316), reducing its resolution, for example to 720*480i.
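Steps S312 and S316 can be sketched as follows. The 2*2 averaging filter and the simple halving of the vector components are assumptions chosen for brevity (the text only calls for a low-pass down-sampler such as downsampler 120), and both function names are hypothetical:

```python
# Down-sampling the video frame (step S316) and the reference motion
# vector (step S312) by a factor of two in each dimension.

def downsample_frame(frame):
    """Halve resolution by averaging each 2x2 pixel block; a crude
    stand-in for a proper low-pass filter."""
    h, w = len(frame), len(frame[0])
    return [[(frame[2 * y][2 * x] + frame[2 * y][2 * x + 1] +
              frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) // 4
             for x in range(w // 2)]
            for y in range(h // 2)]

def downsample_mv(mv):
    """(Bx', By') = (Bx/2, By/2), rounded toward zero, so the vector
    matches the halved base-layer resolution."""
    bx, by = mv
    return (int(bx / 2), int(by / 2))
```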
From the down-sampled reference motion vector (Bx', By'), the motion vector of the corresponding frame of the down-sampled video stream is obtained (step S322). Note that the corresponding frame referred to here is the same frame as the current frame used when obtaining the reference motion vector. Precisely because the prediction concerns the same frame, the search for the block best matching the current macroblock can be carried out again, on the baseline of the reference motion vector (Bx', By'), within a smaller search window of the reference frame, yielding a motion vector (Dx1, Dy1). Experiments show that this search window can be a new window of ±2 pixels. This search process can be understood more clearly from formulas (3) and (4).
(Dx1, Dy1) = argmin_{(m, n) ∈ S_R} SAD_R          (3)

SAD_R = Σ_i Σ_j |P_c(i, j) − R_p(i + Bx' + m, j + By' + n)|          (4)
As formula (4) shows, the motion estimation here searches on the basis of the reference motion vector (Bx', By'). Because most of the search work was already done when computing the reference motion vector, only a small fraction of searching is needed in this step to find the best matching block: the search volume of a ±2-pixel window is obviously much smaller than that of a ±15-pixel window.
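The refined search of formulas (3) and (4) can be sketched in the same spirit: only a small ±2-pixel window around the down-sampled reference motion vector (Bx', By') is scanned. The helper below is self-contained, assuming frames stored as 2-D lists of luma values; its name and small block size are illustrative assumptions. The returned (Dx1, Dy1) is the residual displacement, the total motion vector being its sum with the reference motion vector:

```python
# Refined motion search per formulas (3)-(4): scan a small window
# around the reference motion vector instead of the full window.

def refined_search(cur, ref, bx, by, ref_mv, bs=4, rng=2):
    """Return (Dx1, Dy1), the displacement minimising SAD_R over a
    +/-rng window centred on the reference motion vector (Bx', By')."""
    rx, ry = ref_mv
    h, w = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for n in range(-rng, rng + 1):
        for m in range(-rng, rng + 1):
            dx, dy = rx + m, ry + n  # total displacement Bx'+m, By'+n
            if bx + dx < 0 or by + dy < 0 or bx + dx + bs > w or by + dy + bs > h:
                continue
            cost = sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
                       for i in range(bs) for j in range(bs))
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (m, n)
    return best_mv
```

A ±2 window scans at most 25 candidate positions per macroblock, against 961 for a ±15 window, matching the complexity comparison later in the text.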
The down-sampled video stream is processed using this motion vector, thereby generating a base layer (step S326). Displacing the reference frame according to the above reference motion vector and motion vector yields the predicted frame of the current frame; subsequent processing by known methods then generates the base layer.
From the reference motion vector (Bx, By), the motion vector of the corresponding frame of the full-resolution video stream is obtained (step S332). Note again that the corresponding frame referred to here is the same frame as the current frame used when obtaining the reference motion vector. Precisely because the prediction concerns the same frame, the reference motion vector (Bx, By) can be used to search, within a smaller search window of the reference frame, for the block best matching the current macroblock, yielding a motion vector (Dx2, Dy2). The method is similar to the way the base layer obtains its motion vector and is not repeated here.
Then, this motion vector and the base layer are used to process the video stream, thereby generating an enhancement layer (step S336).
Thus, in this embodiment the reference motion vector is shared by the motion prediction of both the base layer and the enhancement layer, which reduces the search complexity in the two layers and improves the efficiency of compression coding.
Below, the computational complexity of the present compression scheme is analysed and compared with that of Fig. 1.
Assume that each high-definition (HD) frame and each standard-definition (SD) frame has a resolution of 1920x1088i and 720x480i, respectively, and that the search window is ±15 pixels. Let T_SAD denote the computational cost of the error measure SAD between the Y components of two macroblocks.
The total numbers of macroblocks in one HD frame and one SD frame (considering only the Y component) are 8160 and 1350. If motion estimation for each macroblock is performed over a ±15-pixel search window, the maximum computation needed to obtain the best motion vector of that macroblock is 31*31*T_SAD = 961*T_SAD. The computation for one HD frame (enhancement layer) is therefore 8160*961*T_SAD = 7,841,760*T_SAD, and for one SD frame (base layer) 1350*961*T_SAD = 1,297,350*T_SAD.
For the coding system of Fig. 1, the total maximum computation for the motion vectors of each frame is the sum of the HD-frame and SD-frame computations, i.e. 9,139,110*T_SAD.
For the coding system of Fig. 2, the computation for the reference motion vectors is 7,841,760*T_SAD. When motion estimation for each macroblock is performed over the smaller ±2-pixel search window, the maximum computation to obtain a best motion vector is 5*5*T_SAD = 25*T_SAD. For one SD frame (base layer) this amounts to 1350*25*T_SAD = 33,750*T_SAD, and for one HD frame (enhancement layer) to 8160*25*T_SAD = 204,000*T_SAD.
For the coding system of Fig. 2, the total maximum computation for the motion vectors of each frame is the sum of the reference-motion-vector computation, the SD-frame search in the smaller window, and the HD-frame search in the smaller window, i.e. 8,079,510*T_SAD.
The ratio R by which the computation is reduced between the coding systems of Fig. 1 and Fig. 2 is:

R = |8,079,510 - 9,139,110| / 9,139,110 ≈ 12%
Fig. 4 is a schematic diagram of a coding system that uses a reference motion vector according to another embodiment of the invention. The coding system 400 of this embodiment is similar to that of Fig. 2; the similarities are not described again, and only the differences are detailed. The difference is that the acquisition device 410 comprises a down-sampling device 120 and a reference-motion-vector acquisition device 416. The original video stream is first down-sampled by the down-sampling device 120. The down-sampled video stream is then input to the reference-motion-vector acquisition device 416, that is, it is sent to the motion estimation device 476 and the frame memory 282 respectively, so as to obtain the reference motion vector of each frame of the video stream. This reference motion vector is then sent directly to the motion estimation device 422 of the base-layer acquisition device 412, which, taking the reference motion vector as a starting point, performs motion estimation again within a smaller search window to obtain the motion vector of the corresponding frame of the down-sampled video stream. The base-layer generation device 413 then uses this motion vector to process the down-sampled video stream, thereby generating the base layer.
In addition, in the enhancement-layer acquisition device 414, the above reference motion vector must first be up-sampled by the up-sampling device 486; then a motion-vector acquisition device, namely the motion estimation device 454, performs motion estimation again according to the up-sampled reference motion vector to obtain the motion vector of the corresponding frame of the video stream. The enhancement-layer generation device 415 then uses this motion vector and the base layer to process the video stream, thereby generating the enhancement layer.
As can be seen from the above, associating the motion estimation of the base layer with that of the enhancement layer allows the search work the two would otherwise duplicate when predicting the same frame to be done once; on the basis of this shared reference motion vector, the base layer and the enhancement layer each perform motion estimation again within a smaller search window. Because a large amount of search work is omitted, the computation of the whole coding system is reduced.
Fig. 5 is a schematic diagram of a coding system that uses a reference motion vector according to yet another embodiment of the invention. The coding system 500 of this embodiment is similar to that of Fig. 2; the similarities are not described again, and only the differences are detailed. The difference is that the motion estimation device 522 of the base-layer acquisition device 512 outputs the motion vector of each frame of the base layer; after being up-sampled by a reference-motion-vector acquisition device, namely the up-sampling device 586, this motion vector serves as the reference motion vector of the corresponding frame and is sent to the motion estimation device 554 of the enhancement-layer acquisition device 514. Taking this reference motion vector as a starting point, motion estimation is performed again within a smaller search window, yielding the motion vector of the corresponding frame of the video stream. The enhancement-layer generation device 515 then generates an enhancement layer from this reference motion vector, this motion vector, and the output of the base layer, in a manner similar to the embodiment of Fig. 2.
As can be seen from the above, in this embodiment the enhancement layer performs only a further small-range search on the basis of the motion vectors obtained by the base layer, allowing it to omit the part of the search work it shares with the base layer; the computation of the whole coding system is therefore reduced.
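Mapping motion vectors between layers, as the up-sampling devices 486 and 586 do, amounts to scaling the displacement by the spatial sampling ratio. The sketch below assumes a dyadic (2x) ratio for simplicity; the HD-to-SD example in the text actually involves non-integer ratios, for which the scaling would be fractional, and the function names are illustrative.

```python
def upsample_mv(mv, factor=2):
    """Scale a base-layer motion vector to enhancement-layer resolution.
    With 2x spatial down-sampling, a one-pixel displacement at the base
    layer corresponds to a two-pixel displacement at the enhancement layer."""
    bx, by = mv
    return (bx * factor, by * factor)

def downsample_mv(mv, factor=2):
    """Inverse mapping: scale an enhancement-resolution reference motion
    vector down to the base-layer grid (integer-pixel floor approximation)."""
    bx, by = mv
    return (bx // factor, by // factor)
```

In the Fig. 5 embodiment, `upsample_mv` corresponds to the step that turns base-layer motion vectors into reference motion vectors for the enhancement layer; in the Fig. 3 flow, `downsample_mv` corresponds to producing (Bx', By') for the base layer.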
Although the invention has been described in conjunction with specific embodiments, many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing description. Accordingly, such alternatives, modifications, and variations are intended to be embraced by the invention insofar as they fall within the spirit and scope of the appended claims.

Claims (16)

1. A method for spatially layered compression of a video stream, comprising the steps of:
a. processing the video stream to obtain a reference motion vector for each frame of the video stream;
b. processing the video stream using the reference motion vector, thereby generating a base layer; and
c. processing the video stream using the reference motion vector and the base layer, thereby generating an enhancement layer.

2. The method of claim 1, wherein step a comprises the steps of:
down-sampling the video stream; and
obtaining a reference motion vector for each frame of the down-sampled video stream.

3. The method of claim 2, wherein step b comprises the steps of:
obtaining, according to the reference motion vector, a motion vector for the corresponding frame of the down-sampled video stream; and
processing the down-sampled video stream using this motion vector, thereby generating the base layer.

4. The method of claim 2, wherein step c comprises the steps of:
up-sampling the reference motion vector;
obtaining, according to the up-sampled reference motion vector, a motion vector for the corresponding frame of the video stream; and
processing the video stream using this motion vector and the base layer, thereby generating the enhancement layer.

5. The method of claim 1, wherein step b comprises the steps of:
down-sampling the reference motion vector;
down-sampling the video stream;
obtaining, according to the down-sampled reference motion vector, a motion vector for the corresponding frame of the down-sampled video stream; and
processing the down-sampled video stream using this motion vector, thereby generating the base layer.

6. The method of claim 5, wherein step c comprises the steps of:
obtaining, according to the reference motion vector, a motion vector for the corresponding frame of the video stream; and
processing the video stream using this motion vector and the base layer, thereby generating the enhancement layer.

7. A method for spatially layered compression of a video stream, comprising the steps of:
a. processing the video stream, thereby generating a base layer;
b. up-sampling the motion vector of each frame of the base layer, thereby obtaining a reference motion vector for the corresponding frame; and
c. processing the video stream using the reference motion vector and the base layer, thereby generating an enhancement layer.

8. The method of claim 7, wherein step c comprises the steps of:
obtaining, according to the reference motion vector, a motion vector for the corresponding frame of the video stream; and
processing the video stream using this motion vector and the base layer, thereby generating the enhancement layer.

9. A spatially layered compression device for a video stream, comprising:
an acquisition device for processing the video stream to obtain a reference motion vector for each frame of the video stream;
a base-layer acquisition device for processing the video stream using the reference motion vector, thereby generating a base layer; and
an enhancement-layer acquisition device for processing the video stream using the reference motion vector and the base layer, thereby generating an enhancement layer.

10. The device of claim 9, wherein the acquisition device comprises:
a down-sampling device for down-sampling the video stream; and
a reference-motion-vector acquisition device for obtaining a reference motion vector for each frame of the down-sampled video stream.

11. The device of claim 10, wherein the base-layer acquisition device comprises:
a motion-vector acquisition device for obtaining, according to the reference motion vector, a motion vector for the corresponding frame of the down-sampled video stream; and
a base-layer generation device for processing the down-sampled video stream using this motion vector, thereby generating the base layer.

12. The device of claim 10, wherein the enhancement-layer acquisition device comprises:
an up-sampling device for up-sampling the reference motion vector;
a motion-vector acquisition device for obtaining, according to the up-sampled reference motion vector, a motion vector for the corresponding frame of the video stream; and
an enhancement-layer generation device for processing the video stream using this motion vector and the base layer, thereby generating the enhancement layer.

13. The device of claim 9, wherein the base-layer acquisition device comprises:
a down-sampling device for down-sampling the reference motion vector and the video stream;
a motion-vector acquisition device for obtaining, according to the down-sampled reference motion vector, a motion vector for the corresponding frame of the down-sampled video stream; and
a base-layer generation device for processing the down-sampled video stream using this motion vector, thereby generating the base layer.

14. The device of claim 13, wherein the enhancement-layer acquisition device comprises:
a motion-vector acquisition device for obtaining, according to the reference motion vector, a motion vector for the corresponding frame of the video stream; and
an enhancement-layer generation device for processing the video stream using this motion vector and the base layer, thereby generating the enhancement layer.

15. A spatially layered compression device for a video stream, comprising:
a base-layer acquisition device for processing the video stream, thereby generating a base layer;
a reference-motion-vector acquisition device for up-sampling the motion vector of each frame of the base layer, thereby obtaining a reference motion vector for the corresponding frame; and
an enhancement-layer acquisition device for processing the video stream using the reference motion vector and the base layer, thereby generating an enhancement layer.

16. The device of claim 15, wherein the enhancement-layer acquisition device comprises:
a motion-vector acquisition device for obtaining, according to the reference motion vector, a motion vector for the corresponding frame of the video stream; and
an enhancement-layer generation device for processing the video stream using this motion vector and the base layer, thereby generating the enhancement layer.
CN 200580029013 2004-08-31 2005-08-23 Method and device for motion estimation Pending CN101010962A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 200580029013 CN101010962A (en) 2004-08-31 2005-08-23 Method and device for motion estimation

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN200410076990.3 2004-08-31
CN200410076990 2004-08-31
CN 200580029013 CN101010962A (en) 2004-08-31 2005-08-23 Method and device for motion estimation

Publications (1)

Publication Number Publication Date
CN101010962A true CN101010962A (en) 2007-08-01

Family

ID=38698170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 200580029013 Pending CN101010962A (en) 2004-08-31 2005-08-23 Method and device for motion estimation

Country Status (1)

Country Link
CN (1) CN101010962A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101217654B (en) * 2008-01-04 2010-04-21 华南理工大学 Scalable organization method of video bit stream
CN101888546A (en) * 2010-06-10 2010-11-17 北京中星微电子有限公司 Motion estimation method and device
CN102238380A (en) * 2010-04-22 2011-11-09 奇景光电股份有限公司 Hierarchical motion estimation method and system



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication