US20020154698A1 - Method and apparatus for motion estimation for high performance transcoding - Google Patents
- Publication number
- US20020154698A1 (application US09/276,826)
- Authority
- US
- United States
- Prior art keywords
- motion vector
- delta
- generating
- base
- zero
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/40—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/503—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving temporal prediction
- H04N19/51—Motion estimation or motion compensation
Definitions
- This invention relates generally to digital video compression and, in particular, to re-encoding a previously encoded digital video signal, also known as transcoding.
- Digital video compression reduces the number of bits by removing redundant information, without seriously affecting the quality of the video.
- Standard video compression techniques remove spatial redundancy within a video frame and remove temporal redundancy between video frames.
- To remove spatial redundancy, encoders commonly use a Discrete Cosine Transform (DCT), which is widely known and understood.
- To remove temporal redundancy, encoders commonly use motion estimation, which is also widely known and understood.
- With regard to motion estimation, the images in a digital video usually do not change much within small time intervals, i.e., adjacent frames include a great deal of redundant information. Thus, motion estimation takes advantage of this redundancy and encodes a video frame based on other video frames temporally close to it. For example, in a particular movie scene, the background trees (outdoor scene) or furniture (indoor scene) may not move. Therefore, video information related to the background may not necessarily have to be transmitted multiple times, reducing the number of bits to be transmitted or stored. On the other hand, if the camera is panning, the background may “move” on the video screen. In this case, it is possible to avoid transmitting background information. Instead of encoding and transmitting the background information multiple times, it is possible to encode it once and subsequently transmit information related to its movement. Techniques related to this process are called motion estimation.
- A device known as a transcoder may employ motion estimation.
- A transcoder reduces the bit-rate of an already compressed video bit-stream, allowing the bit stream to be transmitted through a narrower channel.
- Transcoders are often used by video services that operate over more than one type of network. In this situation, the different networks may have different bandwidths, thus each end-user may require a different Quality of Service (QoS). Therefore, “gateways” between the networks employ transcoders to adapt video bit-rates to different end-users on different networks.
- There are two well known types of transcoders.
- First, the simplest type is an “open-loop transcoder.”
- In this type of transcoder, the incoming bit-rate is reduced by the well-known mathematical technique of truncating or re-quantizing the DCT coefficients. In other words, the encoded bits that represent the higher quality aspects of the video are discarded. Because this transcoding is done in the “coded domain,” i.e., it is done without decoding the signal, these transcoders are simple and fast.
- Open-loop transcoding produces increased distortion caused by a “drift” due to the mismatched reconstructed pictures in the encoder and the decoder. This distortion may result in an unacceptable video quality in many applications.
- a second, more complicated, type of transcoder is a “drift-free” transcoder. It operates by decoding the incoming coded video and then re-encoding the video at a lower bit-rate. Using this method, it is possible to take advantage of useful information in the encoded video arriving at the transcoder, such as picture type, motion vectors, quantization step-size, bit-allocation statistics, etc. It is possible to construct transcoders with different complexity and performance with regard to coding efficiency and video quality.
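- Traditionally, the second type of transcoding does not employ motion estimation for two reasons. First, transcoders must operate very quickly, and motion estimation is computationally complex and thus expensive to implement. Second, it is widely assumed that re-using motion vectors extracted from the incoming encoded video is as good as performing a new motion estimation.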
- The invention comprises a method and apparatus for re-encoding digital video from a previously encoded digital video having at least one input motion vector.
- The method comprises the steps of determining a base motion vector from the at least one input motion vector, generating a delta motion vector, generating a new motion vector that is the sum of the base motion vector and the delta motion vector, and re-encoding the previously encoded digital video using the new motion vector.
- FIGS. 1A and 1B are block diagrams illustrating motion estimation that is prevalent in the prior art;
- FIG. 2 is a diagram of major components of an environment, prevalent in the prior art, that uses a front-encoder and an end-decoder;
- FIG. 3 is a block diagram of the major components of the front-encoder shown in FIG. 2 that uses motion estimation;
- FIG. 4 is a block diagram of the major components of the end-decoder shown in FIG. 2 that uses motion compensation;
- FIG. 5 is a diagram of the major components of an environment which may require transcoders;
- FIG. 6 is a diagram of an open-loop transcoder prevalent in the prior art;
- FIG. 7 is a diagram of a drift-free transcoder prevalent in the prior art;
- FIG. 8 is a diagram of a drift-free transcoder, prevalent in the prior art, that reuses incoming motion vectors;
- FIG. 9 is a diagram of a drift-free transcoder consistent with this invention;
- FIG. 10 is a diagram illustrating frame skipping consistent with this invention;
- FIG. 11 is a flow chart, consistent with this invention, of a method for performing motion vector refinement with and without frame skipping;
- FIG. 12 is a block diagram of an apparatus, consistent with this invention, that performs motion vector refinement with or without frame skipping;
- FIG. 13 is a flowchart of a method, consistent with this invention, for adaptively applying motion vector refinement.
- A digital video signal is conceptually divided into four components for encoding: frames, pixels, blocks, and macroblocks.
- A frame is a single still image in a sequence of images. Displaying this sequence in quick succession creates the illusion of motion.
- A pixel (picture element) is a single point in a frame.
- Television and computer monitors display frames by dividing the display screen into thousands (or millions) of pixels, arranged in rows and columns. The pixels are so close together that they appear connected.
- Frames are divided into macroblocks, which are rectangular groups of pixels.
- Macroblocks are the units for motion-estimation encoding. Macroblocks in turn can be divided into blocks. There are two types of blocks: luminance (brightness) blocks and chrominance (color) blocks. Blocks are used for DCT encoding.
- FIGS. 1A and 1B are block diagrams illustrating motion estimation prevalent in the prior art.
- FIG. 1A shows an image in a current macroblock 130 in a current frame 131 .
- the system compares current macroblock 130 to a previous reference frame 135 over an area S 133 , as shown in FIG. 1B.
- the system searches the previous reference frame 135 to find a closest match 136 for current macroblock 130 .
- Closest match 136 in the previous reference frame 135 is in the position of current macroblock 130 in current frame 131 , but displaced by a motion vector 134 .
- Current macroblock 130 is M pixels by N pixels.
- $P^c$ is a pixel value in the current macroblock 130 and $R^p$ is a pixel value in the closest match 136.
- FIG. 2 is a block diagram of major components of an environment, prevalent in the prior art, that uses an encoder and decoder.
- a video service 202 creates a digital video signal 204 that needs to be compressed for transmission purposes.
- a front-encoder 206 uses motion estimation to create an encoded video signal 208 along with a motion vector signal 216 .
- Encoded video signal 208 is transmitted over a channel, such as through a network (not shown), to an end-decoder 210 .
- End-decoder 210 decodes signal 208 and generates a decoded signal 212 for end-user 214 .
- End-decoder 210 uses the same motion vector signal 216 , which front-encoder 206 generated and transmitted to decoder 210 through an overhead channel (not shown).
- FIG. 3 is a block diagram of the major components of front-encoder 206 of FIG. 2 that uses motion estimation.
- a first frame signal from digital video input 204 with matching block 136 enters encoder 206 and is transformed at DCT 308 and quantized at 310 to generate a coded signal 312 .
- Coded signal 312 is variable-length coded (VLC-ed) at 316 to generate encoded digital video output 208 . Variable length coding and decoding is prevalent in the prior art.
- Coded signal 312 is also inverse quantized at 318 , inverse DCT-ed at 320 , and stored in a frame memory 326 .
- a second frame signal from digital video input 204 with current macroblock 130 then enters encoder 206 .
- the encoder performs a motion estimation (described below in detail) at 324 on the second frame signal and the frame signal stored at 326 , which generates a motion vector 134 that is part of motion vector signal 216 .
- Frame memory 326 outputs matching macroblock 136 to a summer 304 .
- Matching block 136 is then subtracted from current macroblock 130 at summer 304 to generate a difference signal 306 .
- difference signal 306 is DCT-ed at 308 , quantized at 310 , and VLC-ed at 316 , and outputted by encoder 206 .
- the output signal 208 is encoded video. This process continues for each macroblock in the second frame.
- Frame memory 326 is supplied with what will be the next reconstructed reference frame to be used for a third frame from digital video input 204 that enters encoder 206 .
- each subsequent frame is reconstructed by inverse quantizing at 318 , inverse DCT-ing at 320 , and adding corresponding matching macroblocks from frame memory 326 (the same matching macroblock that was subtracted at summer 304 ).
- Each motion vector is transmitted in motion vector signal 216 .
- Most video coding standards, including MPEG, H.261, and H.263, perform motion estimation on macroblocks based on the Sum of Absolute Difference (SAD) function.
- m and n are components of the motion vectors within predefined search area S.
- M and N are the dimensions of any macroblock and the largest possible values of i and j, respectively.
- $P_f^c(i,j)$ and $R_f^p(i+m,j+n)$ represent a pixel in the current frame and a pixel displaced by (m, n) in the previous reference frame, respectively.
- The superscript “c” or “p” represents the current or previous frame, respectively.
- The subscript “f” indicates that this is the first-stage or front-end encoder.
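The SAD criterion described above can be sketched in code. This is a minimal illustration rather than the patent's implementation. It assumes the usual form $SAD_f(m,n) = \sum_i^M \sum_j^N |P_f^c(i,j) - R_f^p(i+m,j+n)|$, treats m as a row displacement and n as a column displacement, and models the search area S as a square of half-width `search_range`. The names `sad` and `full_search` are hypothetical.

```python
def sad(cur_block, ref_frame, top, left, m, n):
    """Sum of absolute differences between the current macroblock, whose
    top-left corner sits at (top, left), and the reference-frame block
    displaced from that position by (m, n)."""
    return sum(
        abs(cur_block[i][j] - ref_frame[top + m + i][left + n + j])
        for i in range(len(cur_block))
        for j in range(len(cur_block[0]))
    )


def full_search(cur_block, ref_frame, top, left, search_range):
    """Test every displacement (m, n) in the square search area and return
    the displacement with the smallest SAD together with that SAD."""
    rows, cols = len(cur_block), len(cur_block[0])
    height, width = len(ref_frame), len(ref_frame[0])
    best_mv, best_cost = None, None
    for m in range(-search_range, search_range + 1):
        for n in range(-search_range, search_range + 1):
            r, c = top + m, left + n
            # Skip candidates that fall outside the reference frame.
            if r < 0 or c < 0 or r + rows > height or c + cols > width:
                continue
            cost = sad(cur_block, ref_frame, top, left, m, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (m, n), cost
    return best_mv, best_cost
```

The cost of this exhaustive search grows with the square of `search_range`, which is the computational burden that the later sections seek to reduce.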
- FIG. 4 is a block diagram of the major components of end-decoder 210 shown in FIG. 2 using motion compensation.
- Encoded input 208 is variable length decoded at 403 , inverse quantized at 404 , and inverse discrete cosine transformed at 408 to form a signal 410 .
- A summer 412 adds to signal 410 the same macroblock signal 418 that had been subtracted from it in encoder 206, thus providing a decoded digital video output signal 212.
- the decoder does this by using the frame memory 414 and motion compensator 416 .
- Motion compensator 416 uses the motion vector signal 216 generated by front-encoder 206 . Motion vector signal 216 is transmitted to the end-decoder on an overhead channel.
- FIG. 5 is a diagram of the major components of an environment which may require transcoders.
- This environment 500 includes a network of different networks interconnected by gateways.
- a first network 512 is a public switched telephone network (PSTN) and a second network 522 is a wireless network.
- a first gateway 520 interconnects first network 512 and second network 522 .
- First network 512 may have different characteristics than second network 522 and may provide its end-users with a different QoS. For instance, assume that first network 512 has a higher QoS than second network 522 and video is transmitted through gateway 520 from first network 512 to second network 522 .
- Gateway 520 reduces the QoS, or bit-rate, of the video so that second network 522 can carry the video and deliver it to end-user 524 .
- Gateway 520 employs a transcoder to reduce the bit rate.
- a third network 504 is the Internet.
- a second gateway 506 connects first network 512 and third network 504 so that video can be delivered to end-user 502 .
- a fourth network 514 is an N-ISDN.
- a third gateway 508 connects third network 504 and fourth network 514 .
- a fourth gateway 510 connects third network 504 and second network 522 .
- Each gateway employs a transcoder of some type.
- FIG. 6 is a diagram of an open-loop transcoder 601 , prevalent in the prior art.
- Transcoder 601 is part of second gateway 506 between first network 512 and third network 504 .
- Video service 602 generates a digital video signal 604 , which is encoded by front-encoder 606 using motion estimation.
- Front-encoder 606 generates an encoded video signal 608 and a motion vector signal 626 .
- Encoded digital video 608 from first network 512 is supplied as input to transcoder 601 .
- Input signal 608 first goes through a variable length decoder (VLD) 610 , generating a second signal 612 .
- second signal 612 is still in the “coded domain” because it has not yet gone through an inverse discrete cosine transform (IDCT), and a video image has not been reconstructed.
- Second signal 612 then goes through a process of high frequency cutting and requantization 616 , generating a third signal 618 .
- high frequency DCT coefficients are discarded and remaining DCT coefficients are requantized.
- Third signal 618 is variable length coded (VLC-ed) at 104 , generating an encoded digital video output 622 .
- Encoded output 622 is at a lower bit rate than the encoded digital video input 608 .
- the encoded digital video output 622 is prepared for third network 504 .
- the amount and type of requantization and high-frequency cutting performed by process 616 is determined by a bit allocation analyzer 614 , which considers the bit rate of second signal 612 and the needed rate constraint.
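The high-frequency cutting and requantization step can be illustrated with a short sketch. It is a simplification under stated assumptions, not the method of process 616: it assumes a uniform quantizer with a single step size per block (real codecs use quantization matrices and dead zones), treats a coefficient as "high frequency" when its row index plus column index reaches a cutoff `keep`, and takes `q_out` and `keep` as given, where in the text they would come from the bit allocation analyzer. The function name `requantize_block` is hypothetical.

```python
def requantize_block(levels, q_in, q_out, keep):
    """Requantize a block of quantized DCT levels in the coded domain.

    levels: 2-D list of integer levels produced with step size q_in.
    q_out:  coarser step size used for the outgoing bit stream.
    keep:   coefficients with row + column index >= keep are discarded.
    """
    out = []
    for u, row in enumerate(levels):
        new_row = []
        for v, level in enumerate(row):
            if u + v >= keep:
                # High-frequency cutting: drop the coefficient entirely.
                new_row.append(0)
            else:
                coeff = level * q_in                # inverse quantization
                new_row.append(int(coeff / q_out))  # requantize, truncating toward zero
        out.append(new_row)
    return out
```

No inverse DCT and no motion compensation appear here, which is why an open-loop transcoder is fast and also why its output drifts from what the end-decoder reconstructs.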
- the signal 622 is end-decoded 624 for end-user 502 .
- End-decoder 624 uses the same motion vector signal 626 that front-encoder 606 generated and transmitted in an overhead channel. Open-loop transcoding, however, produces increased distortion caused by the drift due to the mismatched reconstructed pictures in the encoder and the decoder. This distortion may result in an unacceptable video quality in many applications.
- FIG. 7 is a diagram of a drift-free transcoder 718 prevalent in the prior art.
- Transcoder 718 is part of third gateway 508 between third network 504 and fourth network 514 .
- Video service 702 generates a digital video signal 704 , which is encoded by front-encoder 706 using motion estimation.
- Front-encoder 706 generates an encoded video signal 712 and a motion vector signal 708 .
- the input to transcoder 718 is encoded digital video 712 from third network 504 .
- Transcoder 718 includes a cascaded decoder 714 and encoder 716 .
- Transcoder 718 decodes the encoded digital video signal 712 in decoder 714 using motion vector signal 708 generated by front-encoder 706 .
- the output of decoder 714 is an intermediate decoded video signal 720 .
- Encoder 716 takes intermediate digital video signal 720 and re-encodes it at a bit and frame rate suitable for fourth network 514 . Encoder 716 performs an entirely new motion estimation and generates a new motion vector 724 . Encoded video signal 722 leaving transcoder 718 is at a lower bit rate, and possibly a lower frame rate, than encoded digital video signal 712 entering transcoder 718 . Encoded video signal 722 is transmitted through fourth network 514 to end-decoder 726 . End-decoder 726 decodes second encoded video signal 722 into a decoded digital video signal 728 , which may be viewed by end-user 516 .
- the intermediate signal 720 is considered as the original video signal for encoder 716 in transcoder 718 .
- Encoder 716 is also known as a second-stage encoder 716 .
- Pixels of the previously reconstructed frame and the current frame in the second-stage encoder are $R_s^p(i,j)$ and $P_s^c(i,j)$, respectively.
- The subscript “s” indicates the second-stage encoder.
- $\Delta_f^c(i,j)$ represents the quantization error of the current frame in the first-stage encoding process.
- $\Delta_s^p(i,j)$ represents the quantization error of the previous frame in the second-stage encoding process. Therefore, the motion vector is defined by the motion vector at the first-stage encoder and the quantization errors from the first and the second-stage encoders.
- The problem with transcoder 718, however, is that it is computationally complex and expensive to implement. To reduce the complexity, transcoders commonly reuse motion vector signal 708 from the overhead channel of the incoming encoded video 712.
- FIG. 8 is a diagram of a drift-free transcoder 818 that uses this method to overcome the computation problems of transcoder 718.
- Transcoder 818 is part of fourth gateway 510 between third network 504 and second network 522 .
- video service 702 generates digital video signal 704 , which is encoded by front-encoder 706 using motion estimation.
- Front-encoder 706 generates encoded video signal 712 and motion vector signal 708 .
- The input to transcoder 818 is encoded digital video 712 from third network 504.
- Transcoder 818 includes a cascaded decoder 814 and an encoder 816 .
- Transcoder 818 decodes the encoded video signal 712 in decoder 814 using motion vector signal 708 generated by front-encoder 706 .
- the output of decoder 814 is an intermediate decoded video signal 820 .
- Encoder 816 takes intermediate digital video signal 820 and re-encodes it with a bit and frame rate suitable for second network 522 . Encoder 816 does not perform a new motion estimation. Instead, as an approximation it reuses motion vector signal 708 generated by front-encoder 706 . Encoded video signal 822 leaving transcoder 818 is at a lower bit rate, and possibly a lower frame rate, than encoded video signal 712 entering transcoder 818 . Encoded video signal 822 is transmitted through second network 522 to end-decoder 826 . End-decoder 826 decodes encoded video signal 822 into a decoded video signal 828 , which may be viewed by end-user 524 .
- Transcoder 818 introduces significant quality degradation in many applications.
- Let (Ix, Iy) be a motion vector from motion vector signal 708 received during decoding 814 (i.e., received from the overhead channel).
- Let (Nx, Ny) be the optimized motion vector that would result if a full-scale motion estimation were performed in the second-stage encoder 816.
- $SAD_s(Nx, Ny)$ is by definition the minimal value among all possible SADs.
- $$SAD_s(Nx, Ny) \le SAD_s(Ix, Iy) \le SAD_f(Ix, Iy) + \sum_{i}^{M} \sum_{j}^{N} \left| \Delta_f^c(i,j) - \Delta_s^p(i+Ix,\, j+Iy) \right|$$
- The double sum in this bound is the sum of differential reconstruction error (SDRE).
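The bound $SAD_s(Ix,Iy) \le SAD_f(Ix,Iy) + SDRE(Ix,Iy)$ can be checked numerically on a toy block. The sketch below assumes that the second-stage current pixels equal the first-stage pixels plus the first-stage quantization error, and that the second-stage reference pixels equal the first-stage reference pixels plus the second-stage quantization error; under those assumptions the bound follows from the triangle inequality. The pixel and error values are invented for illustration, and blocks are flattened to one-dimensional lists.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two flattened blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def sdre(err_first_cur, err_second_prev):
    """Sum of differential reconstruction error between the first-stage
    error of the current block and the second-stage error of the
    displaced reference block."""
    return sum(abs(f - s) for f, s in zip(err_first_cur, err_second_prev))


# Invented 4-pixel example (assumption: values chosen only for illustration).
p_f_cur = [100, 102, 98, 97]   # current block seen by the front-encoder
r_f_prev = [101, 100, 99, 95]  # reference block displaced by (Ix, Iy)
delta_f_cur = [1, -2, 0, 1]    # first-stage quantization error, current frame
delta_s_prev = [-1, 1, 2, 0]   # second-stage quantization error, previous frame

# Pixels as seen by the second-stage encoder under the stated assumptions.
p_s_cur = [p + d for p, d in zip(p_f_cur, delta_f_cur)]
r_s_prev = [r + d for r, d in zip(r_f_prev, delta_s_prev)]

sad_f = sad(p_f_cur, r_f_prev)
sad_s = sad(p_s_cur, r_s_prev)
bound = sad_f + sdre(delta_f_cur, delta_s_prev)
```

When the SDRE term is small, reusing (Ix, Iy) costs little; when it is large, the reused vector may be far from the optimum.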
- FIG. 9 is a diagram of a drift-free transcoder 918 consistent with this invention.
- Transcoder 918 may be part of first gateway 520 between first network 512 and second network 522 .
- video service 602 generates a digital video signal 604 , which is encoded by front-encoder 606 using motion estimation.
- Front-encoder 606 generates an encoded video signal 608 and motion vector signal 626 .
- the input to transcoder 918 is encoded digital video 608 from first network 512 .
- Transcoder 918 includes a cascaded decoder 914 and encoder 916 .
- Transcoder 918 decodes the encoded digital video signal 608 in decoder 914 using motion vector signal 626 generated by front-encoder 606 .
- the output of decoder 914 is an intermediate decoded video signal 920 .
- Encoder 916 takes intermediate video signal 920 and re-encodes it at a bit and frame rate suitable for second network 522 .
- Encoder 916 performs a refined motion estimation with motion vector signal 626 from front-encoder 606 as an input and generates a new motion vector signal 928 .
- Encoded video signal 922 leaving transcoder 918 is at a lower bit rate, and possibly a lower frame rate, than encoded digital video signal 608 entering transcoder 918 .
- Encoded video signal 922 is transmitted through second network 522 to end-decoder 926 .
- End-decoder 926 decodes encoded video signal 922 into a decoded video signal 930 , which may be viewed by end-user 524 .
- Methods and systems consistent with this invention determine a base motion vector (Bx, By) from at least one incoming motion vector from motion vector signal 626 and generate a delta motion vector (Dx, Dy). Given the base and the delta motion vector, a new motion vector (Ox, Oy) that is part of motion vector signal 928 is the sum of the base motion vector and the delta motion vector, expressed by (Ox, Oy) = (Bx, By) + (Dx, Dy).
- Let (Ix, Iy) be the current input motion vector from incoming motion vector signal 626 in the current frame.
- The base motion vector (Bx, By) is set equal to the input motion vector (Ix, Iy). This is represented by (Bx, By) = (Ix, Iy).
- the delta motion vector (Dx, Dy) is obtained within a much smaller search area S′ than the search area S necessary for a full motion estimation technique used in transcoder 718 by encoder 716 .
- the new motion vector (Ox, Oy) that is part of motion vector signal 928 is the sum of the base motion vector (Bx, By) and the delta motion vector (Dx, Dy). Methods and systems consistent with this invention re-encode the previously encoded video using the new motion vector. Calculation of new motion vector (Ox, Oy) is less computationally intensive than the calculations of transcoder 718 because of the smaller search area. Transcoder 918 also results in better video quality than the method of transcoder 818 .
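The refinement described above, a small search for a delta vector around the base vector, might be sketched as follows. This is an illustration under assumptions and not the claimed apparatus: the reduced search area S′ is modeled as a square of half-width `radius`, the first vector component is treated as a row displacement and the second as a column displacement, SAD is used as the matching cost, and the names `block_sad` and `refine_motion_vector` are hypothetical.

```python
def block_sad(cur_block, ref_frame, row, col):
    """SAD between cur_block and the same-sized reference block whose
    top-left corner is at (row, col)."""
    return sum(
        abs(cur_block[i][j] - ref_frame[row + i][col + j])
        for i in range(len(cur_block))
        for j in range(len(cur_block[0]))
    )


def refine_motion_vector(cur_block, ref_frame, top, left, base_mv, radius=2):
    """Search a (2*radius + 1)^2 window centered on the base motion vector.

    Returns (new_mv, delta_mv, cost), where new_mv = base_mv + delta_mv.
    """
    rows, cols = len(cur_block), len(cur_block[0])
    height, width = len(ref_frame), len(ref_frame[0])
    bx, by = base_mv
    # Visit the zero delta first so that ties keep the incoming vector.
    deltas = [(0, 0)] + [
        (dx, dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
        if (dx, dy) != (0, 0)
    ]
    best = None
    for dx, dy in deltas:
        r, c = top + bx + dx, left + by + dy
        # Skip candidates that fall outside the reference frame.
        if r < 0 or c < 0 or r + rows > height or c + cols > width:
            continue
        cost = block_sad(cur_block, ref_frame, r, c)
        if best is None or cost < best[1]:
            best = ((dx, dy), cost)
    if best is None:
        raise ValueError("no candidate position lies inside the reference frame")
    (dx, dy), cost = best
    return (bx + dx, by + dy), (dx, dy), cost
```

With `radius=2` the search evaluates at most 25 positions per macroblock, compared with hundreds for a typical full search, which is the source of the computational saving claimed for transcoder 918.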
- a technique known as “frame skipping” is prevalent in the prior art.
- the frame rate is measured in frames per second.
- Each frame is a still image, and displaying frames in quick succession creates the illusion of motion.
- the more frames per second the smoother the motion appears.
- Frame skipping is a method of reducing the frame rate in order to allocate more bits to the remaining frames to maintain an acceptable image quality.
- Frame-skipping is also needed when an end-user only supports a lower frame-rate.
- FIG. 10 is a diagram, consistent with this invention, showing motion vectors with frame-skipping. The frames between frame n and frame (n+i+1) are skipped.
- the base motion vector (Bx, By) is determined by summing the incoming motion vectors from motion vector signal 626 . For example, with the sequence of incoming motion vectors (IV n+1 , IV n+2 , . . . , IV n+i ) as shown in FIG.
- This base motion vector is non-optimal because the motion vectors (IV n+1 , IV n+2 , . . . , IV n+i ) are non-optimal due to the quantization errors, as described above.
- the delta motion vector for the frame (n+i+1), (Dx, Dy) n+i+1 is determined by searching for a matching block in the n-th previous reference frame in the manner described above.
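The base-vector computation with frame skipping can be sketched as a component-wise sum. This is a minimal sketch with a hypothetical function name, `base_motion_vector`. Which incoming vectors belong in the list (for instance, whether the vector of the frame being re-encoded is included alongside those of the skipped frames) is left to the caller and is an assumption here, since the summation formula is not reproduced in the text above.

```python
def base_motion_vector(incoming_vectors):
    """Component-wise sum of a sequence of incoming motion vectors.

    incoming_vectors: list of (x, y) tuples in temporal order.
    With a single vector the result is that vector, which matches the
    case without frame skipping.
    """
    bx = sum(v[0] for v in incoming_vectors)
    by = sum(v[1] for v in incoming_vectors)
    return (bx, by)
```

Because each incoming vector carries its own quantization-induced error, the summed base vector is only a starting point, and the delta search around it compensates for the accumulated inaccuracy.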
- FIG. 11 is a flow chart of a method, consistent with this invention, for performing motion vector refinement with or without frame skipping.
- The encoded video signal 608 is decoded (step 1102).
- The base motion vector, which is dependent upon the input motion vector signal 626, is determined (step 1104). The base motion vector determination may be different depending on whether there is frame skipping or not.
- A delta motion vector is generated, and a new motion vector signal 928 is generated (step 1108) by summing the base and delta motion vectors, as explained above.
- The video signal is re-encoded using the new motion vector to generate an encoded video output 922.
- FIG. 12 is a block diagram of an apparatus, consistent with this invention, that performs motion vector refinement with or without frame skipping.
- Decoder 914 inputs encoded video signal 608 and outputs intermediate decoded video signal 920 .
- A base motion vector circuit 1204 inputs motion vector signal 626 and outputs base motion vector signal 1210, which may be different depending on whether there is frame skipping or not.
- a delta motion vector circuit 1202 inputs base motion vector signal 1210 and intermediate video signal 920 and generates a delta motion vector signal 1212 .
- a new motion vector circuit 1206 inputs delta motion vector signal 1212 and base motion vector signal 1210 and sums them to generate new motion vector signal 928 .
- An encoder 1208 inputs new motion vector signal 928 and intermediate video signal 920 and outputs encoded digital video signal 922 .
- Delta motion vector circuit 1202, new motion vector circuit 1206, base motion vector circuit 1204, encoder 1208, and decoder 914 are implemented in a computer as instructions in a computer-readable medium.
- Methods and systems consistent with this invention may further reduce the required computation of transcoder 918 by performing the motion vector refinement adaptively.
- the main cause of non-optimum motion vectors is the differential quantization errors, as shown above.
- If SDRE(Bx, By) is small compared to SAD(Bx, By), the incoming motion vectors of motion vector signal 626 are near the optimum and the transcoder may not need to perform motion vector refinement.
- the SDRE may be approximated in transcoder 918 .
- the SDRE is small.
- The SDRE may be approximated as $$SDRE(Bx, By) \approx \sum_{i}^{M} \sum_{j}^{N} \left| \Delta_s^p(i+Bx,\, j+By) \right|,$$
- The SDRE may be approximated by $$SDRE(Bx, By) \approx \left( \frac{q_1^2}{q_2^2} - 1 \right) \sum_{i}^{M} \sum_{j}^{N} \left| \Delta_s^p(i+Bx,\, j+By) \right|$$
- $q_1$ is the quantization step-size used in the current frame of front-encoder 606 and $q_2$ is the quantization step-size used in the previous frame of second encoder 916.
- the complexity of this computation is similar to checking one search position in the motion estimation, so it does not require much new computation.
- the motion vector refinement is not performed, and the incoming motion vector signal 626 is reused as the outgoing motion vector signal 928 , i.e., the delta motion vector is set equal to zero.
- a predetermined higher second SDRE threshold can be used to prefer the reuse of the zero incoming motion vector signal 626 as the outgoing motion vector signal 928 .
- the motion vector refinement is not performed, i.e., the delta motion vector is set equal to zero and the incoming motion vector signal 626 is reused as the outgoing motion vector signal 928 .
- Reuse of a zero motion vector is preferable because a non-zero motion vector will need more bits to code.
- reuse of the incoming motion vector signal 626 can be accomplished by setting the delta motion vector to zero.
- methods and systems consistent with this invention may also apply adaptive motion vector refinement. Also, often a large number of macroblocks are non-coded. In methods or systems consistent with this invention, these macroblocks are not subject to motion vector refinement.
- FIG. 13 is a flow diagram, consistent with this invention, summarizing adaptive motion vector refinement.
- the appropriate base motion vector and SDRE are calculated (step 1302 ). If the base motion vector is zero (step 1304 ) and SDRE is greater than the second threshold (step 1306 ), then motion vector refinement is applied. If the base motion vector is zero (step 1304 ) and SDRE is not greater than the second threshold (step 1306 ), then motion vector refinement is not applied. If the base motion vector is not zero (step 1304 ) and SDRE is greater than the first threshold (step 1308 ), then motion vector refinement is applied. If the base motion vector is not zero (step 1304 ) and SDRE is not greater than the first threshold (step 1308 ), then motion vector refinement is not applied. In one embodiment, the first threshold and the second threshold are empirically set to 300 and 500 , respectively.
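The decision logic of FIG. 13 as described above reduces to a single comparison against one of two thresholds. The sketch below follows the stated embodiment values of 300 and 500; the function name `should_refine` and its signature are hypothetical, and the SDRE value is assumed to have been computed or approximated beforehand.

```python
FIRST_THRESHOLD = 300   # applied when the base motion vector is non-zero
SECOND_THRESHOLD = 500  # applied when the base motion vector is zero


def should_refine(base_mv, sdre,
                  first_threshold=FIRST_THRESHOLD,
                  second_threshold=SECOND_THRESHOLD):
    """Return True when motion vector refinement should be applied.

    A zero base vector is compared against the higher second threshold,
    which biases the transcoder toward reusing zero vectors. When the
    function returns False, the delta motion vector is simply zero.
    """
    threshold = second_threshold if base_mv == (0, 0) else first_threshold
    return sdre > threshold
```

For example, an SDRE of 400 triggers refinement for a non-zero base vector but not for a zero one, reflecting the stated preference for zero vectors because they need fewer bits to code.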
- this invention may be implemented in an environment other than a gateway connecting two networks.
- the invention could reside on a computer and be used to decrease the size of encoded video files, without transmitting the files.
- the invention would reside in a computer, not a gateway.
- the invention is not limited to operation in hardware such as an application specific integrated circuit.
- the invention could be implemented using software.
- Although the SAD function is used to find the optimal motion vector, other functions that are well known in the art may be used. Also, although adaptive motion vector refinement is used to decrease the bit rate of a digital video signal, nothing prohibits it from being used where the bit rate is not reduced.
Abstract
Description
- 1. Field of the Invention
- This invention relates generally to digital video compression and, in particular, to re-encoding a previously encoded digital video signal, also known as transcoding.
- 2. Description of the Related Art
- Consumer electronic equipment increasingly uses digital video technology. Because it improves picture quality, this digital technology is superior to the analog technology currently used in most commercial broadcasting and traditional VCRs. With digital video, motion picture image information is stored in the form of bits of digital data, i.e., 0s and 1s. This information may be transmitted in the form of a stream of bits, also known as a “digital video signal.” Conventional digital video signals, however, require undesirably wide channels for transmission and undesirably large amounts of memory for storage. To avoid these problems, digital video signals are often “compressed” or “encoded.” Compression and encoding allow the same video information (or nearly the same information) to be represented using fewer bits. These encoding techniques have allowed for technical advances in the fields of digital broadcast television, digital satellite television, video teleconferencing, and video electronic mail.
- Digital video compression reduces the number of bits by removing redundant information, without seriously affecting the quality of the video. Standard video compression techniques remove spatial redundancy within a video frame and remove temporal redundancy between video frames. To remove spatial redundancy, encoders commonly use a Discrete Cosine Transform (DCT), which is widely known and understood. To remove temporal redundancy, encoders commonly use motion estimation, which is also widely known and understood.
- With regard to motion estimation, the images in a digital video usually do not change much within small time intervals, i.e., adjacent frames include a great deal of redundant information. Thus, motion-estimation takes advantage of this redundancy and encodes a video frame based on other video frames temporally close to it. For example, in a particular movie scene, the background trees (outdoor scene) or furniture (indoor scene) may not move. Therefore, video information related to the background may not necessarily have to be transmitted multiple times, reducing the number of bits to be transmitted or stored. On the other hand, if the camera is panning, the background may “move” on the video screen. In this case, it is possible to avoid transmitting background information. Instead of encoding and transmitting the background information multiple times, it is possible to encode it once and subsequently transmit information related to its movement. Techniques related to this process are called motion estimation.
- A device known as a transcoder may employ motion estimation. A transcoder reduces the bit-rate of an already compressed video bit-stream, allowing the bit stream to be transmitted through a narrower channel. Transcoders are often used by video services that operate over more than one type of network. In this situation, the different networks may have different bandwidths, thus each end-user may require a different Quality of Service (QoS). Therefore, “gateways” between the networks employ transcoders to adapt video bit-rates to different end-users on different networks.
- There are two well known types of transcoders. First, the simplest type is an “open-loop transcoder.” In this type of transcoder, the incoming bit-rate is reduced by the well-known mathematical technique of truncating or re-quantizing the DCT coefficients. In other words, the encoded bits that represent the higher quality aspects of the video are discarded. Because this transcoding is done in the “coded domain,” i.e., it is done without decoding the signal, these transcoders are simple and fast. Open-loop transcoding, however, produces increased distortion caused by a “drift” due to the mismatched reconstructed pictures in the encoder and the decoder. This distortion may result in an unacceptable video quality in many applications.
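- The coded-domain bit-rate reduction described above can be illustrated with a minimal sketch. It assumes a plain uniform quantizer and coefficients listed from low to high frequency; real codecs use dead zones and weighting matrices, and the function name `requantize_block` and its parameters are hypothetical, chosen only for this illustration.

```python
def requantize_block(levels, q_in, q_out, keep=None):
    """Requantize quantized DCT levels from step q_in to a coarser step q_out.

    levels: coefficient levels ordered from low to high frequency (assumed).
    keep:   optional number of low-frequency coefficients to retain;
            everything after that index is discarded (high-frequency cutting).
    """
    out = []
    for idx, level in enumerate(levels):
        if keep is not None and idx >= keep:
            out.append(0)  # high-frequency coefficient is cut
            continue
        value = level * q_in  # approximate dequantized coefficient
        # Round the magnitude to the nearest multiple of q_out, keep the sign.
        mag = (abs(value) + q_out // 2) // q_out
        out.append(mag if value >= 0 else -mag)
    return out


levels = [12, -7, 3, 1, 0, -1, 0, 0]
coarse = requantize_block(levels, q_in=4, q_out=10, keep=4)
print(coarse)  # [5, -3, 1, 0, 0, 0, 0, 0]
```

Because nothing is decoded back to pixels, the sketch is cheap, but the encoder-side and decoder-side reference pictures no longer match, which is the source of the drift mentioned above.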
- A second, more complicated, type of transcoder is a “drift-free” transcoder. It operates by decoding the incoming coded video and then re-encoding the video at a lower bit-rate. Using this method, it is possible to take advantage of useful information in the encoded video arriving at the transcoder, such as picture type, motion vectors, quantization step-size, bit-allocation statistics, etc. It is possible to construct transcoders with different complexity and performance with regard to coding efficiency and video quality.
- Traditionally, the second type of transcoding does not employ motion estimation for two reasons. First, transcoders must operate very quickly and motion estimation is computationally complex and thus expensive to implement. Second, it is widely assumed that re-using motion vectors extracted from the incoming encoded video is as good as performing a new motion estimation, thus motion estimation is not performed in the transcoder. In some applications, however, this reuse scheme introduces significant quality degradation because reused motion vectors are not the optimal motion vectors.
- It is thus desirable to provide a method and apparatus for motion estimation when re-encoding a digital video signal that is not computationally complex and that provides improved video quality.
- The advantages and purposes of the invention are set forth in part in the description which follows, and in part are obvious from the description, or may be learned by practice of the invention. The advantages and purposes of the invention are realized and attained by means of the elements and combinations particularly pointed out in the appended claims.
- To attain the advantages and in accordance with the purpose of the invention, as embodied and broadly described herein, the invention comprises a method and apparatus of re-encoding digital video from a previously encoded digital video having at least one input motion vector. The method comprises the steps of determining a base motion vector from the at least one input motion vector, generating a delta motion vector, generating a new motion vector that is the sum of the base motion vector and the delta motion vector, and re-encoding the previously encoded digital video using the new motion vector.
- The summary and the following detailed description should not restrict the scope of the claimed invention. Both provide examples and explanations to enable others to practice the invention. The accompanying drawings, which form part of the detailed description, show several embodiments of the invention and, together with the description, explain the principles of the invention.
- The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate several embodiments of the invention and together with the description, serve to explain the principles of the invention. In the drawings,
- FIGS. 1A and 1B are block diagrams illustrating motion estimation that is prevalent in the prior art;
- FIG. 2 is a diagram of major components of an environment, prevalent in the prior art, that uses a front-encoder and an end-decoder;
- FIG. 3 is a block diagram of the major components of the front-encoder shown in FIG. 2 that uses motion estimation;
- FIG. 4 is a block diagram of the major components of the end-decoder shown in FIG. 2 that uses motion compensation;
- FIG. 5 is a diagram of the major components of an environment which may require transcoders;
- FIG. 6 is a diagram of an open-loop transcoder prevalent in the prior art;
- FIG. 7 is a diagram of a drift-free transcoder prevalent in the prior art;
- FIG. 8 is a diagram of a drift-free transcoder prevalent in the prior art;
- FIG. 9 is a diagram of a drift-free transcoder consistent with this invention;
- FIG. 10 is a diagram illustrating frame skipping consistent with this invention;
- FIG. 11 is a flow chart, consistent with this invention, of a method for performing motion vector refinement with and without frame skipping;
- FIG. 12 is a block diagram of an apparatus, consistent with this invention, that performs motion vector refinement with or without frame skipping; and
- FIG. 13 is a flowchart of a method, consistent with this invention, for adaptively applying motion vector refinement.
- The following description of the embodiments of this invention refers to the accompanying drawings. Where appropriate the same reference numbers in different drawings refer to the same or similar elements.
- A. Overview of Encoding/Decoding
- Before discussing transcoders and this invention, it is first necessary to describe digital video signals, and the process of encoding and decoding digital video signals. This is necessary because of the close relationship between encoding, decoding, and transcoding.
- As is well known in the art, a digital video signal is conceptually divided into four components for encoding: frames, pixels, blocks, and macroblocks. A frame is a single still image in a sequence of images. Displaying this sequence in quick succession creates the illusion of motion. A pixel (picture element) is a single point in a frame. Television and computer monitors display frames by dividing the display screen into thousands (or millions) of pixels, arranged in rows and columns. The pixels are so close together that they appear connected. Frames are divided into macroblocks, which are rectangular groups of pixels. Macroblocks are the units for motion-estimation encoding. Macroblocks in turn can be divided into blocks. There are two types of blocks: luminance (brightness) blocks and chrominance (color) blocks. Blocks are used for DCT encoding.
- Video encoding removes redundant information from the digital video signal. To remove temporal redundancy, encoders commonly use motion estimation. FIGS. 1A and 1B are block diagrams illustrating motion estimation prevalent in the prior art. FIG. 1A shows an image in a current macroblock 130 in a current frame 131. During motion estimation, the system compares current macroblock 130 to a previous reference frame 135 over an area S 133, as shown in FIG. 1B. The system searches the previous reference frame 135 to find a closest match 136 for current macroblock 130. Closest match 136 in the previous reference frame 135 is in the position of current macroblock 130 in current frame 131, but displaced by a motion vector 134. Current macroblock 130 is M pixels by N pixels. Pc is a pixel value in the current macroblock 130 and Rp is a pixel value in the closest match 136.
- FIG. 2 is a block diagram of major components of an environment, prevalent in the prior art, that uses an encoder and decoder. A video service 202 creates a digital video signal 204 that needs to be compressed for transmission purposes. A front-encoder 206 uses motion estimation to create an encoded video signal 208 along with a motion vector signal 216. Encoded video signal 208 is transmitted over a channel, such as through a network (not shown), to an end-decoder 210. End-decoder 210 decodes signal 208 and generates a decoded signal 212 for end-user 214. End-decoder 210 uses the same motion vector signal 216, which front-encoder 206 generated and transmitted to decoder 210 through an overhead channel (not shown).
- FIG. 3 is a block diagram of the major components of front-encoder 206 of FIG. 2 that uses motion estimation. A first frame signal from digital video input 204 with matching block 136 enters encoder 206 and is transformed at DCT 308 and quantized at 310 to generate a coded signal 312. Coded signal 312 is variable-length coded (VLC-ed) at 316 to generate encoded digital video output 208. Variable length coding and decoding is prevalent in the prior art. Coded signal 312 is also inverse quantized at 318, inverse DCT-ed at 320, and stored in a frame memory 326.
- A second frame signal from digital video input 204 with current macroblock 130 then enters encoder 206. Before this macroblock is DCT-ed at 308, however, the encoder performs a motion estimation (described below in detail) at 324 on the second frame signal and the frame signal stored at 326, which generates a motion vector 134 that is part of motion vector signal 216. Frame memory 326 outputs matching macroblock 136 to a summer 304. Matching block 136 is then subtracted from current macroblock 130 at summer 304 to generate a difference signal 306. Then, difference signal 306 is DCT-ed at 308, quantized at 310, VLC-ed at 316, and output by encoder 206. The output signal 208 is encoded video. This process continues for each macroblock in the second frame. Frame memory 326 is supplied with what will be the next reconstructed reference frame to be used for a third frame from digital video input 204 that enters encoder 206.
- After the first frame, each subsequent frame is reconstructed by inverse quantizing at 318, inverse DCT-ing at 320, and adding corresponding matching macroblocks from frame memory 326 (the same matching macroblock that was subtracted at summer 304). Each motion vector is transmitted in motion vector signal 216.
- Referring back to FIGS. 1A and 1B, most video coding standards, including MPEG, H.261, and H.263, perform motion estimation on macroblocks based on the Sum of Absolute Difference (SAD) function. As is taught in the prior art, in order to obtain motion vector 134 for current macroblock 130, most encoders search for a matching block 136 that results in a minimum SAD within a predefined search area S 133 in previous reference frame 135. Thus, the motion vector, MV1 134, of a general motion estimation encoder is obtained by
- $$\mathrm{MV}_1 = \arg\min_{(m,n)\in S} \mathrm{SAD}_f(m,n), \qquad \mathrm{SAD}_f(m,n) = \sum_{i=1}^{M}\sum_{j=1}^{N}\left|P_f^c(i,j) - R_f^p(i+m,\,j+n)\right|$$
- where m and n are components of the motion vectors within predefined search area S. M and N are the dimensions of any macroblock and the largest possible values of i and j, respectively. $P_f^c(i,j)$ and $R_f^p(i+m,j+n)$ represent a pixel in the current frame and a pixel displaced by (m, n) in the previous reference frame, respectively. The superscript “c” or “p” represents the current or previous frame, respectively. The subscript “f” indicates that this is the first-stage or front-end encoder.
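- The minimum-SAD search described above can be sketched as an exhaustive block-matching loop. This is a toy illustration, not the encoder of FIG. 3: frames are plain lists of rows, vectors are given as (row, column) displacements, and the names `sad`, `full_search`, and `search_range` are hypothetical.

```python
def sad(cur, ref, top, left, m, n, size):
    """Sum of absolute differences between the size x size block of `cur`
    at (top, left) and the block of `ref` displaced by (m, n)."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(cur[top + i][left + j] - ref[top + i + m][left + j + n])
    return total


def full_search(cur, ref, top, left, size, search_range):
    """Return ((m, n), cost) with minimum SAD over all displacements
    whose components lie within +/- search_range."""
    height, width = len(ref), len(ref[0])
    best = None
    for m in range(-search_range, search_range + 1):
        for n in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= top + m <= height - size and 0 <= left + n <= width - size):
                continue
            cost = sad(cur, ref, top, left, m, n, size)
            if best is None or cost < best[1]:
                best = ((m, n), cost)
    return best


# Toy frames: the current frame is the reference shifted by (1, 2).
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = [[ref[min(r + 1, 7)][min(c + 2, 7)] for c in range(8)] for r in range(8)]
print(full_search(cur, ref, top=3, left=3, size=2, search_range=2))  # ((1, 2), 0)
```

The cost grows with the square of the search range, which is why a full search is considered expensive in the discussion that follows.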
- FIG. 4 is a block diagram of the major components of end-decoder 210 shown in FIG. 2 using motion compensation. Encoded input 208 is variable length decoded at 403, inverse quantized at 404, and inverse discrete cosine transformed at 408 to form a signal 410. A summer 412 adds to signal 410 the same macroblock signal 418 that had been subtracted from it in encoder 206, thus providing a decoded digital video output signal 212. The decoder does this by using the frame memory 414 and motion compensator 416. Motion compensator 416 uses the motion vector signal 216 generated by front-encoder 206. Motion vector signal 216 is transmitted to the end-decoder on an overhead channel.
- B. Transcoders
- Transcoders further compress, or reduce the bit rate of, encoded digital video signals. FIG. 5 is a diagram of the major components of an environment which may require transcoders. This environment 500 includes a network of different networks interconnected by gateways. A first network 512 is a public switched telephone network (PSTN) and a second network 522 is a wireless network. A first gateway 520 interconnects first network 512 and second network 522. First network 512 may have different characteristics than second network 522 and may provide its end-users with a different QoS. For instance, assume that first network 512 has a higher QoS than second network 522 and video is transmitted through gateway 520 from first network 512 to second network 522. Gateway 520 reduces the QoS, or bit-rate, of the video so that second network 522 can carry the video and deliver it to end-user 524. Gateway 520 employs a transcoder to reduce the bit rate.
- A third network 504 is the Internet. A second gateway 506 connects first network 512 and third network 504 so that video can be delivered to end-user 502. A fourth network 514 is an N-ISDN. A third gateway 508 connects third network 504 and fourth network 514. A fourth gateway 510 connects third network 504 and second network 522. Each gateway employs a transcoder of some type.
- FIG. 6 is a diagram of an open-loop transcoder 601, prevalent in the prior art. Transcoder 601 is part of second gateway 506 between first network 512 and third network 504. Video service 602 generates a digital video signal 604, which is encoded by front-encoder 606 using motion estimation. Front-encoder 606 generates an encoded video signal 608 and a motion vector signal 626. Encoded digital video 608 from first network 512 is supplied as input to transcoder 601. Input signal 608 first goes through a variable length decoder (VLD) 610, generating a second signal 612. Although variable length decoding has taken place, second signal 612 is still in the “coded domain” because it has not yet gone through an inverse discrete cosine transform (IDCT), and a video image has not been reconstructed. Second signal 612 then goes through a process of high frequency cutting and requantization 616, generating a third signal 618. In this process 616, high frequency DCT coefficients are discarded and remaining DCT coefficients are requantized. Third signal 618 is variable length coded (VLC-ed) at 104, generating an encoded digital video output 622.
- Encoded output 622 is at a lower bit rate than the encoded digital video input 608. The encoded digital video output 622 is prepared for third network 504. The amount and type of requantization and high-frequency cutting performed by process 616 are determined by a bit allocation analyzer 614, which considers the bit rate of second signal 612 and the needed rate constraint. In third network 504, the signal 622 is end-decoded 624 for end-user 502. End-decoder 624 uses the same motion vector signal 626 that front-encoder 606 generated and transmitted in an overhead channel. Open-loop transcoding, however, produces increased distortion caused by the drift due to the mismatched reconstructed pictures in the encoder and the decoder. This distortion may result in an unacceptable video quality in many applications.
- FIG. 7 is a diagram of a drift-free transcoder 718 prevalent in the prior art. Transcoder 718 is part of third gateway 508 between third network 504 and fourth network 514. Video service 702 generates a digital video signal 704, which is encoded by front-encoder 706 using motion estimation. Front-encoder 706 generates an encoded video signal 712 and a motion vector signal 708. The input to transcoder 718 is encoded digital video 712 from third network 504. Transcoder 718 includes a cascaded decoder 714 and encoder 716. Transcoder 718 decodes the encoded digital video signal 712 in decoder 714 using motion vector signal 708 generated by front-encoder 706. Thus, the output of decoder 714 is an intermediate decoded video signal 720.
- Encoder 716 takes intermediate digital video signal 720 and re-encodes it at a bit and frame rate suitable for fourth network 514. Encoder 716 performs an entirely new motion estimation and generates a new motion vector 724. Encoded video signal 722 leaving transcoder 718 is at a lower bit rate, and possibly a lower frame rate, than encoded digital video signal 712 entering transcoder 718. Encoded video signal 722 is transmitted through fourth network 514 to end-decoder 726. End-decoder 726 decodes second encoded video signal 722 into a decoded digital video signal 728, which may be viewed by end-user 516.
- In transcoder 718, the intermediate signal 720 is considered as the original video signal for encoder 716 in transcoder 718. Encoder 716 is also known as a second-stage encoder 716. Thus, a motion vector MV2 from motion vector signal 724 for a macroblock in the second-stage motion estimation process is given by
- $$\mathrm{MV}_2 = \arg\min_{(m,n)\in S} \mathrm{SAD}_s(m,n), \qquad \mathrm{SAD}_s(m,n) = \sum_{i=1}^{M}\sum_{j=1}^{N}\left|P_s^c(i,j) - R_s^p(i+m,\,j+n)\right|$$
- where the pixels of the previously reconstructed frame and the current frame in the second-stage encoder are $R_s^p(i,j)$ and $P_s^c(i,j)$, respectively. The subscript “s” indicates the second-stage encoder.
-
- Here, $\Delta_f^c(i,j)$ represents the quantization error of the current frame in the first-stage encoding process, while $\Delta_s^p(i,j)$ represents the quantization error of the previous frame in the second-stage encoding process. Therefore, the motion vector is defined by the motion vector at the first-stage encoder and the quantization errors from the first and the second-stage encoders.
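- One way to write the relation these definitions describe is sketched below. It rests on two assumptions that are not stated in the text: each quantization error is taken as reconstructed value minus input value, and the second-stage encoder's input is exactly the first-stage reconstruction (written here as $R_f^c$, a symbol introduced only for this sketch).

```latex
\begin{aligned}
P_s^c(i,j) &= R_f^c(i,j) = P_f^c(i,j) + \Delta_f^c(i,j), \\
R_s^p(i,j) &= R_f^p(i,j) + \Delta_s^p(i,j), \\
\mathrm{SAD}_s(m,n) &= \sum_{i=1}^{M}\sum_{j=1}^{N}\left| P_s^c(i,j) - R_s^p(i+m,\,j+n) \right| \\
&= \sum_{i=1}^{M}\sum_{j=1}^{N}\left| P_f^c(i,j) - R_f^p(i+m,\,j+n) + \Delta_f^c(i,j) - \Delta_s^p(i+m,\,j+n) \right|.
\end{aligned}
```

Under these assumptions the second-stage cost is the first-stage residual perturbed by the difference of the two quantization errors, which matches the statement that the second-stage vector depends on the first-stage vector and on both sets of quantization errors.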
- The problem with transcoder 718, however, is that it is computationally complex and expensive to implement. To reduce the complexity, transcoders commonly reuse motion vector signal 708 from the overhead channel of the incoming encoded video 712. FIG. 8 is a diagram of a drift-free transcoder 818 that uses this method to overcome the computation problems of transcoder 718. Transcoder 818 is part of fourth gateway 510 between third network 504 and second network 522. Again, video service 702 generates digital video signal 704, which is encoded by front-encoder 706 using motion estimation. Front-encoder 706 generates encoded video signal 712 and motion vector signal 708. The input to transcoder 818 is encoded digital video 712 from third network 504. Transcoder 818 includes a cascaded decoder 814 and an encoder 816. Transcoder 818 decodes the encoded video signal 712 in decoder 814 using motion vector signal 708 generated by front-encoder 706. Thus, the output of decoder 814 is an intermediate decoded video signal 820.
- Encoder 816 takes intermediate digital video signal 820 and re-encodes it with a bit and frame rate suitable for second network 522. Encoder 816 does not perform a new motion estimation. Instead, as an approximation it reuses motion vector signal 708 generated by front-encoder 706. Encoded video signal 822 leaving transcoder 818 is at a lower bit rate, and possibly a lower frame rate, than encoded video signal 712 entering transcoder 818. Encoded video signal 822 is transmitted through second network 522 to end-decoder 826. End-decoder 826 decodes encoded video signal 822 into a decoded video signal 828, which may be viewed by end-user 524.
- Transcoder 818, however, introduces significant quality degradation in many applications. In analyzing this second prior art method, let (Ix, Iy) be a motion vector from motion vector signal 708 received during decoding 814 (i.e., received from the overhead channel). $\mathrm{SAD}_s$(Ix, Iy) with this motion vector can be represented as
-
-
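- A plausible form of this relation, offered as a sketch rather than as the original equations, follows from the triangle inequality. It assumes that each quantization error is reconstructed value minus input value, that the second-stage input equals the first-stage reconstruction, and that SDRE denotes the sum of absolute differential quantization errors; that reading of SDRE is an assumption made here for illustration.

```latex
\begin{aligned}
\mathrm{SAD}_s(I_x,I_y)
&= \sum_{i=1}^{M}\sum_{j=1}^{N}\left| \big(P_f^c(i,j) - R_f^p(i+I_x,\,j+I_y)\big) + \big(\Delta_f^c(i,j) - \Delta_s^p(i+I_x,\,j+I_y)\big) \right| \\
&\le \mathrm{SAD}_f(I_x,I_y) + \mathrm{SDRE}(I_x,I_y), \\
\mathrm{SDRE}(I_x,I_y) &= \sum_{i=1}^{M}\sum_{j=1}^{N}\left| \Delta_f^c(i,j) - \Delta_s^p(i+I_x,\,j+I_y) \right|.
\end{aligned}
```

Read this way, the incoming vector minimizes only the first term; when the second term is large, a different displacement may give a lower second-stage SAD.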
- This relation reveals that the reuse of the incoming motion vector signal 708 in encoder 816 and end-decoder 826 results in non-optimal motion vectors due to the differential quantization errors. This causes quality degradation. When the bit-rate of the transcoder output 822 is not significantly less than the bit-rate of the transcoder input 712, then the reuse of the incoming motion vectors may not cause significant quality degradation because the differential quantization errors and the SDRE are relatively small. When the difference in the bit-rate is not small, however, the quality degradation may be significant and more accurate motion vectors are desirable.
- From the above analysis of the known art, this invention may be understood. FIG. 9 is a diagram of a drift-free transcoder 918 consistent with this invention. Transcoder 918 may be part of first gateway 520 between first network 512 and second network 522. Again, video service 602 generates a digital video signal 604, which is encoded by front-encoder 606 using motion estimation. Front-encoder 606 generates an encoded video signal 608 and motion vector signal 626. The input to transcoder 918 is encoded digital video 608 from first network 512. Transcoder 918 includes a cascaded decoder 914 and encoder 916. Transcoder 918 decodes the encoded digital video signal 608 in decoder 914 using motion vector signal 626 generated by front-encoder 606. Thus, the output of decoder 914 is an intermediate decoded video signal 920.
- Encoder 916 takes intermediate video signal 920 and re-encodes it at a bit and frame rate suitable for second network 522. Encoder 916 performs a refined motion estimation with motion vector signal 626 from front-encoder 606 as an input and generates a new motion vector signal 928. Encoded video signal 922 leaving transcoder 918 is at a lower bit rate, and possibly a lower frame rate, than encoded digital video signal 608 entering transcoder 918. Encoded video signal 922 is transmitted through second network 522 to end-decoder 926. End-decoder 926 decodes encoded video signal 922 into a decoded video signal 930, which may be viewed by end-user 524.
- Now, the motion estimation of encoder 916 is explained. The differential quantization errors in transcoder 918 cause a perturbation in the position of the optimum motion vector for encoding intermediate signal 920. Therefore, instead of applying a full-scale motion vector estimation or re-using the incoming motion vectors, methods and systems consistent with this invention use “motion vector refinement.” Methods and systems consistent with this invention determine a base motion vector (Bx, By) from at least one incoming motion vector from motion vector signal 626 and generate a delta motion vector (Dx, Dy). Given the base and the delta motion vector, a new motion vector (Ox, Oy) that is part of motion vector signal 928 is the sum of the base motion vector and the delta motion vector, expressed by
- (Ox, Oy) = (Bx, By) + (Dx, Dy).
- Here, let (Ix, Iy) be the current input motion vector from incoming motion vector signal 626 in the current frame. In the case where there is no frame skipping (described below), the base motion vector (Bx, By) is set equal to the input motion vector (Ix, Iy). This is represented by
- (Ox, Oy) = (Bx, By) + (Dx, Dy) = (Ix, Iy) + (Dx, Dy).
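- The base-plus-delta idea can be sketched as a small local search around the incoming vector. This is a minimal illustration under stated assumptions: frames are lists of rows, vectors are (row, column) displacements, the base vector is assumed to point inside the reference frame, and the names `block_sad`, `refine`, and `delta_range` are hypothetical.

```python
def block_sad(cur, ref, top, left, m, n, size):
    """SAD between the block of `cur` at (top, left) and the block of `ref`
    displaced by (m, n)."""
    return sum(
        abs(cur[top + i][left + j] - ref[top + i + m][left + j + n])
        for i in range(size)
        for j in range(size)
    )


def refine(cur, ref, top, left, size, base, delta_range=1):
    """Search only base + delta, with each delta component in
    [-delta_range, delta_range]. Returns (new_vector, delta, cost)."""
    height, width = len(ref), len(ref[0])
    best_delta = (0, 0)  # start from plain reuse of the base vector
    best_cost = block_sad(cur, ref, top, left, base[0], base[1], size)
    for d0 in range(-delta_range, delta_range + 1):
        for d1 in range(-delta_range, delta_range + 1):
            m, n = base[0] + d0, base[1] + d1
            # Skip candidates that fall outside the reference frame.
            if not (0 <= top + m <= height - size and 0 <= left + n <= width - size):
                continue
            cost = block_sad(cur, ref, top, left, m, n, size)
            if cost < best_cost:
                best_delta, best_cost = (d0, d1), cost
    new_vector = (base[0] + best_delta[0], base[1] + best_delta[1])
    return new_vector, best_delta, best_cost


# Toy frames: true displacement is (1, 2); the incoming vector (1, 1) is slightly off.
ref = [[r * 8 + c for c in range(8)] for r in range(8)]
cur = [[ref[min(r + 1, 7)][min(c + 2, 7)] for c in range(8)] for r in range(8)]
print(refine(cur, ref, top=3, left=3, size=2, base=(1, 1)))  # ((1, 2), (0, 1), 0)
```

With `delta_range=1` only nine candidates are evaluated, compared with the full window of an ordinary motion search, which is where the computational saving comes from.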
- Given the base motion vector (Bx, By), the delta motion vector (Dx, Dy) is obtained within a much smaller search area S′ than the search area S necessary for a full motion estimation technique used in transcoder 718 by encoder 716. Methods and systems consistent with this invention generate the delta motion vector by
- $$(D_x, D_y) = \arg\min_{(m,n)\in S'} \mathrm{SAD}_s(B_x+m,\,B_y+n), \qquad \mathrm{SAD}_s(B_x+m,\,B_y+n) = \sum_{i=1}^{M}\sum_{j=1}^{N}\left|P_s^c(i,j) - R_s^p(i+B_x+m,\,j+B_y+n)\right|$$
- The new motion vector (Ox, Oy) that is part of motion vector signal 928 is the sum of the base motion vector (Bx, By) and the delta motion vector (Dx, Dy). Methods and systems consistent with this invention re-encode the previously encoded video using the new motion vector. Calculation of new motion vector (Ox, Oy) is less computationally intensive than the calculations of transcoder 718 because of the smaller search area. Transcoder 918 also results in better video quality than the method of transcoder 818.
- C. Transcoding and Frame Skipping
- A technique known as “frame skipping” is prevalent in the prior art. The frame rate is measured in frames per second. Each frame is a still image, and displaying frames in quick succession creates the illusion of motion. The more frames per second, the smoother the motion appears. Frame skipping is a method of reducing the frame rate in order to allocate more bits to the remaining frames to maintain an acceptable image quality. Frame-skipping is also needed when an end-user only supports a lower frame-rate.
- Methods and systems consistent with this invention may also employ frame skipping, i.e., transcoder 918 may perform a frame rate conversion by skipping frames. FIG. 10 is a diagram, consistent with this invention, showing motion vectors with frame-skipping. The frames between frame n and frame (n+i+1) are skipped. In the case of frame-skipping, the base motion vector (Bx, By) is determined by summing the incoming motion vectors from motion vector signal 626. For example, with the sequence of incoming motion vectors ($IV_{n+1}$, $IV_{n+2}$, . . . , $IV_{n+i}$) as shown in FIG. 10, the base motion vector for the current (n+i+1) frame is derived by adding the current motion vector to the sum of the previous motion vectors since a previous determination of a different base motion vector. This is described by
- $$(B_x, B_y) = (I_x, I_y) + \sum_{k=1}^{i} IV_{n+k}$$
- where (Ix, Iy) is the current input motion vector for frame (n+i+1).
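- The accumulation of incoming vectors across skipped frames can be sketched as a simple component-wise sum. The function name `base_vector` is hypothetical, and the sketch assumes all vectors belong to the same macroblock position, which is a simplification.

```python
def base_vector(skipped_vectors, current_vector):
    """Base motion vector for the current frame: the current incoming vector
    plus the vectors of the frames skipped since the last coded frame."""
    bx = current_vector[0] + sum(v[0] for v in skipped_vectors)
    by = current_vector[1] + sum(v[1] for v in skipped_vectors)
    return (bx, by)


# Two skipped frames, then the current frame.
print(base_vector([(1, 0), (2, -1)], (1, 1)))  # (4, 0)
# Without frame skipping the base vector is just the incoming vector.
print(base_vector([], (3, -2)))  # (3, -2)
```

The second call shows that the no-skipping case is the same rule applied to an empty list of skipped vectors.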
- FIG. 11 is a flow chart of a method, consistent with this invention, for performing motion vector refinement with or without frame skipping. First, the encoded video signal 608 is decoded (step 1102). Then the base motion vector, which is dependent upon the input motion vector signal 626, is determined (step 1104). The base motion vector determination may be different depending on whether there is frame skipping or not. In step 1106, a delta motion vector is generated, and a new motion vector signal 928 is generated (step 1108) by summing the base and delta motion vectors, as explained above. Finally, at step 1110, the video signal is re-encoded using the new motion vector to generate an encoded video output 922.
- FIG. 12 is a block diagram of an apparatus, consistent with this invention, that performs motion vector refinement with or without frame skipping. Decoder 914 inputs encoded video signal 608 and outputs intermediate decoded video signal 920. A base motion vector circuit 1204 inputs motion vector signal 626 and outputs base motion vector signal 1210, which may be different depending on whether there is frame skipping or not. A delta motion vector circuit 1202 inputs base motion vector signal 1210 and intermediate video signal 920 and generates a delta motion vector signal 1212. A new motion vector circuit 1206 inputs delta motion vector signal 1212 and base motion vector signal 1210 and sums them to generate new motion vector signal 928. An encoder 1208 inputs new motion vector signal 928 and intermediate video signal 920 and outputs encoded digital video signal 922.
- Alternatively, delta motion vector circuit 1202, new motion vector circuit 1206, base motion vector circuit 1204, encoder 1208, and decoder 914 are implemented in a computer as instructions in a computer-readable medium.
- D. Adaptive Motion Vector Refinement
- Methods and systems consistent with this invention may further reduce the required computation of transcoder 918 by performing the motion vector refinement adaptively. The main cause of non-optimum motion vectors is the differential quantization errors, as shown above. In methods and systems consistent with this invention, when the SDRE(Bx, By) is small compared to SAD(Bx, By), the incoming motion vectors of motion vector signal 626 are near the optimum and the transcoder may not need to perform motion vector refinement.
- The SDRE may be approximated in transcoder 918. When the difference between the quantization step-size used in the current frame of front-encoder 606 and the quantization step-size used in the previous frame of second encoder 916 is small, then the SDRE is small. When the quantization step-size of second encoder 916 is much larger than that of front-encoder 606, the SDRE may be approximated as
-
- where q1 is the quantization step-size used in the current frame of front-encoder 606 and q2 is the quantization step-size used in the previous frame of second encoder 916. The complexity of this computation is similar to checking one search position in the motion estimation, so it does not require much new computation. In methods and systems consistent with this invention, if the estimated SDRE is smaller than a predetermined first threshold, then the motion vector refinement is not performed, and the incoming motion vector signal 626 is reused as the outgoing motion vector signal 928, i.e., the delta motion vector is set equal to zero.
- When a motion vector from incoming motion vector signal 626 has a zero value, a predetermined higher second SDRE threshold can be used to prefer the reuse of the zero incoming motion vector signal 626 as the outgoing motion vector signal 928. In methods and systems consistent with this invention, if the motion vector from incoming motion vector signal 626 is zero and the estimated SDRE is smaller than a predetermined second threshold, then the motion vector refinement is not performed, i.e., the delta motion vector is set equal to zero and the incoming motion vector signal 626 is reused as the outgoing motion vector signal 928. Reuse of a zero motion vector is preferable because a non-zero motion vector will need more bits to code. In methods and systems consistent with this invention, reuse of the incoming motion vector signal 626 can be accomplished by setting the delta motion vector to zero.
- In the case of frame-skipping, methods and systems consistent with this invention may also apply adaptive motion vector refinement. Also, often a large number of macroblocks are non-coded. In methods or systems consistent with this invention, these macroblocks are not subject to motion vector refinement.
- FIG. 13 is a flow diagram, consistent with this invention, summarizing adaptive motion vector refinement. First, the appropriate base motion vector and SDRE are calculated (step 1302). If the base motion vector is zero (step 1304) and SDRE is greater than the second threshold (step 1306), then motion vector refinement is applied. If the base motion vector is zero (step 1304) and SDRE is not greater than the second threshold (step 1306), then motion vector refinement is not applied. If the base motion vector is not zero (step 1304) and SDRE is greater than the first threshold (step 1308), then motion vector refinement is applied. If the base motion vector is not zero (step 1304) and SDRE is not greater than the first threshold (step 1308), then motion vector refinement is not applied. In one embodiment, the first threshold and the second threshold are empirically set to 300 and 500, respectively.
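- The decision flow of FIG. 13 can be sketched as a single predicate. The threshold values are the ones given for the one embodiment above; the function name `should_refine` and the `coded` flag, which models the exclusion of non-coded macroblocks, are hypothetical names for this illustration, and the SDRE estimate is taken as a given input.

```python
FIRST_THRESHOLD = 300   # used when the base motion vector is non-zero
SECOND_THRESHOLD = 500  # higher threshold, used when the base motion vector is zero


def should_refine(base_vector, sdre, coded=True):
    """Return True when motion vector refinement should be applied."""
    if not coded:
        return False  # non-coded macroblocks are never refined
    threshold = SECOND_THRESHOLD if base_vector == (0, 0) else FIRST_THRESHOLD
    return sdre > threshold


print(should_refine((0, 0), 400))  # False: zero vector is reused below 500
print(should_refine((1, 0), 400))  # True: non-zero vector is refined above 300
```

When the predicate is false, the delta motion vector is simply zero and the base vector is reused as the outgoing vector.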
- Those skilled in the art recognize that various modifications and variations can be made in the preceding examples without departing from the scope or spirit of the invention. For example, this invention may be implemented in an environment other than a gateway connecting two networks. For instance, the invention could reside on a computer and be used to decrease the size of encoded video files, without transmitting the files. In this case, the invention would reside in a computer, not a gateway. In addition, the invention is not limited to operation in hardware such as an application specific integrated circuit. The invention could be implemented using software.
- Further, although the SAD function is used to find the optimal motion vector, other functions that are well known in the art may be used. Also, although adaptive motion vector refinement is used to decrease the bit rate of a digital video signal, nothing prohibits it from being used where the bit rate is not reduced.
- The specification does not limit the invention. Instead, it provides examples and explanations to allow persons of ordinary skill to appreciate different ways to practice this invention. The following claims define the true scope and spirit of the invention.
Claims (40)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/276,826 US6466623B1 (en) | 1998-03-27 | 1999-03-26 | Method and apparatus for motion estimation for high performance transcoding |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US7975498P | 1998-03-27 | 1998-03-27 | |
US09/276,826 US6466623B1 (en) | 1998-03-27 | 1999-03-26 | Method and apparatus for motion estimation for high performance transcoding |
Publications (2)
Publication Number | Publication Date |
---|---|
US6466623B1 US6466623B1 (en) | 2002-10-15 |
US20020154698A1 US20020154698A1 (en) | 2002-10-24 |
Family
ID=26762395
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US09/276,826 Expired - Lifetime US6466623B1 (en) | 1998-03-27 | 1999-03-26 | Method and apparatus for motion estimation for high performance transcoding |
Country Status (1)
Country | Link |
---|---|
US (1) | US6466623B1 (en) |
Families Citing this family (68)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7016337B1 (en) * | 1999-03-02 | 2006-03-21 | Cisco Technology, Inc. | System and method for multiple channel statistical re-multiplexing |
US7593433B1 (en) | 1999-03-02 | 2009-09-22 | Cisco Technology, Inc. | System and method for multiple channel statistical re-multiplexing |
US6263503B1 (en) * | 1999-05-26 | 2001-07-17 | Neal Margulis | Method for effectively implementing a wireless television system |
US8266657B2 (en) | 2001-03-15 | 2012-09-11 | Sling Media Inc. | Method for effectively implementing a multi-room television system |
DE19946267C2 (en) * | 1999-09-27 | 2002-09-26 | Harman Becker Automotive Sys | Digital transcoding system |
US6845130B1 (en) * | 2000-10-12 | 2005-01-18 | Lucent Technologies Inc. | Motion estimation and compensation for video compression |
US7088780B2 (en) * | 2001-05-11 | 2006-08-08 | Mitsubishi Electric Research Labs, Inc. | Video transcoder with drift compensation |
US6987866B2 (en) * | 2001-06-05 | 2006-01-17 | Micron Technology, Inc. | Multi-modal motion estimation for video sequences |
US20050053062A1 (en) * | 2001-10-29 | 2005-03-10 | Jan Kall | Adapting the data rate and/or the amount of data of content to be transmitted separately for at least two radio access networks e.g. umts, geran |
US7236529B2 (en) * | 2001-10-30 | 2007-06-26 | Industrial Technology Research Institute | Methods and systems for video transcoding in DCT domain with low complexity |
US7403564B2 (en) * | 2001-11-21 | 2008-07-22 | Vixs Systems, Inc. | System and method for multiple channel video transcoding |
US20030134082A1 (en) * | 2001-12-21 | 2003-07-17 | Morin Brian G. | Carpet comprising a low-shrink backing of polypropylene tape fibers |
US6700935B2 (en) * | 2002-02-08 | 2004-03-02 | Sony Electronics, Inc. | Stream based bitrate transcoder for MPEG coded video |
US7190723B2 (en) * | 2002-03-27 | 2007-03-13 | Scientific-Atlanta, Inc. | Digital stream transcoder with a hybrid-rate controller |
JP4346868B2 (en) * | 2002-06-17 | 2009-10-21 | 株式会社日立製作所 | Video encoding device, video recording / playback device, video playback device, video encoding method, and video recording / playback method |
US7529276B1 (en) | 2002-09-03 | 2009-05-05 | Cisco Technology, Inc. | Combined jitter and multiplexing systems and methods |
EP1445958A1 (en) * | 2003-02-05 | 2004-08-11 | STMicroelectronics S.r.l. | Quantization method and system, for instance for video MPEG applications, and computer program product therefor |
TWI262660B (en) * | 2003-11-19 | 2006-09-21 | Inst Information Industry | Video transcoder adaptively reducing frame rate |
KR20050049964A (en) * | 2003-11-24 | 2005-05-27 | 엘지전자 주식회사 | Apparatus for high speed resolution changing of compressed digital video |
TWI230547B (en) * | 2004-02-04 | 2005-04-01 | Ind Tech Res Inst | Low-complexity spatial downscaling video transcoder and method thereof |
US20050232497A1 (en) * | 2004-04-15 | 2005-10-20 | Microsoft Corporation | High-fidelity transcoding |
CA2569610C (en) * | 2004-06-07 | 2012-11-27 | Sling Media, Inc. | Personal media broadcasting system |
US7769756B2 (en) | 2004-06-07 | 2010-08-03 | Sling Media, Inc. | Selection and presentation of context-relevant supplemental content and advertising |
US8346605B2 (en) | 2004-06-07 | 2013-01-01 | Sling Media, Inc. | Management of shared media content |
US7917932B2 (en) | 2005-06-07 | 2011-03-29 | Sling Media, Inc. | Personal video recorder functionality for placeshifting systems |
US7975062B2 (en) | 2004-06-07 | 2011-07-05 | Sling Media, Inc. | Capturing and sharing media content |
US8099755B2 (en) * | 2004-06-07 | 2012-01-17 | Sling Media Pvt. Ltd. | Systems and methods for controlling the encoding of a media stream |
US9998802B2 (en) | 2004-06-07 | 2018-06-12 | Sling Media LLC | Systems and methods for creating variable length clips from a media stream |
JP2006086964A (en) * | 2004-09-17 | 2006-03-30 | Toshiba Corp | Bit rate conversion apparatus and bit rate conversion method |
EP1899814B1 (en) * | 2005-06-30 | 2017-05-03 | Sling Media, Inc. | Firmware update for consumer electronic device |
JP4534935B2 (en) * | 2005-10-04 | 2010-09-01 | 株式会社日立製作所 | Transcoder, recording apparatus, and transcoding method |
US20080256485A1 (en) * | 2007-04-12 | 2008-10-16 | Jason Gary Krikorian | User Interface for Controlling Video Programs on Mobile Computing Devices |
US8477793B2 (en) * | 2007-09-26 | 2013-07-02 | Sling Media, Inc. | Media streaming device with gateway functionality |
US8350971B2 (en) * | 2007-10-23 | 2013-01-08 | Sling Media, Inc. | Systems and methods for controlling media devices |
US8457958B2 (en) | 2007-11-09 | 2013-06-04 | Microsoft Corporation | Audio transcoder using encoder-generated side information to transcode to target bit-rate |
US8060609B2 (en) * | 2008-01-04 | 2011-11-15 | Sling Media Inc. | Systems and methods for determining attributes of media items accessed via a personal media broadcaster |
US8164862B2 (en) * | 2008-04-02 | 2012-04-24 | Headway Technologies, Inc. | Seed layer for TMR or CPP-GMR sensor |
US8667279B2 (en) | 2008-07-01 | 2014-03-04 | Sling Media, Inc. | Systems and methods for securely place shifting media content |
US8381310B2 (en) * | 2009-08-13 | 2013-02-19 | Sling Media Pvt. Ltd. | Systems, methods, and program applications for selectively restricting the placeshifting of copy protected digital media content |
US8667163B2 (en) | 2008-09-08 | 2014-03-04 | Sling Media Inc. | Systems and methods for projecting images from a computer system |
US9191610B2 (en) * | 2008-11-26 | 2015-11-17 | Sling Media Pvt Ltd. | Systems and methods for creating logical media streams for media storage and playback |
US8438602B2 (en) * | 2009-01-26 | 2013-05-07 | Sling Media Inc. | Systems and methods for linking media content |
US8396114B2 (en) | 2009-01-29 | 2013-03-12 | Microsoft Corporation | Multiple bit rate video encoding using variable bit rate and dynamic resolution for adaptive video streaming |
US8311115B2 (en) | 2009-01-29 | 2012-11-13 | Microsoft Corporation | Video encoding using previously calculated motion information |
US8171148B2 (en) | 2009-04-17 | 2012-05-01 | Sling Media, Inc. | Systems and methods for establishing connections between devices communicating over a network |
US8270473B2 (en) * | 2009-06-12 | 2012-09-18 | Microsoft Corporation | Motion based dynamic resolution multiple bit rate video encoding |
US8406431B2 (en) | 2009-07-23 | 2013-03-26 | Sling Media Pvt. Ltd. | Adaptive gain control for digital audio samples in a media stream |
US9479737B2 (en) | 2009-08-06 | 2016-10-25 | Echostar Technologies L.L.C. | Systems and methods for event programming via a remote media player |
US8966101B2 (en) | 2009-08-10 | 2015-02-24 | Sling Media Pvt Ltd | Systems and methods for updating firmware over a network |
US8532472B2 (en) | 2009-08-10 | 2013-09-10 | Sling Media Pvt Ltd | Methods and apparatus for fast seeking within a media stream buffer |
US9565479B2 (en) | 2009-08-10 | 2017-02-07 | Sling Media Pvt Ltd. | Methods and apparatus for seeking within a media stream using scene detection |
US8799408B2 (en) | 2009-08-10 | 2014-08-05 | Sling Media Pvt Ltd | Localization systems and methods |
US9525838B2 (en) | 2009-08-10 | 2016-12-20 | Sling Media Pvt. Ltd. | Systems and methods for virtual remote control of streamed media |
US9160974B2 (en) | 2009-08-26 | 2015-10-13 | Sling Media, Inc. | Systems and methods for transcoding and place shifting media content |
US8314893B2 (en) * | 2009-08-28 | 2012-11-20 | Sling Media Pvt. Ltd. | Remote control and method for automatically adjusting the volume output of an audio device |
US9015225B2 (en) | 2009-11-16 | 2015-04-21 | Echostar Technologies L.L.C. | Systems and methods for delivering messages over a network |
US8799485B2 (en) | 2009-12-18 | 2014-08-05 | Sling Media, Inc. | Methods and apparatus for establishing network connections using an inter-mediating device |
US8626879B2 (en) | 2009-12-22 | 2014-01-07 | Sling Media, Inc. | Systems and methods for establishing network connections using local mediation services |
US9178923B2 (en) | 2009-12-23 | 2015-11-03 | Echostar Technologies L.L.C. | Systems and methods for remotely controlling a media server via a network |
US9275054B2 (en) | 2009-12-28 | 2016-03-01 | Sling Media, Inc. | Systems and methods for searching media content |
US20110191456A1 (en) * | 2010-02-03 | 2011-08-04 | Sling Media Pvt Ltd | Systems and methods for coordinating data communication between two devices |
US8856349B2 (en) | 2010-02-05 | 2014-10-07 | Sling Media Inc. | Connection priority services for data communication between two devices |
US20110208506A1 (en) * | 2010-02-24 | 2011-08-25 | Sling Media Inc. | Systems and methods for emulating network-enabled media components |
US8705616B2 (en) | 2010-06-11 | 2014-04-22 | Microsoft Corporation | Parallel multiple bitrate video encoding to reduce latency and dependences between groups of pictures |
US9591318B2 (en) | 2011-09-16 | 2017-03-07 | Microsoft Technology Licensing, Llc | Multi-layer encoding and decoding |
US11089343B2 (en) | 2012-01-11 | 2021-08-10 | Microsoft Technology Licensing, Llc | Capability advertisement, configuration and control for video coding and decoding |
US9386267B1 (en) * | 2012-02-14 | 2016-07-05 | Arris Enterprises, Inc. | Cooperative transcoding to multiple streams |
CN108476318A (en) * | 2016-01-14 | 2018-08-31 | 三菱电机株式会社 | Coding efficiency evaluates auxiliary device, coding efficiency evaluation householder method and coding efficiency and evaluates auxiliary program |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU685444B2 (en) * | 1993-03-26 | 1998-01-22 | British Telecommunications Public Limited Company | A transcoder |
US5940130A (en) * | 1994-04-21 | 1999-08-17 | British Telecommunications Public Limited Company | Video transcoder with by-pass transfer of extracted motion compensation data |
DE4416967A1 (en) * | 1994-05-13 | 1995-11-16 | Thomson Brandt Gmbh | Method and device for transcoding bit streams with video data |
EP0690392B1 (en) * | 1994-06-30 | 2001-09-26 | Koninklijke Philips Electronics N.V. | Method and device for transcoding a sequence of coded digital signals |
US6144698A (en) * | 1996-10-31 | 2000-11-07 | Mitsubishi Electric Information Technology Center America, Inc. (Ita) | Digital video decoder and method of decoding a digital video signal |
DE69803195T2 (en) * | 1997-11-27 | 2002-08-29 | British Telecomm | CODE IMPLEMENTATION |
US6058143A (en) * | 1998-02-20 | 2000-05-02 | Thomson Licensing S.A. | Motion vector extrapolation for transcoding video sequences |
US6226338B1 (en) * | 1998-06-18 | 2001-05-01 | Lsi Logic Corporation | Multiple channel data communication buffer with single transmit and receive memories |
GB9929113D0 (en) | 1999-12-09 | 2000-02-02 | Jhall John J | Self-supporting poster screen for advertising off the internet |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9172735B2 (en) * | 2000-01-28 | 2015-10-27 | Comcast Ip Holdings I, Llc | Method and apparatus for content distribution via non-homogeneous access networks |
US20070106814A1 (en) * | 2000-01-28 | 2007-05-10 | Son Yong H | Method and Apparatus for Content Distribution Via Non-Homogeneous Access Networks |
US10257246B2 (en) | 2000-01-28 | 2019-04-09 | Comcast Ip Holdings I, Llc | Content distribution via a distribution network and an access network |
US9596284B2 (en) | 2000-01-28 | 2017-03-14 | Comcast Ip Holdings I, Llc | Content distribution via a distribution network and an access network |
US7058127B2 (en) * | 2000-12-27 | 2006-06-06 | International Business Machines Corporation | Method and system for video transcoding |
US6968396B1 (en) * | 2001-07-26 | 2005-11-22 | Openwave Systems Inc. | Reloading of hypermedia pages by sending only changes |
US20040257434A1 (en) * | 2003-06-23 | 2004-12-23 | Robert Davis | Personal multimedia device video format conversion across multiple video formats |
US20070277221A1 (en) * | 2003-06-23 | 2007-11-29 | Robert Davis | Personal multimedia device video format conversion across multiple video formats |
US7688384B2 (en) | 2003-06-23 | 2010-03-30 | The Directv Group, Inc. | Personal multimedia device video format conversion across multiple video formats |
US20060244868A1 (en) * | 2005-04-27 | 2006-11-02 | Lsi Logic Corporation | Method for composite video artifacts reduction |
US8331458B2 (en) | 2005-04-27 | 2012-12-11 | Lsi Corporation | Method for composite video artifacts reduction |
US20100220235A1 (en) * | 2005-04-27 | 2010-09-02 | Yunwei Jia | Method for composite video artifacts reduction |
US7751484B2 (en) * | 2005-04-27 | 2010-07-06 | Lsi Corporation | Method for composite video artifacts reduction |
US20070165718A1 (en) * | 2006-01-18 | 2007-07-19 | Sony Corporation | Encoding apparatus, encoding method and program |
US20110080944A1 (en) * | 2009-10-07 | 2011-04-07 | Vixs Systems, Inc. | Real-time video transcoder and methods for use therewith |
Also Published As
Publication number | Publication date |
---|---|
US6466623B1 (en) | 2002-10-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6466623B1 (en) | Method and apparatus for motion estimation for high performance transcoding | |
Chen et al. | Efficient motion-estimation algorithm for reduced frame-rate video transcoder | |
US7170938B1 (en) | Rate control method for video transcoding | |
Bjork et al. | Transcoder architectures for video coding | |
EP0691054B1 (en) | Efficient transcoding device and method | |
US9350994B2 (en) | Motion estimation technique for digital video encoding applications | |
US6249318B1 (en) | Video coding/decoding arrangement and method therefor | |
US5870146A (en) | Device and method for digital video transcoding | |
US6091777A (en) | Continuously adaptive digital video compression system and method for a web streamer | |
US7058127B2 (en) | Method and system for video transcoding | |
EP0953254B1 (en) | Motion-compensated predictive image encoding and decoding | |
US6650707B2 (en) | Transcoding apparatus and method | |
AU684901B2 (en) | Method and circuit for estimating motion between pictures composed of interlaced fields, and device for coding digital signals comprising such a circuit | |
US20060018378A1 (en) | Method and system for delivery of coded information streams, related network and computer program product therefor | |
Youn et al. | Motion estimation for high performance transcoding | |
JPH08228156A (en) | Method and equipment for partial repressing digital signal | |
US20070237219A1 (en) | Digital Stream Transcoder | |
Lei et al. | H.263 video transcoding for spatial resolution downscaling | |
US7236529B2 (en) | Methods and systems for video transcoding in DCT domain with low complexity | |
Assuncao et al. | Rate-reduction techniques for MPEG-2 video bit streams | |
US6040875A (en) | Method to compensate for a fade in a digital video input sequence | |
Chen et al. | Motion vector composition algorithm for spatial scalability in compressed video | |
Chang et al. | Error accumulation of repetitive image coding | |
US7542617B1 (en) | Methods and apparatus for minimizing requantization error | |
Youn et al. | Adaptive motion vector refinement for high performance transcoding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: INDUSTRIAL TECHNOLOGY RESEARCH INSTITUTE, TAIWAN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: YOUN, JEONGNAM; SUN, MING-TING; LIN, CHIA-WEN; REEL/FRAME: 009990/0392. Effective date: 19990324. Owner name: WASHINGTON, UNIVERSITY OF, WASHINGTON. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: YOUN, JEONGNAM; SUN, MING-TING; LIN, CHIA-WEN; REEL/FRAME: 009990/0392. Effective date: 19990324 |
 | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
 | FPAY | Fee payment | Year of fee payment: 4 |
 | FPAY | Fee payment | Year of fee payment: 8 |
 | FPAY | Fee payment | Year of fee payment: 12 |