US20090167775A1 - Motion estimation compatible with multiple standards - Google Patents

Motion estimation compatible with multiple standards

Info

Publication number
US20090167775A1
US20090167775A1 (application US11/967,227)
Authority
US
United States
Prior art keywords
macroblock
search
subblocks
motion estimation
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/967,227
Inventor
Ning Lu
Hong Jiang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US11/967,227
Publication of US20090167775A1
Assigned to INTEL CORPORATION (assignors: JIANG, HONG; LU, NING)
Priority to US13/897,555 (published as US9699451B2)
Status: Abandoned

Classifications

    • G06F3/14: Digital output to display device; cooperation and interconnection of the display device with other functional units
    • H04N19/577: Motion compensation with bidirectional frame interpolation, i.e. using B-pictures
    • H04N19/109: Selection of coding mode or of prediction mode among a plurality of temporal predictive coding modes
    • H04N19/112: Selection of coding mode or of prediction mode according to a given display mode, e.g. for interlaced or progressive display mode
    • H04N19/12: Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N19/176: Adaptive coding characterised by the coding unit, the unit being an image region, the region being a block, e.g. a macroblock
    • H04N19/43: Hardware specially adapted for motion estimation or compensation
    • H04N19/436: Implementation details or hardware specially adapted for video compression or decompression, using parallelised computational arrangements
    • H04N19/523: Motion estimation or motion compensation with sub-pixel accuracy
    • H04N19/557: Motion estimation characterised by stopping computation or iteration based on certain criteria, e.g. error magnitude being too large or early exit
    • H04N19/61: Transform coding in combination with predictive coding
    • G09G2320/106: Determination of movement vectors or equivalent parameters within the image
    • G09G2340/02: Handling of images in compressed format, e.g. JPEG, MPEG

Definitions

  • Referring to the motion estimation engine 300 of FIG. 3, the integer search module 320 may search the reference frame for the received macroblock at pixels with integer indices.
  • The integer search module 320 may calculate search distortion based on 4×4 subblock units. For example, if the received macroblock is 16×16 pixels, the integer search module 320 may partition the macroblock into sixteen 4×4 subblocks and calculate a distortion for each 4×4 subblock. The distortion for the 16×16 macroblock may be obtained by adding together the distortions of the sixteen 4×4 subblocks.
  • Using a 4×4 subblock as a unit for distortion calculations provides flexibility for multiple ways of partitioning a macroblock, because many macroblock partitions may be obtained as a combination of one or more 4×4 subblocks. It also provides flexibility for macroblocks with different dimensions (e.g., 8×8, 8×16, and 16×8 macroblocks may all be partitioned into multiple 4×4 subblocks), as sketched below.
  • The integer search module 320 may store distortions for all 4×4 subblocks, as well as for the entire macroblock, in a buffer associated with the motion estimation engine 300.
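As an illustration of this 4×4-based scheme, here is a minimal C sketch (not taken from the patent; plain SAD is assumed as the error measure, and the row-major subblock indexing is an illustrative choice) that computes the sixteen 4×4 subblock distortions of a 16×16 macroblock at one candidate position and derives the macroblock distortion by summation:

    #include <stdlib.h>  /* abs() */

    /* cur/ref point at the top-left pixel of the 16x16 block in the current
     * and reference frames; stride is the row pitch of both luma planes. */
    void subblock_sads(const unsigned char *cur, const unsigned char *ref,
                       int stride, int dist4x4[16], int *dist16x16)
    {
        *dist16x16 = 0;
        for (int b = 0; b < 16; b++) {
            int bx = (b % 4) * 4, by = (b / 4) * 4;   /* subblock origin */
            int sad = 0;
            for (int y = 0; y < 4; y++)
                for (int x = 0; x < 4; x++)
                    sad += abs(cur[(by + y) * stride + bx + x] -
                               ref[(by + y) * stride + bx + x]);
            dist4x4[b] = sad;      /* retained so larger shapes can be derived */
            *dist16x16 += sad;     /* 16x16 distortion = sum of sixteen 4x4s */
        }
    }

Any 8×8, 8×16, or 16×8 distortion can then be formed by adding the appropriate dist4x4 entries without touching the pixels again.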
  • The MB partitioning module 330 may support multiple ways of macroblock partitioning for the frame type, the field type, and the field-frame type digital video signal.
  • A macroblock partitioning may be referred to as a major partition if no sub-block is smaller than 8×8, and as a minor partition if no sub-block is larger than 8×8.
  • Few standards currently support macroblock partitioning that mixes blocks larger than 8×8 (such as 16×8 or 8×16) with blocks smaller than 8×8 (such as 8×4, 4×8, and 4×4). As shown in FIG. 5A, there are 8 possible major frame type partitions (partitions 510, 515, 520, 525, 530, 535, 540, and 545) for a 16×16 macroblock.
  • The total number of possible minor partitions can be very large (e.g., several thousand for a 16×16 macroblock). Embodiments of the subject matter disclosed in the present application may support some or all of the major partitions and a portion of the minor partitions.
  • The AVC standard allows minor partitions, but limits them to partitions in which each 8×8 sub-region is divided uniformly (i.e., into one 8×8, two 8×4s, two 4×8s, or four 4×4s).
  • Partition 560 includes two 16×8 subblocks, with one subblock having all of the even rows and the other having all of the odd rows.
  • Partition 570 includes four 8×8 subblocks, obtained by partitioning the 16×16 block into two 8×16 subblocks 572 and 574 and by further partitioning each of subblocks 572 and 574 into two 8×8 subblocks, with one 8×8 subblock having all of the even rows and the other having all of the odd rows.
  • 41 subblocks (one 16×16, two 16×8s, two 8×16s, four 8×8s, eight 8×4s, eight 4×8s, and sixteen 4×4s) may be supported in a first mode for frame type video signals, or for field type video signals when both fields are coded as separate pictures or macroblocks (e.g., the AVC standard allows sub-block partitioning as fine as the 4×4 block size); and 15 subblocks (i.e., one 16×16, two 16×8s, two 8×16s, four 8×8s, two field 16×8s, and four field 8×8s) may be supported in a second mode for mixed field-frame type video signals (e.g., under the VC-1 and/or MPEG-2 standards, each macroblock may be coded in either frame type or field type individually).
  • Basic subblocks may be 4×4 field sub-blocks; distortions for all of these basic subblocks may be calculated, and distortions for all the other sub-blocks may be derived from the distortions of the basic subblocks, as illustrated in FIG. 7 and by the following pseudo-code:
        Dist8×8[0]   = Dist4×4F[0]  + Dist4×4F[1]  + Dist4×4F[8]  + Dist4×4F[9];
        Dist8×8[1]   = Dist4×4F[4]  + Dist4×4F[5]  + Dist4×4F[12] + Dist4×4F[13];
        Dist8×8[2]   = Dist4×4F[2]  + Dist4×4F[3]  + Dist4×4F[10] + Dist4×4F[11];
        Dist8×8[3]   = Dist4×4F[6]  + Dist4×4F[7]  + Dist4×4F[14] + Dist4×4F[15];
        Dist8×8F[0]  = Dist4×4F[0]  + Dist4×4F[1]  + Dist4×4F[2]  + Dist4×4F[3];
        Dist8×8F[1]  = Dist4×4F[4]  + Dist4×4F[5]  + Dist4×4F[6]  + Dist4×4F[7];
        Dist8×8F[2]  = Dist4×4F[8]  + Dist4×4F[9]  + Dist4×4F[10] + Dist4×4F[11];
        Dist8×8F[3]  = Dist4×4F[12] + Dist4×4F[13] + Dist4×4F[14] + Dist4×4F[15];
        Dist16×8F[0] = Dist8×8F[0]  + Dist8×8F[1];
        Dist16×8F[1] = Dist8×8F[2]  + Dist8×8F[3];
        Dist16×16[0] = Dist16×8[0]  + Dist16×8[1];
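  • Computed this way, pixel-level comparisons are performed only once, for the sixteen basic 4×4 subblocks; every larger subblock distortion is then derived with a few additions, which keeps the cost of supporting many partition shapes low.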
  • The motion estimation engine 300 may be switched among all of the supported partitions according to different standard specifications. Normally, if any field shape is enabled, the motion estimation engine is set to the second mode; otherwise, the motion estimation engine is in the first mode.
  • Distortions for all specified subblocks in either the first or the second mode may be calculated during integer search by the integer search module 320. Comparisons are made, and the best result (the one with the least distortion) is recorded for each sub-block.
  • The MB partitioning module 330 then obtains a distortion for each macroblock partition allowed by the video standard under which the video signal is to be encoded/compressed, based on the best recorded subblock distortions. The MB partitioning module 330 may then compare the distortions of all the available macroblock partitions and select the one with the least overall distortion, as sketched below.
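A minimal sketch of that selection step, assuming the per-partition distortions have already been formed by summing the recorded subblock distortions (the partition names and the bitmask of standard-allowed partitions are illustrative, not the patent's):

    #include <limits.h>

    enum Partition { P16x16, P16x8, P8x16, P8x8, PFIELD16x8, PFIELD8x8,
                     NUM_PARTITIONS };

    /* Return the allowed partition with the least overall distortion.
     * allowed_mask has bit p set when the active standard permits partition p. */
    enum Partition best_partition(const int dist[NUM_PARTITIONS], unsigned allowed_mask)
    {
        enum Partition best = P16x16;
        int best_dist = INT_MAX;
        for (int p = 0; p < NUM_PARTITIONS; p++) {
            if (!(allowed_mask & (1u << p)))
                continue;                      /* standard does not allow this shape */
            if (dist[p] < best_dist) {
                best_dist = dist[p];
                best = (enum Partition)p;
            }
        }
        return best;
    }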
  • The fractional search module 340 may search for a macroblock at fractional pixel locations. Typically, half-pel and quarter-pel pixel values are used for fractional macroblock search; these values are obtained using interpolation formulas. For instance, the AVC standard uses 6-tap interpolation formulas for half-pel positions (with taps (1, -5, 20, 20, -5, 1), normalized by 32), while the VC-1 standard uses 4-tap bicubic filters (with taps (-1, 9, 9, -1), normalized by 16, for half-pel positions).
  • The fractional search module may use different formulas according to different video standards. Alternatively, a unified 4-tap interpolation filter may be used for all of the different video standards, as sketched below.
  • The unified 4-tap formulas are good approximations of the formulas used by the various video standards.
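The patent's unified filter coefficients are not reproduced in this text, so the sketch below uses the VC-1-style bicubic taps (-1, 9, 9, -1)/16 as an assumed stand-in, only to show the shape of a 4-tap half-pel interpolation (the names and rounding are illustrative):

    /* 4-tap horizontal half-pel interpolation sketch. Taps (-1, 9, 9, -1)
     * with a rounding offset of 8 and a shift by 4 (division by 16) are an
     * assumed stand-in, not the patent's unified coefficients. */
    static unsigned char clip255(int v)
    {
        return (unsigned char)(v < 0 ? 0 : (v > 255 ? 255 : v));
    }

    /* row points into a luma row; returns the half-pel sample between
     * row[x] and row[x + 1] (the caller must keep x-1 .. x+2 in bounds). */
    unsigned char halfpel_h(const unsigned char *row, int x)
    {
        int v = -row[x - 1] + 9 * row[x] + 9 * row[x + 1] - row[x + 2];
        return clip255((v + 8) >> 4);
    }

The same filter applied vertically, and then in both directions, gives the remaining half-pel samples.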
  • The bidirectional motion estimation refinement module 350 may refine prediction directions at the sub-block level or the macroblock level. Given a macroblock partition pattern, some standards allow different sub-blocks to make their own decisions on whether the prediction should be unidirectional or bidirectional, and whether it should be forward or backward, while others make such decisions at the macroblock level as a whole. Typically, there are three scenarios:
  • Each sub-block has its own decision on whether the prediction should be unidirectional or bidirectional, and whether the prediction should be forward or backward (e.g., the AVC standard). In this case, directions are chosen for each subblock, and the macroblock partitioning may be updated after direction refinement for each subblock.
  • All of the sub-blocks share the same directions, whether unidirectional or bidirectional and whether forward or backward (e.g., the MPEG-2 standard). In this case, the partitioning is done for each individual direction first, and the better one is chosen afterwards.
  • All of the sub-blocks must be either all unidirectional or all bidirectional; if unidirectional, each sub-block can decide whether it is forward or backward (e.g., the VC-1 standard). In this case, the forward/backward decision for each sub-block is made first, then partitioning for the unidirectional case is performed. The bidirectional option is checked against the best unidirectional result and is chosen only if it is better.
  • The bidirectional motion estimation refinement module 350 may be designed to accommodate all of the above cases; the underlying direction decision is sketched below.
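For all three scenarios, the core decision is a three-way distortion comparison. The following sketch (illustrative names; the distortions are assumed to already include any motion-vector cost) picks the cheaper unidirectional choice and upgrades to bidirectional only when it is strictly better:

    enum Direction { DIR_FORWARD, DIR_BACKWARD, DIR_BIDIRECTIONAL };

    /* dist_fwd/dist_bwd/dist_bi: best distortion for forward, backward, and
     * bidirectional prediction of one subblock (or of the whole macroblock,
     * for standards that decide at the macroblock level). */
    enum Direction refine_direction(int dist_fwd, int dist_bwd, int dist_bi)
    {
        enum Direction uni = (dist_fwd <= dist_bwd) ? DIR_FORWARD : DIR_BACKWARD;
        int uni_dist = (uni == DIR_FORWARD) ? dist_fwd : dist_bwd;
        return (dist_bi < uni_dist) ? DIR_BIDIRECTIONAL : uni;
    }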
  • FIG. 8 is a flowchart of an example process 800 which the bidirectional motion estimation refinement module may use to refine motion estimation.
  • Process 800 starts with the best subblock motion vectors at block 805; i.e., the bidirectional motion estimation refinement module 350 works on the basis of the best macroblock partitioning determined before the bidirectional refinement starts.
  • The bidirectional motion estimation refinement module 350 determines whether the video standard under which the video signal is obtained allows each subblock to have its own decision on whether the prediction should be forward or backward.
  • If the answer is “no,” the best macroblock partition may be found for each direction, and at block 820, between the two macroblock partitions (one for each direction), the one with the smaller distortion may be chosen. If the answer is “yes,” forward and backward motion vectors for all subblocks may be merged at block 825, and at block 830, the best overall macroblock partition may be obtained based on the merged motion vectors.
  • Fractional refinement (fractional search) and bidirectional refinement may then be performed on the chosen partition.
  • The bidirectional motion estimation refinement module 350 may determine whether the video standard allows each subblock to have its own decision on whether the prediction should be unidirectional or bidirectional. If the answer is “no,” the final direction pattern may be chosen at block 850 by comparing the overall results of the partition chosen at block 840 with all subblocks being unidirectional against all subblocks being bidirectional. If the answer is “yes,” the final direction pattern may be chosen by comparing unidirectional and bidirectional results for each subblock of the partition chosen at block 840. At block 860, the process 800 may continue with another macroblock.
  • The intra search module 360 may perform intra-frame search for the received macroblock.
  • Intra-frame search works better when there are scene changes from frame to frame, or when no motion vector set is found with small enough distortion.
  • The intra search module 360 may perform the intra-frame search in parallel with the inter-frame search.
  • The motion estimation engine 300 may support early exits. At any stage of the encoding, the motion estimation engine may terminate the estimation process if some “good enough” pattern of the macroblock is found in the reference frame.
  • A “good enough” pattern in the reference frame is a block whose total adjusted distortion error from the macroblock is below a threshold constant provided by the user as part of a motion estimation command.
  • The motion estimation engine 300 may also support refinement skip. If the best result after integer search is “still bad,” the motion estimation engine 300 may skip the fractional and bidirectional searches and jump directly to intra-frame search to save time and computation.
  • A “still bad” result is a block in the reference frame whose total adjusted distortion error from the macroblock is above a threshold constant provided by the user as part of a motion estimation command. Both threshold checks are sketched below.
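A minimal sketch of the two threshold gates around the integer search, assuming both constants arrive with the motion estimation command (the function and enum names are illustrative):

    enum NextStep { EXIT_EARLY, SKIP_TO_INTRA, CONTINUE_REFINEMENT };

    /* best_distortion: total adjusted distortion of the best integer-search
     * result; the two thresholds are the user-supplied "good enough" and
     * "still bad" constants described above. */
    enum NextStep gate_after_integer_search(int best_distortion,
                                            int good_enough_threshold,
                                            int still_bad_threshold)
    {
        if (best_distortion < good_enough_threshold)
            return EXIT_EARLY;           /* early exit: result is good enough */
        if (best_distortion > still_bad_threshold)
            return SKIP_TO_INTRA;        /* refinement skip: go straight to intra */
        return CONTINUE_REFINEMENT;      /* run fractional/bidirectional refinement */
    }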
  • The motion estimation engine 300 may support a working mode under which the engine outputs all distortion values, in addition to the best results of all major partitions, to allow users to use the motion estimation results for non-encoding applications and for further refinement of encoding solutions.
  • FIG. 4 is a flowchart of one example process 400 for a motion estimation engine to perform motion estimation for multiple video encoding standards, according to an embodiment of the subject matter disclosed in the present application.
  • Process 400 starts at block 405 .
  • The motion estimation engine (e.g., 300 in FIG. 3) may receive a macroblock from the current frame and check whether “P-skip” is enabled for the macroblock. If it is, the motion estimation engine may perform skip checking at block 415. If it is determined at block 420 that the macroblock may indeed be skipped from motion estimation, process 400 may jump directly to block 480.
  • Otherwise, the motion estimation engine may check whether inter-frame search is needed for the macroblock at block 425.
  • A bit in an information bit vector for the macroblock and/or for the current frame may be set to indicate that inter-frame search is needed. If the inter-frame search is needed, the motion estimation engine may perform integer search (described above for the integer search module 320) at block 430. If the inter-frame search is not needed, it may be further determined whether intra-frame search is needed for the macroblock at block 435. If the answer is “yes,” process 400 will proceed directly to block 475.
  • After the integer search, the motion estimation engine will determine whether a search result is good enough, so that the search may be exited before proceeding to block 440.
  • A search result is good enough if the distortion between the search result and the macroblock is below a predetermined threshold.
  • The threshold may be predetermined by a user and provided to the motion estimation engine via a command.
  • Distortions for all of the 4×4 basic blocks of the macroblock may be computed and retained during the inter-frame search. Distortions for other shaped subblocks may be obtained from the distortions of the various 4×4 basic blocks. Based on all of the distortions so derived, an optimal macroblock partition may be determined at block 440.
  • Next, it may be determined whether the overall distortion for the determined optimal macroblock partition is above a “too bad” threshold or below a “too good” threshold.
  • The “too bad” threshold is used to indicate that the integer search result is not acceptable and that an intra-frame search may thus be necessary.
  • The “too good” threshold is used to indicate that the integer search result is good enough and that it is not necessary to further refine the search result. Both thresholds are predetermined by a user and provided to the motion estimation engine via a command. If the result from the integer search is too bad, process 400 may proceed directly to block 475 for intra-frame search. If the result is too good, on the other hand, process 400 may exit early and proceed directly to block 480. If the search result is between the “too bad” threshold and the “too good” threshold, process 400 may proceed to block 450 to determine whether fractional search refinement is needed.
  • At block 450, the bit in the information bit vector that indicates whether fractional search is needed for the macroblock may be checked. If the fractional search bit is set, fractional search (as described above for the fractional search module 340 in FIG. 3) may be performed at block 455. If, after the fractional search, it is determined at block 460 that the overall distortion between the macroblock and its corresponding search result in the reference frame is below an “early exit” threshold, process 400 may proceed directly to block 480 without any further refinement of the search result.
  • The “early exit” threshold may be set the same as the “too good” threshold, or it may be predetermined differently.
  • The “early exit” threshold may also be provided to the motion estimation engine via a command.
  • Otherwise, process 400 may proceed to block 465 for further processing of the search results from block 455.
  • Various embodiments of the disclosed subject matter may be implemented in hardware, firmware, software, or a combination thereof, and may be described by reference to or in conjunction with program code, such as instructions, functions, procedures, data structures, logic, application programs, or design representations or formats for simulation, emulation, and fabrication of a design, which when accessed by a machine results in the machine performing tasks, defining abstract data types or low-level hardware contexts, or producing a result.
  • Program code may represent hardware using a hardware description language or another functional description language which essentially provides a model of how designed hardware is expected to perform.
  • Program code may be assembly or machine language, or data that may be compiled and/or interpreted.
  • Program code may be stored in, for example, volatile and/or non-volatile memory, such as storage devices and/or an associated machine readable or machine accessible medium including solid-state memory, hard-drives, floppy-disks, optical storage, tapes, flash memory, memory sticks, digital video disks, digital versatile discs (DVDs), etc., as well as more exotic mediums such as machine-accessible biological state preserving storage.
  • A machine readable medium may include any mechanism for storing, transmitting, or receiving information in a form readable by a machine, and the medium may include a tangible medium through which electrical, optical, acoustical or other form of propagated signals or carrier wave encoding the program code may pass, such as antennas, optical fibers, communications interfaces, etc.
  • Program code may be transmitted in the form of packets, serial data, parallel data, propagated signals, etc., and may be used in a compressed or encrypted format.
  • Program code may be implemented in programs executing on programmable machines such as mobile or stationary computers, personal digital assistants, set top boxes, cellular telephones and pagers, and other electronic devices, each including a processor, volatile and/or non-volatile memory readable by the processor, at least one input device and/or one or more output devices.
  • Program code may be applied to the data entered using the input device to perform the described embodiments and to generate output information.
  • The output information may be applied to one or more output devices.
  • One of ordinary skill in the art may appreciate that embodiments of the disclosed subject

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Discrete Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A motion estimation engine may be implemented to support multiple video encoding standards. The motion estimation engine may be designed to support two macroblock partitioning modes: one for frame type video signals and the other for mixed frame-field type video signals. Additionally, the motion estimation engine provides the mixing unidirectional option (forward/backward) and the mixing bidirectional option. Furthermore, the motion estimation engine may use a unified 4-tap interpolation filter for fractional macroblock search during motion estimation.

Description

    BACKGROUND
  • 1. Field
  • This disclosure relates generally to signal processing, and more specifically but not exclusively, to digital video encoding technologies.
  • 2. Description
  • Image information (such as digital video information) is often transmitted from one electronic device to another. Such information is typically encoded and/or compressed to reduce the bandwidth required for transmission and/or to decrease the time necessary for transmission. In some configurations, information about differences between a current picture and a previous picture might be transmitted and the device receiving the image information may then, for example, decode and/or decompress the information (e.g., by using the previous picture and the differences to generate the current picture) and provide the image to a viewing device.
  • One of the key elements of many video encoding/compression schemes is motion estimation. A video sequence consists of a series of frames. The motion estimation technique exploits the temporal redundancy between adjacent frames to achieve compression by selecting a frame as a reference and predicting subsequent frames from the reference. The process of motion estimation based video compression is also known as inter-frame coding. Motion estimation is used with an assumption that the objects in the scene have only translational motion. This assumption holds as long as there is no camera pan, zoom, changes in luminance, or rotational motion. However, for scene changes, inter-frame coding does not work well, because the temporal correlation between frames from different scenes is low. In this case, a second compression technique—intra-frame coding—is used.
  • Using the motion estimation technique, the current frame in a sequence of frames is predicted from at least one reference frame. The current frame is divided into N×N pixel macroblocks, typically 16×16 pixels in size. Each macroblock is compared to a region in the reference frame of the same size using some error measure, and the best matching region is selected. The search is conducted over a predetermined search area. A motion vector denoting the displacement of the region in the reference frame with respect to the macroblock in the current frame is determined. When a previous frame is used as a reference, the prediction is referred to as forward prediction. If the reference frame is a future frame, then the prediction is referred to as backward prediction. When both backward and forward predictions are used, the prediction is referred to as bidirectional prediction.
  • To reduce computational overhead of macroblock search, a search window within the reference frame is often identified and the macroblock is compared to various positions within the search window. The most effective yet computationally intensive way of comparing the macroblock to the search window is to compare the pixels of the macroblock to the pixels of the search window at every position that the macroblock may be moved to within the search window. This is referred to as a “full” or “exhaustive” search. For each position of the block tested within the search window, each pixel of the block is compared to a corresponding pixel in the search window. The comparison comprises computing a deviation between the values of compared pixels.
  • Often the mathematical sum of absolute differences (SAD), mean squared error (MSE), mean absolute error (MAE), or mean absolute difference (MAD) functions are utilized to quantitatively compare the pixels. The deviations for each macroblock position are then accumulated, and the position within the search window that yields the smallest deviation is selected as the most likely position of the block in the previous frame. The differences in the current and previous positions of the block are then utilized to derive the motion vector to estimate the movement associated with the block between the reference frame and the current frame. The motion vector may then, for example, be transmitted as image information (e.g., instead of a full image frame) so that a decoder may render, recreate, or build the current frame by simply applying the motion vector information to the reference frame.
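As a concrete illustration of the full search described above, the following C sketch (not from the patent; boundary handling and any motion-vector cost terms are omitted, and the helper names are illustrative) exhaustively evaluates SAD over a search window and returns the best motion vector:

    #include <limits.h>
    #include <stdlib.h>  /* abs() */

    typedef struct { int dx, dy; } MotionVector;

    /* SAD between a 16x16 block in the current frame and a candidate 16x16
     * region in the reference frame; stride is the row pitch of both planes. */
    static int sad16x16(const unsigned char *cur, const unsigned char *ref, int stride)
    {
        int sad = 0;
        for (int y = 0; y < 16; y++)
            for (int x = 0; x < 16; x++)
                sad += abs(cur[y * stride + x] - ref[y * stride + x]);
        return sad;
    }

    /* Exhaustive search of +/-range integer positions around (mbx, mby);
     * the caller must keep the window inside the reference frame. */
    MotionVector full_search(const unsigned char *cur, const unsigned char *ref,
                             int stride, int mbx, int mby, int range)
    {
        MotionVector best = { 0, 0 };
        int best_sad = INT_MAX;
        for (int dy = -range; dy <= range; dy++)
            for (int dx = -range; dx <= range; dx++) {
                int sad = sad16x16(cur + mby * stride + mbx,
                                   ref + (mby + dy) * stride + (mbx + dx), stride);
                if (sad < best_sad) {
                    best_sad = sad;
                    best.dx = dx;
                    best.dy = dy;
                }
            }
        return best;
    }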
  • There are various video encoding standards. The most common ones are the Moving Pictures Expert Group (MPEG) Release Two (MPEG-2) standard published by the International Standards Organization (ISO) and the International Electrotechnical Commission (IEC), the MPEG-4 part 10 standard (also known as the Advanced Video Coding (AVC) standard) published by ISO/IEC, and the Society of Motion Picture and Television Engineers (SMPTE) 421M Video Codec standard (also known as the VC-1 standard). Although different standards share similar algorithmic ideas and require similar motion estimation mechanisms, the actual details are often very distinctive. Motion estimation in general requires intensive computation and is desirably performed by hardware. Since the motion estimation used by each video encoding standard has its own distinctive features, each hardware implementation of motion estimation needs to be standard-specific, resulting in inefficient use of the hardware. Therefore, it is desirable to have a unified motion estimation hardware device which covers the special constraints of various video standards.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The features and advantages of the disclosed subject matter will become apparent from the following detailed description of the subject matter in which:
  • FIG. 1 shows one example computing system which may include a video encoder to compress video information using a motion estimation engine;
  • FIG. 2 shows a block diagram of a video encoder which uses a motion estimation engine to compress video information;
  • FIG. 3 shows a block diagram of a motion estimation engine used by a video encoder to compress video information for multiple video encoding standards;
  • FIG. 4 is a flowchart of one example process for a high performance motion estimation engine to perform motion estimation for multiple video encoding standards;
  • FIGS. 5A and 5B illustrate ways to partition a macroblock during motion estimation;
  • FIG. 6 illustrates an example approach to computing distortions for macroblock search during motion estimation;
  • FIG. 7 illustrates another example approach to computing distortions for macroblock search during motion estimation;
  • FIG. 8 is a flowchart of an example process for refining prediction directions during motion estimation; and
  • FIGS. 9A and 9B illustrate fractional motion estimation interpolation schemes.
  • DETAILED DESCRIPTION
  • According to embodiments of the subject matter disclosed in this application, a motion estimation engine may be implemented to support multiple video encoding standards. The motion estimation engine may be designed to support two macroblock partitioning modes: one for frame type video compression and the other for mixed frame-field type video compression. Additionally, the motion estimation engine provides the mixing unidirectional option (forward/backward) and the mixing bidirectional option. Furthermore, the motion estimation engine may use a unified 4-tap interpolation filter for fractional macroblock search during motion estimation.
  • Reference in the specification to “one embodiment” or “an embodiment” of the disclosed subject matter means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosed subject matter. Thus, the appearances of the phrase “in one embodiment” appearing in various places throughout the specification are not necessarily all referring to the same embodiment.
  • FIG. 1 shows one example computing system 100 which may include a video encoder to compress video information using a motion estimation engine. Computing system 100 may comprise one or more processors 110 coupled to a system interconnect 115. Processor 110 may have multiple or many processing cores (for the convenience of description, terms “core” and “processor” may be used interchangeably hereinafter and also the term “multiple cores/processors” will be used to include both multiple processing cores and many processing cores hereinafter). The computing system 100 may also include a chipset 120 coupled to the system interconnect 115. Chipset 120 may include one or more integrated circuit packages or chips.
  • Additionally, chipset 120 may comprise a Memory Control Hub (MCH) 130 that is coupled to a main memory 150 through a memory bus 155. The main memory 150 may store data and sequences of instructions that are executed by multiple cores of the processor 110 or any other device included in the system. The MCH 130 includes a memory controller 135 to access the main memory 150 in response to memory transactions associated with multiple cores of the processor 110, and other devices in the computing system 100. In one embodiment, memory controller 135 may be located in processor 110 or some other circuitries. The main memory 150 may comprise various memory devices that provide addressable storage locations which the memory controller 135 may read data from and/or write data to. The main memory 150 may comprise one or more different types of memory devices such as Dynamic Random Access Memory (DRAM) devices, Synchronous DRAM (SDRAM) devices, Double Data Rate (DDR) SDRAM devices, or other memory devices.
  • Chipset 120 may comprise an Input/Output Control Hub (ICH) 140 to support data transfers to and/or from other components of the computing system 100 such as, for example, keyboards, mice, network interfaces, etc. The ICH 140 may be coupled with other components through a bus such as bus 165. The bus may be a Peripheral Component Interconnect (PCI) bus, accelerated graphics port (AGP), universal serial bus (USB), low pin count (LPC) interconnect, or any other kind of I/O interconnect. The ICH 140 (or the MCH 130) may be coupled to a graphics device 160, which generates and outputs images to a display. The graphics device may also provide an interface between the ICH 140 (or MCH 130) and other graphics/video devices. The graphics device includes a graphics processing unit (GPU) 170, which is a dedicated graphics rendering device that efficiently manipulates and displays computer graphics. The GPU 170 may implement a number of graphics primitive operations to process and render graphics.
  • Additionally, the graphics device 160 may include a video codec 180 to enable video encoding/compression and/or decoding/decompression for digital video. The video codec 180 may further include a motion estimation engine (not shown in the figure) to perform motion estimation for video compression/encoding under multiple video encoding standards such as MPEG-2, VC-1, and AVC standards. The motion estimation engine may be a part of the video codec 180, or may be a part of the GPU 170, or may be a separate engine in the graphics device 160, or may be in a different device. In one embodiment, graphics device 160 may be located in MCH 130. In another embodiment, graphics device 160 may be located in processor 110 or some other circuitries.
  • FIG. 2 shows a block diagram of a video encoder which uses a motion estimation engine to compress digital video under multiple video encoding standards, according to an embodiment of the subject matter disclosed in the present application. An image data frame source 210 may be coupled to provide image data frames to an encoder 220. The encoder 220 may, according to some configurations, apply an encoding and/or compression algorithm in accordance with a video encoding standard such as the MPEG-2, VC-1, or AVC standards. The encoder 220 may include a motion estimation engine 225 to obtain motion vectors so that during the decoding/decompression process an image frame may be reconstructed based on a motion vector and a reference frame used to obtain the motion vector.
  • The motion estimation engine 225 may be designed in such a way that it may perform motion estimation for multiple video encoding standards. Three picture types are commonly allowed in most video standards for video encoding/compression: the frame type, the field type, and the field-frame mixed type. The frame type is used for coding progressive contents, and all macroblocks and reference pictures are treated as continuous pieces of a video picture. The field type is used mostly for coding interlaced contents, and an encoder basically treats a picture as two disjoint field pictures. The macroblocks and reference pictures used in motion estimation have the same form under either the field type or the frame type. Thus, assuming each picture in a digital video is stored in a memory buffer in frame format, special handling for reading the picture (e.g., boundary padding) is needed only for the field type; once a picture is loaded into the motion estimation engine, the engine works in the same way for either type. The field-frame type is used when interlaced contents are coded in frames. Depending on the video standard, there may be two subtypes of the field-frame type. One is a frame type macroblock that allows field type partitioning, as under the MPEG-2 and VC-1 standards. The other is a 16×32 tall macroblock pair that can be divided into either two frame macroblocks or two field macroblocks, as under the AVC standard.
  • Depending on the picture type of a digital video to be encoded/compressed under different video standards, the motion estimation engine 225 may apply a corresponding distortion calculation mode. For example, for frame type digital video signals (e.g., video signals using the AVC standard), distortion measures for macroblock search may be calculated for a total of 41 subblocks on the basis of 4×4 subblock units; and for field type and field-frame type digital video signals (e.g., video signals using the VC-1 or MPEG-2 standard), distortion measures for macroblock search may be calculated for a total of 15 subblocks. Additionally, the motion estimation engine 225 may provide the mixing unidirectional option (forward/backward) and the mixing bidirectional option for video signals obtained using different video standards. Furthermore, the motion estimation engine 225 may use a unified 4-tap interpolation filter for fractional macroblock search during motion estimation.
  • The motion estimation engine 225 may generate motion vectors for corresponding reference frames. The encoder 220 may use the motion vectors and their corresponding reference frames to further compress video signals. The encoded/compressed video signals from the encoder 220 may be stored in a storage device and/or may be transmitted through a channel 230 to a receiving device 250. The channel 230 may be a wired or wireless channel. The receiving device 250 may decode/decompress the received encoded/compressed video signals and reconstruct original video signals based on reference frames and motion vectors.
  • FIG. 3 shows a block diagram of a motion estimation engine 300 used by a video encoder to encode/compress video signals under multiple video encoding standards, according to an embodiment of the subject matter disclosed in the present application. Motion estimation engine 300 comprises a skip checking module 310, an integer search module 320, a macroblock (MB) partitioning module 330, a fractional search module 340, a bidirectional motion estimation refinement module 350, and an intra search module 360. The motion estimation engine 300 may work with a controller 390 which controls/coordinates operations of different modules in the motion estimation engine 300. In one embodiment, there might not be a separate controller, but a GPU (e.g., GPU 170 in FIG. 1) in a graphics device may be used to control operations of each module in the motion estimation engine 300 and/or coordinate operations among the modules. In another embodiment, controller 390 may be integrated with and become a part of the motion estimation engine 300. In one embodiment, the motion estimation engine 300 may have a buffer (not shown in the figure) associated with it to temporarily store macroblocks read from a storage device that stores frame(s) of digital video to be encoded/compressed. Controller 390 may feed the motion estimation engine one macroblock at a time.
  • For most video coding standards, every macroblock has a default predicted code: a motion vector derived from the codes of already-coded neighboring macroblocks (e.g., the left, top, top-left, and top-right neighbors, and the co-located macroblock in previously coded frames). The principle is the same across standards, but the detailed derivation varies from standard to standard. As long as the derivation is agreed upon by both encoder and decoder, no actual code need be carried in the data stream except a flag bit to indicate the choice. Macroblocks coded this way are said to be of the skip type. Since the skip type is the most efficient way to code a macroblock, the skip checking module 310 often checks the distortion of the skip type first. If the distortion is low, the skip code may be selected without going through the rest of the motion estimation to try other code types.
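  • As a rough illustration only, the skip check can be reduced to evaluating the distortion at the standard-defined predicted motion vector and comparing it against a caller-supplied threshold. The C sketch below assumes a sum-of-absolute-differences (SAD) distortion; the type and function names are hypothetical, not the engine's actual interface.

    #include <stdlib.h>

    typedef struct { int x, y; } MotionVector;

    /* SAD between a 16x16 macroblock and the reference block offset by mv. */
    static unsigned sad16x16(const unsigned char *mb, int mb_stride,
                             const unsigned char *ref, int ref_stride,
                             MotionVector mv)
    {
        unsigned sad = 0;
        for (int y = 0; y < 16; y++)
            for (int x = 0; x < 16; x++)
                sad += abs(mb[y * mb_stride + x] -
                           ref[(y + mv.y) * ref_stride + (x + mv.x)]);
        return sad;
    }

    /* Returns nonzero if the predicted MV is good enough to code as skip. */
    static int check_skip(const unsigned char *mb, int mb_stride,
                          const unsigned char *ref, int ref_stride,
                          MotionVector predicted_mv, unsigned skip_threshold)
    {
        return sad16x16(mb, mb_stride, ref, ref_stride, predicted_mv)
               < skip_threshold;
    }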
  • If the received macroblock needs to go through the process of motion estimation, it may be passed to the integer search module 320. The integer search module 320 may search the reference frame for the received macroblock at pixels with integer indices. The integer search module 320 may calculate search distortion based on 4×4 subblock units. For example, if the received macroblock is 16×16 pixels, the integer search module 320 may partition the macroblock into sixteen 4×4 subblocks and calculate a distortion for each 4×4 subblock. The distortion for the 16×16 macroblock may be obtained by adding together the distortions of the sixteen 4×4 subblocks. Using a 4×4 subblock as a unit for distortion calculations provides flexibility for multiple ways of partitioning a macroblock, because many macroblock partitions may be obtained by a combination of one or more 4×4 subblocks. It also provides flexibility for blocks with different dimensions (e.g., 8×8, 8×16, and 16×8 blocks may all be partitioned into multiple 4×4 subblocks). The integer search module 320 may store distortions for all 4×4 subblocks as well as for the entire macroblock in a buffer associated with the motion estimation engine 300.
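  • By way of illustration, the 4×4-unit distortion calculation may be sketched as below, again assuming a SAD distortion measure; the raster-order indexing and function names are illustrative assumptions (the pseudo-code later in this description uses the b4[] ordering instead).

    /* SAD of one 4x4 subblock. */
    static unsigned sad4x4(const unsigned char *a, int a_stride,
                           const unsigned char *b, int b_stride)
    {
        unsigned sad = 0;
        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++) {
                int d = a[y * a_stride + x] - b[y * b_stride + x];
                sad += d < 0 ? -d : d;
            }
        return sad;
    }

    /* Fills dist4x4[16] with the sixteen unit distortions (raster order)
     * and returns their sum, the distortion of the whole 16x16 macroblock. */
    static unsigned mb16x16_distortion(const unsigned char *mb, int mb_stride,
                                       const unsigned char *ref, int ref_stride,
                                       unsigned dist4x4[16])
    {
        unsigned total = 0;
        for (int by = 0; by < 4; by++)
            for (int bx = 0; bx < 4; bx++) {
                unsigned d = sad4x4(mb + 4 * by * mb_stride + 4 * bx, mb_stride,
                                    ref + 4 * by * ref_stride + 4 * bx, ref_stride);
                dist4x4[by * 4 + bx] = d;
                total += d;
            }
        return total;
    }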
  • The MB partitioning module 330 may support multiple ways of macroblock partitioning for frame type, field type, or field-frame type digital video signals. A macroblock partitioning may be referred to as a major partition if no subblock is smaller than 8×8, and may be referred to as a minor partition if no subblock is larger than 8×8. In practice, few standards currently support macroblock partitioning that mixes blocks larger than 8×8 (such as 16×8 or 8×16) with blocks smaller than 8×8 (such as 8×4, 4×8, and 4×4). As shown in FIG. 5A, there are 8 possible major macroblock frame type partitions (partitions 510, 515, 520, 525, 530, 535, 540, and 545) for a 16×16 macroblock. The total number of possible minor partitions can be very large (e.g., several thousand for a 16×16 macroblock). Embodiments of the subject matter disclosed in the present application may support some or all of such major partitions and a portion of the minor partitions. For example, the AVC standard allows minor partitions, but only ones in which each 8×8 sub-region is divided into the same shapes (i.e., one 8×8, two 8×4s, two 4×8s, or four 4×4s). In one embodiment, all of the minor partitions allowed by the AVC standard and/or some other minor partitions may be supported. Additionally, two field partitions (i.e., partition 560 and partition 570 as shown in FIG. 5B) may be supported. Partition 560 includes two 16×8 subblocks, with one subblock having all of the even rows and the other having all of the odd rows. Partition 570 includes four 8×8 subblocks, obtained by partitioning the 16×16 block into two 8×16 subblocks 572 and 574 and by further partitioning each of subblocks 572 and 574 into two 8×8 subblocks, with one 8×8 subblock having all of the even rows and the other having all of the odd rows.
  • In one embodiment, 41 subblocks (i.e., 1 16×16, 2 16×8's, 2 8×16's, 4 8×8's, 8 8×4's, 8 4×8's, and 16 4×4's) may be supported in a first mode for frame type video signals, or for field type video signals when both fields are coded as separate pictures or macroblocks (e.g., the AVC standard allows subblock partitioning as fine as the 4×4 block size); and 15 subblocks (i.e., 1 16×16, 2 16×8's, 2 8×16's, 4 8×8's, 2 field 16×8's, and 4 field 8×8's) may be supported in a second mode for mixed field-frame type video signals (e.g., under the VC-1 and/or MPEG-2 standard, each macroblock is individually allowed to be coded in either frame type or field type). In the first mode, distortions are calculated only for the sixteen basic 4×4 subblocks; distortions for all the other subblocks are obtained by summing distortions of the basic 4×4 subblocks, as illustrated in FIG. 6 and by the following pseudo-code:
      • b4[16]={{0,0},{4,0},{0,4},{4,4}, {8,0},{12,0},{8,4},{12,4}, {0,8},{4,8},{0,12},{4,12}, {8,8},{12,8},{8,12},{12,12}};
      • Dist4×4[i]=distortion for the block starting at p=&Mb[b4[i].y][b4[i].x], over pixels p[0][0],p[0][1],p[0][2],p[0][3],p[1][0],…,p[3][3]; for i=0,1,…,15.
  • Dist8×4[j]=Dist4×4[j*2]+Dist4×4[j*2+1]; for j =0,1,2,3,4,5,6,7.
  • Dist4×8[j]=Dist4×4[j*2]+Dist4×4[j*2+2]; for j=0,2,4,6.
  • Dist4×8[j]=Dist4×4[j*2−1]+Dist4×4[j*2+1]; for j=1,3,5,7.
  • Dist8×8[k]=Dist8×4[k*2]+Dist8×4[k*2+1]; for k=0,1,2,3.
  • Dist8×16[0]=Dist8×8[0]+Dist8×8[2]; Dist8×16[1]=Dist8×8[1]+Dist8×8[3];
  • Dist16×8[0]=Dist8×8[0]+Dist8×8[1]; Dist16×8[1]=Dist8×8[2]+Dist8×8[3];
  • Dist16×16[0]=Dist16×8[0]+Dist16×8[1].
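  • Rendered as compilable C, the first-mode aggregation might look like the following sketch; it follows the pseudo-code's b4[] ordering and index arithmetic exactly, and the 16+8+8+4+2+2+1=41 derived values correspond to the 41 supported subblocks.

    /* Derive all larger-subblock distortions from the sixteen basic 4x4
     * distortions; dist4x4[] must be indexed in the b4[] order above. */
    static void aggregate_frame_distortions(const unsigned dist4x4[16],
                                            unsigned dist8x4[8], unsigned dist4x8[8],
                                            unsigned dist8x8[4], unsigned dist8x16[2],
                                            unsigned dist16x8[2], unsigned *dist16x16)
    {
        for (int j = 0; j < 8; j++)        /* 8x4: two side-by-side 4x4s */
            dist8x4[j] = dist4x4[j * 2] + dist4x4[j * 2 + 1];
        for (int j = 0; j < 8; j += 2)     /* 4x8, even j: two stacked 4x4s */
            dist4x8[j] = dist4x4[j * 2] + dist4x4[j * 2 + 2];
        for (int j = 1; j < 8; j += 2)     /* 4x8, odd j */
            dist4x8[j] = dist4x4[j * 2 - 1] + dist4x4[j * 2 + 1];
        for (int k = 0; k < 4; k++)        /* 8x8: two stacked 8x4s */
            dist8x8[k] = dist8x4[k * 2] + dist8x4[k * 2 + 1];
        dist8x16[0] = dist8x8[0] + dist8x8[2];   /* left half */
        dist8x16[1] = dist8x8[1] + dist8x8[3];   /* right half */
        dist16x8[0] = dist8x8[0] + dist8x8[1];   /* top half */
        dist16x8[1] = dist8x8[2] + dist8x8[3];   /* bottom half */
        *dist16x16  = dist16x8[0] + dist16x8[1]; /* whole macroblock */
    }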
  • In the second mode, a similar strategy may be used to obtain distortions for all of the 15 subblocks. In this mode, the basic subblocks are 4×4 field subblocks; distortions for all of these basic subblocks may be calculated, and distortions for all the other subblocks may be derived from them, as illustrated in FIG. 7 and by the following pseudo-code:
      • b4[16]={{0,0},{4,0},{0,8},{4,8}, {8,0},{12,0},{8,8},{12,8}, {0,1},{4,1},{0,9},{4,9}, {8,1},{12,1},{8,9},{12,9}};
  • Dist4×4F[i]=distortion for the field block starting at p=&Mb[b4[i].y][b4[i].x], over pixels p[0][0],p[0][1],p[0][2],p[0][3],p[2][0],…,p[6][3]; for i=0,1,…,15.
  • Dist8×8[0]=Dist4×4F[0]+Dist4×4F[1]+Dist4×4F[8]+Dist4×4F[9];
  • Dist8×8[1]=Dist4×4F[4]+Dist4×4F[5]+Dist4×4F[12]+Dist4×4F[13];
  • Dist8×8[2]=Dist4×4F[2]+Dist4×4F[3]+Dist4×4F[10]+Dist4×4F[11];
  • Dist8×8[3]=Dist4×4F[6]+Dist4×4F[7]+Dist4×4F[14]+Dist4×4F[15];
  • Dist8×8F[0]=Dist4×4F[0]+Dist4×4F[1]+Dist4×4F[2]+Dist4×4F[3];
  • Dist8×8F[1]=Dist4×4F[4]+Dist4×4F[5]+Dist4×4F[6]+Dist4×4F[7];
  • Dist8×8F[2]=Dist4×4F[8]+Dist4×4F[9]+Dist4×4F[10]+Dist4×4F[11];
  • Dist8×8F[3]=Dist4×4F[12]+Dist4×4F[13]+Dist4×4F[14]+Dist4×4F[15];
  • Dist8×16[0]=Dist8×8[0]+Dist8×8[2]; Dist8×16[1]=Dist8×8[1]+Dist8×8[3];
  • Dist16×8[0]=Dist8×8[0]+Dist8×8[1]; Dist16×8[1]=Dist8×8[2]+Dist8×8[3];
  • Dist16×8F[0]=Dist8×8F[0]+Dist8×8F[1];
  • Dist16×8F[1]=Dist8×8F[2]+Dist8×8F[3];
  • Dist16×16[0]=Dist16×8[0]+Dist16×8[1].
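  • A field 4×4 basic subblock differs from a frame one only in that its four rows are taken from alternating picture lines. A minimal sketch of that distortion, again assuming SAD, is:

    /* SAD of a field 4x4 basic subblock: four rows with a row step of 2,
     * i.e. picture rows 0, 2, 4, 6 relative to the block start. */
    static unsigned sad4x4_field(const unsigned char *a, int a_stride,
                                 const unsigned char *b, int b_stride)
    {
        unsigned sad = 0;
        for (int y = 0; y < 4; y++)
            for (int x = 0; x < 4; x++) {
                int d = a[2 * y * a_stride + x] - b[2 * y * b_stride + x];
                sad += d < 0 ? -d : d;
            }
        return sad;
    }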
  • With 41 subblocks supported in the first mode, many macroblock partitions may be supported. Similarly, with 15 subblocks supported in the second mode, a number of macroblock partitions may be supported. The motion estimation engine 300 may be switched among all of the supported partitioning modes according to different standard specifications. Normally, if any field shape is enabled, the motion estimation engine is set to the second mode; otherwise, it is in the first mode.
  • Distortions for all specified subblocks in either the first or the second mode may be calculated during integer search by the integer search module 320. Comparisons are made, and the best result (the one with the least distortion) is recorded for each subblock.
  • The MB partitioning module 330 further obtains a distortion for each macroblock partition allowed by the video standard under which the video signal to be encoded/compressed is obtained, based on the best-recorded subblock distortions described above. The MB partitioning module 330 may then compare the distortions of all the available macroblock partitions and select the one with the least overall distortion as the macroblock partition.
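  • Conceptually, this selection reduces to summing the best-recorded distortions of each partition's constituent subblocks and keeping the minimum, as in the sketch below; the partition table layout is a hypothetical stand-in for the engine's actual representation.

    #include <limits.h>

    #define MAX_SUBBLOCKS_PER_PARTITION 16

    /* A partition is modeled as a list of indices into a flat array of
     * best-recorded subblock distortions. */
    typedef struct {
        int count;
        int subblock[MAX_SUBBLOCKS_PER_PARTITION];
    } Partition;

    /* Returns the index of the partition with the least overall distortion. */
    static int select_best_partition(const Partition *partitions, int n_partitions,
                                     const unsigned *best_dist)
    {
        int best = -1;
        unsigned best_total = UINT_MAX;
        for (int p = 0; p < n_partitions; p++) {
            unsigned total = 0;
            for (int s = 0; s < partitions[p].count; s++)
                total += best_dist[partitions[p].subblock[s]];
            if (total < best_total) {
                best_total = total;
                best = p;
            }
        }
        return best;
    }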
  • The fractional search module 340 may search a macroblock at fractional pixel locations. Typically, half-pel and quarter-pel pixel values are used for fractional macroblock search; these values are obtained using interpolation formulas. For instance, the AVC standard uses the 6-tap filters below:

  • For ½-pel: (1,−5,20,20,−5,1)/32;   (1)

  • For ¼-pel: (1,−5,52,20,−5,1)/64.   (2)
  • The VC-1 standard uses 4-tap filters as defined below:

  • For ½-pel: (−1,9,9,−1)/16;   (3)

  • For ¼-pel: (−4,53,18,−3)/64.   (4)
  • Bilinear interpolation is also accepted (as used by the MPEG-2 standard):

  • For ½-pel: (0,1,1,0)/2;   (5)

  • For ¼-pel: (0,3,1,0)/4.   (6)
  • In one embodiment, the fractional search module may use different formulas according to different video standards. In another embodiment, a unified 4-tap interpolation filter may be used for all of the different video standards, as shown below:

  • For ½-pel: (−1,5,5,−1)/8;   (7)

  • For ¼-pel: (−1,13,5,−1)/16.   (8)
  • As shown in FIG. 9A (for half-pel) and FIG. 9B (for quarter-pel), the unified 4-tap interpolation formulas shown in Equations (7) and (8) are good approximations of the formulas used by various video standards.
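  • As an illustration, applying the unified filter in one dimension may be sketched as below, where p0..p3 are four consecutive integer-position samples and the interpolated sample lies between p1 and p2; the rounding and clamping policy shown is an assumption, not mandated by the description.

    /* Unified 4-tap fractional interpolation per Equations (7) and (8). */
    static unsigned char interp_half(int p0, int p1, int p2, int p3)
    {
        int v = (-p0 + 5 * p1 + 5 * p2 - p3 + 4) / 8;    /* (-1,5,5,-1)/8 */
        return (unsigned char)(v < 0 ? 0 : v > 255 ? 255 : v);
    }

    static unsigned char interp_quarter(int p0, int p1, int p2, int p3)
    {
        int v = (-p0 + 13 * p1 + 5 * p2 - p3 + 8) / 16;  /* (-1,13,5,-1)/16 */
        return (unsigned char)(v < 0 ? 0 : v > 255 ? 255 : v);
    }

  • Note that the tap weights of each filter sum to its divisor (8 and 16, respectively), so flat image regions are reproduced exactly.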
  • The bidirectional motion estimation refinement module 350 may refine prediction directions at the subblock level or the macroblock level. Given a macroblock partition pattern, some standards allow different subblocks to make their own decisions on whether the prediction should be unidirectional or bidirectional, and whether the prediction should be forward or backward; other standards make such decisions at the macroblock level as a whole. Typically, there are three scenarios:
  • 1). Each sub-block has its own decision on whether the prediction should be unidirectional or bidirectional, and whether the prediction should be forward or backward (e.g., the AVC standard). In this case, directions are chosen for each subblock and the macroblock partitioning may be updated after direction refinement for each subblock.
  • 2). All of the sub-blocks share the same directions, whether unidirectional or bidirectional and whether forward or backward (e.g., the MPEG-2 standard). In this case, the partitioning is done for each individual direction first, and the better one is chosen later.
  • 3). All of the subblocks must be either all unidirectional or all bidirectional; if unidirectional, each subblock can decide whether it is forward or backward (e.g., the VC-1 standard). In this case, the forward/backward decision for each subblock is made first, then partitioning for the unidirectional case is performed. Bidirectional prediction is then checked against the best unidirectional result and is chosen only if it is better.
  • In one embodiment, the bidirectional motion estimation refinement module 350 may be designed to accommodate all of the above cases. FIG. 8 is a flowchart of an example process 800 which the bidirectional motion estimation refinement module may use to refine motion estimation. Process 800 starts with the best subblock motion vectors at block 805, i.e., the bidirectional motion estimation refinement module 350 works on the basis of the best macroblock partitioning determined before the bidirectional refinement starts. At block 810, the bidirectional motion estimation refinement module 350 determines whether the video standard under which the video signal is obtained allows each subblock to have its own decision on whether the prediction should be forward or backward. If the answer is "no," at block 815 the best macroblock partition may be found for each direction; and at block 820, between the two macroblock partitions (one for each direction), the one with smaller distortion may be chosen. If the answer is "yes," forward and backward motion vectors for all subblocks may be merged at block 825; and at block 830, the best overall macroblock partition may be obtained based on the merged motion vectors. At block 835, fractional refinement (fractional search) may be performed on the chosen partition. At block 840, bidirectional refinement may be performed on the chosen partition. At block 845, the bidirectional motion estimation refinement module 350 may determine whether the video standard allows each subblock to have its own decision on whether the prediction should be unidirectional or bidirectional. If the answer is "no," at block 850 the final direction pattern may be chosen by comparing the overall results of the partition chosen at block 840 with all subblocks unidirectional against the results with all subblocks bidirectional. If the answer is "yes," the final direction pattern may be chosen by comparing unidirectional and bidirectional results for each subblock of the partition chosen at block 840. At block 860, the process 800 may continue with another macroblock.
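  • The flow of FIG. 8 may be condensed into the control-flow sketch below; the two flags correspond to the standard-dependent questions at blocks 810 and 845, and the printed step names are placeholders for the operations described above.

    #include <stdio.h>

    static void bidir_refine_flow(int per_subblock_fwd_bwd, int per_subblock_bidir)
    {
        if (!per_subblock_fwd_bwd) {
            puts("815: find best partition for each direction");
            puts("820: keep the direction with smaller distortion");
        } else {
            puts("825: merge forward/backward subblock motion vectors");
            puts("830: find best overall partition from merged vectors");
        }
        puts("835: fractional refinement of the chosen partition");
        puts("840: bidirectional refinement of the chosen partition");
        if (!per_subblock_bidir)
            puts("850: compare all-unidirectional vs. all-bidirectional");
        else
            puts("compare unidirectional vs. bidirectional per subblock");
    }

    int main(void)
    {
        bidir_refine_flow(1, 1);  /* e.g., an AVC-like standard: both per subblock */
        return 0;
    }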
  • The intra search module 360 may perform intra-frame search for the received macroblock. The intra-frame search works better when there are scene changes from frame to frame or when no set of motion vectors with small enough distortion can be found. In one embodiment, the intra search module 360 may perform intra-frame search in parallel with the inter-frame search.
  • Additionally, the motion estimation engine 300 may support early exits. At any stage of the encoding, the motion estimation engine may terminate the estimation process if some "good enough" pattern for the macroblock is found in the reference frame. A "good enough" pattern in the reference frame is a block whose total adjusted distortion error from the macroblock is below a threshold constant provided by the user as part of a motion estimation command.
  • Moreover, the motion estimation engine 300 may support refinement skip. If the best result after integer search is "still bad," the motion estimation engine 300 may skip the fractional and bidirectional searches and jump directly to intra-frame search to save time and computation. A "still bad" result is a block in the reference frame whose total adjusted distortion error from the macroblock is above a threshold constant provided by the user as part of a motion estimation command.
  • Furthermore, the motion estimation engine 300 may support a working mode under which the engine outputs all distortion values in addition to the best results of all major partitions, to allow a user to use the motion estimation results for non-encoding applications and for further refinement of encoding solutions.
  • FIG. 4 is a flowchart of one example process 400 for a motion estimation engine to perform motion estimation for multiple video encoding standards, according to an embodiment of the subject matter disclosed in the present application. Process 400 starts at block 405. At block 410, the motion estimation engine (e.g., 300 in FIG. 3) may receive a macroblock from the current frame and check whether the "P-skip" is enabled for the macroblock. If it is, the motion estimation engine may perform skip checking at block 415. If it is determined at block 420 that the macroblock may indeed be skipped from motion estimation, process 400 may jump directly to block 480. If the "P-skip" is found at block 410 not to be enabled, or if it is determined at block 420 that the macroblock may not be skipped from motion estimation, the motion estimation engine may check whether inter-frame search is needed for the macroblock at block 425. A bit in an information bit vector for the macroblock and/or for the current frame may be set to indicate that inter-frame search is needed. If the inter-frame search is needed, the motion estimation engine may perform integer search (described above for integer search module 320) at block 430. If the inter-frame search is not needed, it may be further determined at block 435 whether intra-frame search is needed for the macroblock. If the answer is "yes," process 400 proceeds directly to block 475.
  • During the integer search at block 430, the motion estimation engine determines whether a search result is good enough for the search to be exited and the process to proceed to block 440. A search result is good enough if the distortion between the search result and the macroblock is below a predetermined threshold. The threshold may be predetermined by a user and provided to the motion estimation engine via a command. As described above for FIG. 3, distortions for all of the 4×4 basic blocks of the macroblock may be computed and retained during the inter-frame search. Distortions for other shaped subblocks may be obtained from the distortions of the various 4×4 basic blocks. Based on all of the distortions so derived, an optimal macroblock partition may be determined at block 440.
  • At block 445, it may be determined whether the overall distortion for the determined optimal macroblock partition is above a "too bad" threshold or below a "too good" threshold. The "too bad" threshold is used to indicate that the integer search result is not acceptable and an intra search may thus be necessary. The "too good" threshold is used to indicate that the integer search result is good enough and no further refinement of the search result is necessary. Both thresholds are predetermined by a user and provided to the motion estimation engine via a command. If the result from the integer search is too bad, process 400 may proceed directly to block 475 for intra-frame search. If the result is too good, on the other hand, process 400 may exit early and proceed directly to block 480. If the search result is between the "too bad" threshold and the "too good" threshold, process 400 may proceed to block 450 to determine whether fractional search refinement is needed.
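  • These threshold decisions may be sketched as a simple three-way branch; the enum and function names below are illustrative only.

    typedef enum { NEXT_FRACTIONAL, NEXT_INTRA, NEXT_OUTPUT } NextStep;

    /* Decide what follows the integer search, given the user-supplied
     * "too bad" and "too good" thresholds described above. */
    static NextStep after_integer_search(unsigned best_distortion,
                                         unsigned too_bad_threshold,
                                         unsigned too_good_threshold)
    {
        if (best_distortion > too_bad_threshold)
            return NEXT_INTRA;       /* not acceptable: go straight to intra */
        if (best_distortion < too_good_threshold)
            return NEXT_OUTPUT;      /* good enough: exit early */
        return NEXT_FRACTIONAL;      /* otherwise consider fractional search */
    }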
  • At block 450, the bit in an information bit vector that indicates whether the fractional search is needed for the macroblock may be checked. If the fractional search bit is set, fractional search (as described above for the fractional search module 340 in FIG. 3) may be performed at block 455. If, after the fractional search, it is determined at block 460 that the overall distortion between the macroblock and its corresponding search results in the reference frame is below an "early exit" threshold, process 400 may proceed directly to block 480 without any further refinement of the search result. The "early exit" threshold may be set the same as the "too good" threshold, or it may be predetermined differently. The "early exit" threshold may be provided to the motion estimation engine via a command.
  • If it is determined at block 460 that an early exit is not warranted, process 400 may proceed to block 465 for further processing of the search results from block 455. At block 465, it may be determined whether a bidirectional refinement is needed by checking the bidirectional bit in the information bit vector. If the bidirectional bit is set, bidirectional refinement (as described above for the bidirectional motion estimation refinement module 350 in FIG. 3) may be performed at block 470; otherwise, process 400 may proceed to block 480. Intra-frame search may be performed at block 475. Results from the different blocks may be received by block 480, which finalizes the results and outputs them in a format desired by the next stage of the video encoding process.
  • Although an example embodiment of the disclosed subject matter is described with reference to block and flow diagrams in FIGS. 1-9, persons of ordinary skill in the art will readily appreciate that many other methods of implementing the disclosed subject matter may alternatively be used. For example, the order of execution of the blocks in flow diagrams may be changed, and/or some of the blocks in block/flow diagrams described may be changed, eliminated, or combined.
  • In the preceding description, various aspects of the disclosed subject matter have been described. For purposes of explanation, specific numbers, systems and configurations were set forth in order to provide a thorough understanding of the subject matter. However, it is apparent to one skilled in the art having the benefit of this disclosure that the subject matter may be practiced without the specific details. In other instances, well-known features, components, or modules were omitted, simplified, combined, or split in order not to obscure the disclosed subject matter.
  • Various embodiments of the disclosed subject matter may be implemented in hardware, firmware, software, or a combination thereof, and may be described by reference to or in conjunction with program code, such as instructions, functions, procedures, data structures, logic, application programs, or design representations or formats for simulation, emulation, and fabrication of a design, which when accessed by a machine results in the machine performing tasks, defining abstract data types or low-level hardware contexts, or producing a result.
  • For simulations, program code may represent hardware using a hardware description language or another functional description language which essentially provides a model of how designed hardware is expected to perform. Program code may be assembly or machine language, or data that may be compiled and/or interpreted. Furthermore, it is common in the art to speak of software, in one form or another as taking an action or causing a result. Such expressions are merely a shorthand way of stating execution of program code by a processing system which causes a processor to perform an action or produce a result.
  • Program code may be stored in, for example, volatile and/or non-volatile memory, such as storage devices and/or an associated machine readable or machine accessible medium including solid-state memory, hard-drives, floppy-disks, optical storage, tapes, flash memory, memory sticks, digital video disks, digital versatile discs (DVDs), etc., as well as more exotic mediums such as machine-accessible biological state preserving storage. A machine readable medium may include any mechanism for storing, transmitting, or receiving information in a form readable by a machine, and the medium may include a tangible medium through which electrical, optical, acoustical or other form of propagated signals or carrier wave encoding the program code may pass, such as antennas, optical fibers, communications interfaces, etc. Program code may be transmitted in the form of packets, serial data, parallel data, propagated signals, etc., and may be used in a compressed or encrypted format.
  • Program code may be implemented in programs executing on programmable machines such as mobile or stationary computers, personal digital assistants, set top boxes, cellular telephones and pagers, and other electronic devices, each including a processor, volatile and/or non-volatile memory readable by the processor, at least one input device and/or one or more output devices. Program code may be applied to the data entered using the input device to perform the described embodiments and to generate output information. The output information may be applied to one or more output devices. One of ordinary skill in the art may appreciate that embodiments of the disclosed subject matter can be practiced with various computer system configurations, including multiprocessor or multiple-core processor systems, minicomputers, mainframe computers, as well as pervasive or miniature computers or processors that may be embedded into virtually any device. Embodiments of the disclosed subject matter can also be practiced in distributed computing environments where tasks may be performed by remote processing devices that are linked through a communications network.
  • Although operations may be described as a sequential process, some of the operations may in fact be performed in parallel, concurrently, and/or in a distributed environment, and with program code stored locally and/or remotely for access by single or multi-processor machines. In addition, in some embodiments the order of operations may be rearranged without departing from the spirit of the disclosed subject matter. Program code may be used by or in conjunction with embedded controllers.
  • While the disclosed subject matter has been described with reference to illustrative embodiments, this description is not intended to be construed in a limiting sense. Various modifications of the illustrative embodiments, as well as other embodiments of the subject matter, which are apparent to persons skilled in the art to which the disclosed subject matter pertains are deemed to lie within the scope of the disclosed subject matter.

Claims (26)

1. An apparatus for performing motion estimation for multiple video encoding standards, comprising:
an integer search module to search a reference frame for the macroblock at integer pixel locations (“integer search”), the integer search module calculating distortion between each 4×4 pixel subblock unit in the macroblock and its corresponding 4×4 subblock in the reference frame; and
a macroblock partitioning module to determine a macroblock partition based on the distortions of 4×4 subblock units from the integer search module, the macroblock partitioning module deriving distortions for other dimension subblocks in the macroblock from the 4×4 subblock distortions, and the determined macroblock partition having the least overall distortion among all potential macroblock partitions.
2. The apparatus of claim 1, further comprising a skip checking module to determine whether a macroblock in the current frame may use a default predicted motion vector specified by an individual video encoding standard.
3. The apparatus of claim 1, further comprising a fractional search module to fine-tune the results from the integer search module by searching the macroblock in the reference frame at fractional pixel positions (“fractional search”), the fractional search being performed by using a unified interpolation formula for multiple video encoding standards.
4. The apparatus of claim 1, further comprising a bidirectional motion estimation refinement module to refine prediction directions for the macroblock based on the macroblock partition, the bidirectional motion estimation allowing subblocks in the macroblock to have unified or mixing unidirectional predictions and to have unified or mixing bidirectional predictions, as specified by an individual video encoding standard.
5. The apparatus of claim 1, further comprising an intra search module to conduct intra-frame search for the macroblock.
6. The apparatus of claim 1, further comprising a controller to control operations of modules in the apparatus, to coordinate among the modules in the apparatus, and to turn on or off options of each module in the apparatus for different video encoding standards, and to output motion estimation results to a video encoder.
7. The apparatus of claim 1, wherein a 4×4 pixel subblock unit is formed from consecutive pixel rows for frame type video signals or from alternated pixel rows for field type video signals.
8. A computing system, comprising:
a memory to store information associated with a reference frame; and
a motion estimation engine to perform motion estimation for a current frame and to produce a motion vector for the current frame based on the reference frame for various video encoding standards, the motion estimation engine including:
an integer search module to search a reference frame for the macroblock at integer pixel locations (“integer search”), the integer search module calculating distortion between each 4×4 pixel subblock unit in the macroblock and its corresponding 4×4 subblock in the reference frame; and
a macroblock partitioning module to determine a macroblock partition based on the distortions of 4×4 subblock units from the integer search module, the macroblock partitioning module deriving distortions for other dimension subblocks in the macroblock from the 4×4 subblock distortions, and the determined macroblock partition having the least overall distortion among all potential macroblock partitions.
9. The computing system of claim 8, wherein the motion estimation engine further comprises a skip checking module to determine whether a macroblock in the current frame may use a default predicted motion vector specified by an individual video encoding standard.
10. The computing system of claim 8, the motion estimation engine further comprises a fractional search module to fine-tune the results from the integer search module by searching the macroblock in the reference frame at fractional pixel positions (“fractional search”), the fractional search being performed by using a unified interpolation formula for multiple video encoding standards.
11. The computing system of claim 8, the motion estimation engine further comprises a bidirectional motion estimation refinement module to refine prediction directions for the macroblock based on the macroblock partition, the bidirectional motion estimation allowing subblocks in the macroblock to have unified or mixing unidirectional predictions and to have unified or mixing bidirectional predictions, as specified by an individual video encoding standard.
12. The computing system of claim 8, the motion estimation engine further comprises an intra search module to conduct intra-frame search for the macroblock.
13. The computing system of claim 8, the motion estimation engine further comprises a controller to control operations of modules in the apparatus, to coordinate among the modules in the apparatus, and to turn on or off options of each module in the apparatus for different video encoding standards, and to output motion estimation results to a video encoder.
14. The computing system of claim 13, wherein functions of the controller are performed by a graphics processing unit (“GPU”).
15. The computing system of claim 8, wherein a 4×4 pixel subblock unit is formed from consecutive pixel rows for frame type video signals or from alternated pixel rows for field type video signals.
16. The computing system of claim 8, wherein the motion estimation engine supports frame type, field type, and field-frame type based video encoding standards.
17. A method for performing motion estimation for multiple video encoding standards, comprising:
receiving a macroblock from a current frame;
checking whether an integer search is set to be performed for the macroblock;
if the integer search is set to be performed, performing search for a macroblock in a reference frame at integer pixel locations (“integer search”);
calculating distortion between each 4×4 pixel subblock unit in the macroblock and its corresponding 4×4 subblock in the reference frame;
deriving distortions for other dimension subblocks in the macroblock from the 4×4 subblock distortions; and determining a macroblock partition based on the 4×4 subblock distortions and the distortions of other dimension subblocks, the determined macroblock partition having the least overall distortion among all potential macroblock partitions.
18. The method of claim 17, further comprising:
if the integer search is not to be performed, checking whether an intra search is set to be performed for the macroblock; and
if the intra search is set to be performed, performing intra-frame search for the macroblock.
19. The method of claim 17, wherein subblocks in the macroblock include 2 16×8 subblocks, 2 8×16 subblocks, 4 8×8 subblocks, 8 8×4 subblocks, 8 4×8 subblocks, and 16 4×4 subblocks for frame or field type video signals; and 2 16×8 subblocks, 2 8×16 subblocks, 4 8×8 subblocks, 2 field 16×8 subblocks, and 4 field 8×8 subblocks for field-frame type video signals.
20. The method of claim 17, further comprising fine-tuning results from the integer search by searching the macroblock in the reference frame at fractional pixel positions (“fractional search”) if a fractional search is set to be performed, the fractional search being performed by using a unified interpolation formula for multiple video encoding standards.
21. The method of claim 17, further comprising refining prediction directions for the macroblock based on the macroblock partition if a bidirectional refinement is set to be performed, subblocks in the macroblock being allowed to have unified or mixing unidirectional predictions and to have unified or mixing bidirectional predictions to support various video encoding standards.
22. An article comprising a machine-readable medium that contains instructions, which when executed by a processing platform, cause said processing platform to perform operations including:
receiving a macroblock from a current frame;
checking whether an integer search is set to be performed for the macroblock;
if the integer search is set to be performed, performing search for a macroblock in a reference frame at integer pixel locations (“integer search”);
calculating distortion between each 4×4 pixel subblock unit in the macroblock and its corresponding 4×4 subblock in the reference frame;
deriving distortions for other dimension subblocks in the macroblock from the 4×4 subblock distortions; and
determining a macroblock partition based on the 4×4 subblock distortions and the distortions of other dimension subblocks, the determined macroblock partition having the least overall distortion among all potential macroblock partitions.
23. The article of claim 22, wherein the operations further comprise:
if the integer search is not to be performed, checking whether an intra search is set to be performed for the macroblock; and
if the intra search is set to be performed, performing intra-frame search for the macroblock.
24. The article of claim 22, wherein subblocks in the macroblock include 2 16×8 subblocks, 2 8×16 subblocks, 4 8×8 subblocks, 8 8×4 subblocks, 8 4×8 subblocks, and 16 4×4 subblocks for frame or field type video signals; and 2 16×8 subblocks, 2 8×16 subblocks, 4 8×8 subblocks, 2 field 16×8 subblocks, and 4 field 8×8 subblocks for field-frame type video signals.
25. The article of claim 22, wherein the operations further comprise fine-tuning results from the integer search by searching the macroblock in the reference frame at fractional pixel positions (“fractional search”) if a fractional search is set to be performed, the fractional search being performed by using a unified interpolation formula for multiple video encoding standards.
26. The article of claim 22, wherein the operations further comprise refining prediction directions for the macroblock based on the macroblock partition if a bidirectional refinement is set to be performed, subblocks in the macroblock being allowed to have unified or mixing unidirectional predictions and to have unified or mixing bidirectional predictions to support various video encoding standards.