WO2010036995A1 - Deriving new motion vectors from existing motion vectors - Google Patents


Info

Publication number
WO2010036995A1
Authority
WO
WIPO (PCT)
Prior art keywords
motion vectors
video information
portions
frames
existing motion
Application number
PCT/US2009/058546
Other languages
French (fr)
Inventor
Richard Webb
Original Assignee
Dolby Laboratories Licensing Corporation
Application filed by Dolby Laboratories Licensing Corporation
Publication of WO2010036995A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/40: using video transcoding, i.e. partial or full decoding of a coded input stream followed by re-encoding of the decoded output stream
    • H04N19/50: using predictive coding
    • H04N19/503: using predictive coding involving temporal prediction
    • H04N19/51: Motion estimation or motion compensation
    • H04N19/533: Motion estimation using multistep search, e.g. 2D-log search or one-at-a-time search [OTS]
    • H04N19/56: Motion estimation with initialisation of the vector search, e.g. estimating a good candidate to initiate a search
    • H04N19/57: Motion estimation characterised by a search window with variable size or shape

Abstract

The motion vectors that exist in an encoded video data stream are used to derive new motion vectors for smaller blocks of video information. An iterative process uses the existing motion vectors and the newly derived motion vectors to control block matching operations for progressively smaller block sizes. New motion vectors can be derived for blocks of the video images that are outside those blocks represented by the existing motion vectors.

Description

Fine Spatial Granularity Motion Determination
TECHNICAL FIELD
The present invention pertains generally to video signal processing and pertains more specifically to signal processing that derives information about apparent motion in images represented by a sequence of pictures or frames of video data in a video signal.
BACKGROUND ART
A variety of video signal processing applications rely on the ability to detect apparent motion in images that are represented by a sequence of pictures or frames. Three examples of these applications are data compression, noise reduction and frame-rate conversion.
Some forms of data compression rely on motion detection between two pictures or frames so that one frame of video data can be represented more efficiently by inter-frame encoded data, or data that represents at least a portion of one frame of data in relative terms to a respective portion of data in another frame. One example of video data compression that uses motion detection is described in international standard ISO/IEC 13818-2 entitled "Generic Coding of Moving Pictures and Associated Audio Information: Video" and in Advanced Television Standards Committee (ATSC) document A/54 entitled "Guide to the Use of the ATSC Digital Television Standard."
The MPEG-2 technique compresses some frames of video data by spatial coding techniques without reference to any other frame of video data to generate respective I-frames of independent or intra-frame encoded video data. Other frames are compressed by temporal coding techniques that use motion detection and prediction. Forward prediction is used to generate respective P-frames or predicted frames of inter-frame encoded video data, and forward and backward prediction are used to generate respective B-frames or bidirectional frames of inter-frame encoded video data. MPEG-2 compliant applications may select frames for intra-frame encoding according to a fixed schedule, such as every fifteenth frame, or they may select frames according to an adaptive schedule. An adaptive schedule may be based on criteria related to the detection of motion or differences in content between adjacent frames, if desired.
Some noise-reduction techniques rely on the ability to identify blocks or portions of images in which motion occurs or, alternatively, in which no motion occurs. One system for noise reduction uses motion detection to control application of a temporal low-pass filter to corresponding picture elements or "pixels" in respective frames in a sequence of frames. This form of noise reduction avoids blurring the appearance of moving objects by applying its low-pass filter to only those areas of the image in which motion is not detected. One implementation of the low-pass filter calculates a moving average value for corresponding pixels in a sequence of frames and substitutes the average value for the respective pixel in the current frame. A sketch of such a filter is shown below.
Some frame-rate conversion techniques use motion vectors to interpolate the values of individual pixels between two adjacent frames. Pixel values are interpolated for specified times to generate new frames at the desired frame rate.
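To make the motion-gated filter described above concrete, the fragment below is a minimal sketch in Python (using numpy). The frame representation, the simple frame-difference motion detector and the threshold value are illustrative assumptions, not details taken from the documents cited above.

import numpy as np

def temporal_denoise(frames, motion_threshold=10.0):
    # frames: list of 2-D numpy arrays (grayscale), oldest first.
    # motion_threshold: illustrative per-pixel difference above which a
    # pixel is treated as moving and left unfiltered.
    stack = np.stack(frames).astype(np.float64)
    current = stack[-1]
    # Crude motion detector: absolute difference between the two most
    # recent frames.
    moving = np.abs(current - stack[-2]) > motion_threshold
    # Moving average over the temporal window, computed per pixel.
    averaged = stack.mean(axis=0)
    # Substitute the average only where no motion was detected, so that
    # moving objects are not blurred.
    return np.where(moving, current, averaged)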
Motion vectors that are suitable for data compression may not be suitable for other applications. MPEG-2 data compression uses a motion vector for inter-frame encoding to represent motion between two frames of video data. The MPEG-2 motion vector expresses the horizontal and vertical displacement of a corresponding portion of an image in source and destination areas of two different pictures or frames. Although the MPEG-2 motion vectors work fairly well for data compression applications, they are generally too coarse for noise reduction and frame-rate conversion. The performance of noise reduction and frame-rate conversion applications generally improves when the motion vectors represent very small blocks or portions of images, but the computational resources needed to derive large numbers of motion vectors for small blocks or portions of images can be prohibitively expensive for reasons discussed below.
Several methods derive motion vectors by detecting differences between frames. One well-known method uses a technique called block matching, which compares the video data in a "current" frame of video data to the video data in a "reference" frame of data. The data in a current frame is divided into an array of blocks such as blocks of 16 x 16 pixels or 8 x 8 pixels, for example, and the content of a respective block in the current frame is compared to arrays of pixels within a search area in the reference frame. If a match is found between a block in the current frame and a region of the reference frame, motion for the portion of the image represented by that block can be deemed to have occurred. The search area is often a rectangular region of the reference frame having a specified height and width and having a location that is centered on the corresponding location of the respective block. The height and width of the search area may be fixed or adaptive. On one hand, a larger search area allows larger magnitude displacements to be detected, which correspond to higher velocities of movement. On the other hand, a larger search area increases the computational resources that are needed to perform block matching.
An example may help illustrate the magnitude of the computational resources that can be required for block matching. In this example, each frame of video data is represented by an array of 1920 x 1080 pixels, and each frame is divided into blocks of 8 x 8 pixels. As a result, each frame is divided into an array of 32,400 = 240 x 135 blocks. The search region is centered on the location of the respective block to be matched and is 48 pixels high and 64 pixels wide. In one implementation, each pixel in a block is compared to its respective pixel in all 8 x 8 blocks of the search region. In this example, a search region away from the edge of the picture has 2240 = 56 x 40 blocks; therefore, more than 143,000 pixel comparisons are needed to check for motion of a single block. Fewer comparisons are needed for search regions at or near the edge of the picture because the search region is bounded by the edge of the picture. Nevertheless, nearly 4.5 x 10⁹ pixel comparisons are needed for each frame. If the frame is part of a video data stream that presents its data at a rate of sixty frames per second, then more than 267 x 10⁹ pixel comparisons must be performed each second just to compare pixels in adjacent frames.
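The counts in this example are easy to verify; the following lines simply redo the arithmetic, counting block positions as 56 x 40 just as the text does.

frame_w, frame_h = 1920, 1080
blk = 8
search_w, search_h = 64, 48

blocks_per_frame = (frame_w // blk) * (frame_h // blk)   # 240 * 135 = 32,400
positions = (search_w - blk) * (search_h - blk)          # 56 * 40 = 2,240
comparisons_per_block = positions * blk * blk            # 143,360
comparisons_per_frame = blocks_per_frame * comparisons_per_block
print(comparisons_per_frame)       # 4,644,864,000, on the order of 4.5 x 10^9
print(60 * comparisons_per_frame)  # ~2.8 x 10^11, i.e. more than 267 x 10^9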
A correspondingly higher number of comparisons are needed if block matching is to be done for smaller block sizes. Preferably, the additional resources should be expended only if the additional benefits are worth the cost of those additional resources.
Some systems incorporate processing hardware with pipelined architectures to obtain higher processing capabilities at lower cost, but even these lower costs are too high for many applications. Optimization techniques have been proposed to reduce the computational requirements of block matching, but these techniques have not been as effective as desired because they require conditional logic that disrupts the processing flow in processors that have a pipelined architecture. In addition, these techniques have not provided a way to optimize the use of processing resources with respect to the value of the benefits gained by the processing.
DISCLOSURE OF INVENTION
It is an object of the present invention to provide more efficient ways to derive motion vector representations for very small areas of an image.
It is another object of the present invention to provide for the control of motion vector derivation so that greater computational resources are used when greater benefits can be obtained from that derivation.
These objects are achieved by the methods for deriving motion vectors set forth below in the section of this disclosure labeled as claims.
In this context and throughout the remainder of this disclosure, the term "block" refers to a portion of an image or picture and the term "motion vector" refers to any data construct that can represent a block in one frame of data in relative terms to a respective block in another frame, which typically expresses motion between two frames of video data. In many applications, a block comprises a square or rectangular array of pixels, but it may comprise pixels arranged in other simple shapes such as a circle or an ellipse, or complex shapes with irregular outlines and holes. A motion vector is not limited to any particular data construct; the construct set forth in the MPEG-2 standard is one suitable example, and a similar "motion vector" data construct is set forth in part 10 of the ISO/IEC 14496 standard, also known as MPEG-4 Advanced Video Coding (AVC) or the ITU-T H.264 standard. The motion vector defined in the MPEG-2 standard specifies a source area of one picture, a destination area in a second picture, and the horizontal and vertical displacements from the source area to the destination area. Additional information may be included in or associated with a motion vector.
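As an illustration only, and not the data construct of any particular standard, a motion vector of the kind just described might be modeled as follows; every field name here is hypothetical.

from dataclasses import dataclass

@dataclass
class MotionVector:
    # Source and destination frame indices within the sequence.
    src_frame: int
    dst_frame: int
    # Top-left corner and size of the destination area, in pixels.
    top: int
    left: int
    width: int
    height: int
    # Horizontal and vertical displacement from source to destination.
    dx: int
    dy: int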
The various features of the present invention and its preferred embodiments may be better understood by referring to the following discussion and the accompanying drawings. The contents of the following discussion and the drawings are set forth as examples only and should not be understood to represent limitations upon the scope of the present invention.
BRIEF DESCRIPTION OF DRAWINGS
Fig. 1 is a schematic block diagram of an exemplary system that incorporates various aspects of the present invention.
Fig. 2 is a schematic illustration of a sequence of pictures or frames of video data in an MPEG-2 compliant encoded video data stream.
Fig. 3 is a schematic illustration of two video frames with a motion vector.
Fig. 4 is a schematic block diagram of a process for deriving new motion vectors.
Fig. 5 is a schematic illustration of two video frames and a new motion vector.
Fig. 6 is a schematic block diagram of a device that may be used to derive new motion vectors according to teachings of the present invention.
MODES FOR CARRYING OUT THE INVENTION
A. Introduction
Fig. 1 is a schematic block diagram of an exemplary system 10 incorporating aspects of the present invention that may be used to derive new motion vectors representing very small image areas from motion vectors representing much larger areas that already exist in a video data stream. The Motion Vector Processor (MVP) 2 receives video information conveyed in the video data stream from the signal path 1, analyzes the existing motion vectors to derive new motion vectors that are not present in the stream, passes the new motion vectors along the path 3 and can, if desired, also pass the existing motion vectors along the path 3.
The Video Signal Processor (VSP) 4 receives the video data stream from the path 1, receives the new motion vectors from the path 3, receives the existing motion vectors from either the path 1 or the path 3, and applies signal processing to at least some of the video information conveyed in the video data stream to generate a processed signal that is passed along the signal path 5. The VSP 4 adapts its signal processing in response to the new motion vectors and, preferably, also in response to the existing motion vectors. The VSP 4 may apply essentially any type of signal processing that may be desired such as, for example, noise reduction, image resolution enhancement and frame-rate conversion.
The present invention can process motion vectors in video data streams that conform to a wide variety of formats and standards. No format or standard is essential to the present invention, but the following disclosure uses terminology that is consistent with the MPEG-2 standard. For example, independently encoded frames, or intra-frame encoded data, are referred to as I-frames. Dependently encoded frames, or inter-frame encoded data, are referred to as P-frames.
Fig. 2 is a schematic illustration of a sequence of pictures or frames of video data in an MPEG-2 compliant encoded video data stream. This particular sequence includes two I-frames 33, 39 and five intervening P-frames 34 to 38. The encoded data in each P-frame may include one or more motion vectors for blocks of pixels in that frame, which are based on or predicted from corresponding blocks of pixels in a prior frame. The P-frame 34, for example, may contain one or more motion vectors representing one or more blocks in motion between the I-frame 33 and the P-frame 34. An MPEG-2 compliant video data stream may also include B-frames with bidirectional motion vectors.
Fig. 3 is a schematic diagram of two frames of video data within a sequence of frames. In this example, the upper frame is a source frame and the lower frame is a destination frame. If the frames are part of an MPEG-2 compliant video data stream, then the destination frame is either a P-frame or a B-frame because it has a motion vector that represents motion from the source area 41 in the source frame to the destination area 42 in the destination frame. The magnitude and direction of motion is illustrated by the vector 51. In a preferred implementation, the magnitude and direction of motion are expressed by the amounts of horizontal and vertical displacement between the source and destination areas. No particular format or data construct is essential to the present invention. The destination frame may have more than one motion vector representing motion occurring in multiple areas from the source frame to the destination frame. If the two frames are in an MPEG-2 compliant video data stream, the destination frame shown in the figure may be a B-frame. If the destination frame is a B-frame, it may have motion vectors from another source frame. In preferred implementations, the present invention is applied to all existing motion vectors. For the sake of simplicity, however, only one existing motion vector is discussed.
B. Derivation of New Motion Vectors
The present invention derives new motion vectors from one or more existing motion vectors. The new motion vectors can represent motion for very small blocks or portions of images. Implementations of the present invention are efficient and tend to be self-optimizing because more processing resources are applied to video frames that have more existing motion vectors. A larger number of existing motion vectors generally means that a video frame represents an image with more identified and useful motion; greater benefits are therefore more likely to be achieved for those video frames.
The diagram in Fig. 4 illustrates one iterative process that may be used to derive new motion vectors for square blocks. Step 101 obtains one or more existing motion vectors that represent areas of motion between two frames of video information. For example, if the sequence of frames is within an encoded video data stream that conforms to the MPEG-2 standard, the existing motion vectors can be obtained directly from the encoded video data stream.
Step 102 obtains a set of search vectors (SV) and selects the size of the pixel blocks (BLKSIZE) that will be processed in this iteration. For the initial iteration, the search vectors are obtained from the existing motion vectors and the block size is set to a maximum value.
Step 103 selects a search vector from the set of search vectors and defines a candidate region (CR) in the destination frame that includes and is larger than the destination area of the selected search vector. In preferred implementations, the size of the candidate region is calculated as a function of block size. In one exemplary implementation that uses square candidate regions and square blocks, and in which the destination area corresponds to one block, the candidate region for a particular search vector is centered on the destination area of the search vector and has a width and height equal to three times the block size. The top and left boundary locations and the height and width of the candidate region CR and the search vector destination area SVDA may be expressed in pixels as follows:
CR_top = SVDA_top - BLKSIZE
CR_left = SVDA_left - BLKSIZE
CR_width = SVDA_width + 2 * BLKSIZE
CR_height = SVDA_height + 2 * BLKSIZE

where the origin (0,0) of the pixel coordinate system is at the top left corner of the image in the destination frame, the horizontal coordinate increases to the right and the vertical coordinate increases down the image. Referring to Fig. 5 as a hypothetical example, the candidate region 61 includes the destination area 42 of the selected search vector. Step 104 selects a candidate block (BLK) of pixels within the candidate region. In the example shown in Fig. 5, the candidate block 46 is selected from a uniformly-spaced grid of blocks having the selected block size.
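These boundary equations translate directly into code. A minimal sketch, with variable names following the text:

def candidate_region(svda_top, svda_left, svda_width, svda_height, blksize):
    # Grow the search vector's destination area by BLKSIZE on every side.
    return (svda_top - blksize,          # CR_top
            svda_left - blksize,         # CR_left
            svda_width + 2 * blksize,    # CR_width
            svda_height + 2 * blksize)   # CR_height

# For a 16 x 16 destination area at top = 100, left = 200 and BLKSIZE = 16,
# the candidate region is 48 x 48 and centered on the destination area:
print(candidate_region(100, 200, 16, 16, 16))   # (84, 184, 48, 48)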
Step 105 defines a search region (SR) in the source frame for block matching. This may be done by traversing the selected search vector in the reverse direction from the candidate block to the source frame to identify an initial search block within the search region. In preferred implementations, the size of the search region is calculated as a function of block size. The size of the search region is typically very small. In one exemplary implementation that uses square search regions and square blocks, the width and height of the search region are set equal to 2, 1, 1 and 1 pixels for block sizes equal to 8, 4, 2 and 1 pixels, respectively. The use of small search regions arises from a desire to stay close to the displacement values of the existing motion vectors. In this example with the maximum search region equal to two pixels, each new motion vector will differ from the existing motion vector from which it was derived by no more than five pixels.
Referring to the example shown in Fig. 5, the selected search vector is traversed in the reverse direction along the vector 51 from the candidate block 46 to the block 55 in the source frame. The search region 62 is defined to include the initial search block 55. Preferably, the search region is defined to be symmetrical about the initial search block unless this is not possible because the search block is at the edge of an image.
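A sketch of step 105 follows. Here the figures 2, 1, 1 and 1 are interpreted as the search range in pixels around the initial search block; that is one reading of the text and is flagged as an assumption.

def search_positions(block_top, block_left, sv_dx, sv_dy, blksize):
    # Initial search block: traverse the search vector in reverse.
    init_top = block_top - sv_dy
    init_left = block_left - sv_dx
    # Small, block-size-dependent search range (an assumption; see above).
    r = {8: 2, 4: 1, 2: 1, 1: 1}.get(blksize, 2)
    # Candidate positions, symmetrical about the initial search block.
    return [(init_top + dt, init_left + dl)
            for dt in range(-r, r + 1)
            for dl in range(-r, r + 1)]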
Step 106 performs a block matching operation to find the block in the source frame that is the best match with the candidate block in the destination frame. The search may start with any block in the search region, but the initial search block is a convenient choice. The block matching operation generates a measure of block difference between the candidate block and each block in the search region and selects the block that has the smallest measure of block difference. The measure of block difference may be generated in a variety of ways. For example, the measure may be generated by calculating the average of the absolute differences between the values of corresponding pixels in the two blocks, or by calculating the average of the square of differences between the values of corresponding pixels in the two blocks. The differences between corresponding pixels may be weighted by an array of coefficients that emphasizes the pixels nearer the center of the blocks.
Preferably, the measure of block difference is weighted by a bias factor so that matching blocks tend to be found closer to the center of the search region. The value of the bias factor should be chosen so that it is just large enough to dominate variations in the difference measure that are caused by noise in the pixel values. When implemented properly, the bias factor prevents a block far from the search region center from being selected as the matching block only because it has a lower measure of block difference due to noise.
For example, the measure of block difference for a particular block in the search frame may be weighted by a bias factor that is derived from the average total noise in a block and scaled proportionally to the distance in pixels between the center of the search region and the center of the particular block in the search frame. This may be expressed as:
F(s) = N · Dist(s)

where
F(s) = bias factor for block s in the search region;
N = average noise per block; and
Dist(s) = distance in pixels from the center of block s to the center of the search region.

The average noise per block may be determined in a variety of ways. One way derives this noise from a video data stream by calculating the average measure of block difference for the ten percent of all motion vectors in the data stream having the smallest measures of block difference.
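A sketch of the biased difference measure follows. The text does not state exactly how the bias is combined with the raw score, so adding F(s) to the mean absolute difference is an assumption.

import numpy as np

def biased_difference(cand, ref, center_dist, avg_noise):
    # cand, ref: candidate and reference blocks (2-D numpy arrays, same shape).
    # center_dist: Dist(s), pixels from the center of block s to the center
    # of the search region. avg_noise: N, the average noise per block.
    diff = np.abs(cand.astype(np.float64) - ref.astype(np.float64)).mean()
    bias = avg_noise * center_dist       # F(s) = N * Dist(s)
    # A distant block must beat a central one by more than the noise floor.
    return diff + bias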
The block matching operation may consider the surrounding context of each block by also calculating the differences between corresponding pixels that surround the blocks. This feature may improve the accuracy of block matching for small block sizes. Preferably, these differences are weighted non-uniformly by an array of coefficients that emphasizes the pixels nearer the center of the blocks. Three examples of arrays are shown for block sizes of four, two and one pixels. For blocks of 4 x 4 pixels, a suitable array of coefficients is:
[Array of coefficients shown as figure imgf000011_0001 in the original document.]
The 4 x 4 set of coefficients in the central portion of this array is only 25% of the total coefficients, but its contribution to the measure of block difference is 57%. For blocks of 2 x 2 pixels, a suitable array of coefficients is:
[Array of coefficients shown as figure imgf000011_0002 in the original document.]
The 2 x 2 set of coefficients in the central portion of this array provides 52% of the contribution to the measure of block difference. The set of coefficients with the value of 6 that are immediately adjacent to the central core provides 29% of the contribution. The set of coefficients with the value 1 that are farthest from the central core provides 19% of the contribution. For blocks of 1 pixel, a suitable array of coefficients is:
[Array of coefficients shown as figure imgf000012_0001 in the original document.]
The coefficient in the central portion of this array provides 55% of the contribution to the measure of block difference. The set of coefficients with the value of 8 that are immediately adjacent to the central core provides 28% of the contribution. The set of coefficients with the value 1 that are farthest from the central core provides 17% of the contribution.
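Because the coefficient arrays survive only as figures, the sketch below uses a made-up 8 x 8 weighting array for a 4 x 4 block plus a two-pixel surround, purely to show the mechanics of a context-weighted difference measure; the values, and the resulting percentages, are not the patent's.

import numpy as np

# Hypothetical weights, larger toward the center; NOT the patent's array.
WEIGHTS_4x4 = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 2, 2, 2, 2, 2, 2, 1],
    [1, 2, 6, 6, 6, 6, 2, 1],
    [1, 2, 6, 9, 9, 6, 2, 1],
    [1, 2, 6, 9, 9, 6, 2, 1],
    [1, 2, 6, 6, 6, 6, 2, 1],
    [1, 2, 2, 2, 2, 2, 2, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
], dtype=np.float64)

def weighted_block_difference(cand_ctx, ref_ctx, weights=WEIGHTS_4x4):
    # cand_ctx, ref_ctx: the 4 x 4 blocks together with their surrounding
    # context pixels, shaped like `weights` (8 x 8 here).
    diff = np.abs(cand_ctx.astype(np.float64) - ref_ctx.astype(np.float64))
    return float((diff * weights).sum() / weights.sum())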
When the best match has been found, a new motion vector is calculated for the candidate block in the destination frame and the block in the source frame that is the best match. Referring to the example shown in Fig. 5, the block 45 is the best match to the candidate block 46. A motion vector is generated to represent motion from the block 45 to the block 46. This new motion vector is added to a set of new vectors generated for this iteration.
In one implementation, the search is confined to consider only those blocks that are contained completely within the search region but is not limited to blocks that are aligned with the grid mentioned above. The example shown in Fig. 5 is one possible result that can be achieved with this implementation. If desired, the search can be implemented to consider blocks that are only partially contained within the search region, or to consider only blocks that are aligned with the grid. No particular implementation is essential.
Step 107 determines whether a match has been found for all blocks in the candidate region. If not, the process continues with step 104, which selects another candidate block in the candidate region, and the block matching operation reiterates. If a match has been found for all blocks in the candidate region, the process continues with step 108. Step 108 determines whether all search vectors in the set of search vectors have been processed. If not, the process continues with step 103, which selects the next search vector from the set and defines the candidate region for this search vector. If all search vectors have been processed, the process continues with step 109. Step 109 determines whether the block size used in the last iteration is small enough. If it is, the set of new vectors generated for the last iteration are the new motion vectors for the source and destination frames. If the block size is not small enough, the process continues with step 102, which reduces the block size and obtains the set of search vectors for the next iteration. In one implementation, the initial block size is a power of two and each iteration reduces the block size by one half until the desired block size is processed. For example, the initial block size may be 16 pixels and the minimum block size may be one pixel.
Step 102 initially obtains the search vectors from the set of existing motion vectors. For subsequent iterations, the search vectors are obtained from the new vectors generated in the previous iteration.
One exemplary method that may be used to derive new motion vectors from existing motion vectors is discussed below in conjunction with a fragment of a computer program written in pseudo code. The program fragment is presented to help illustrate a few concepts for the exemplary method and is not intended to represent a complete implementation for practical application.
(1)  for each EMV(k)
(2)  .  SV(k) = EMV(k)
(3)  next EMV(k)
(4)  for BlkSz from Smax to Smin
(5)  .  for each FD.Block(d)
(6)  .  .  LowScore(d) = MaxScore
(7)  .  .  NewVector(d) = Null
(8)  .  next FD.Block(d)
(9)  .  for each SV(k)
(10) .  .  CR = CandidateRegion{SV(k).Dest, BlkSz}
(11) .  .  for each FD.Block(d) in CR
(12) .  .  .  SR = SearchRegion{SV(k), FD.Block(d)}
(13) .  .  .  for each FS.Block(s) in SR
(14) .  .  .  .  Diff = DifferenceScore{FD.Block(d), FS.Block(s)}
(15) .  .  .  .  if Diff < LowScore(d) then
(16) .  .  .  .  .  LowScore(d) = Diff
(17) .  .  .  .  .  NewVector(d) = Vector{FS.Block(s), FD.Block(d)}
(18) .  .  .  .  end if
(19) .  .  .  next FS.Block(s)
(20) .  .  next FD.Block(d)
(21) .  next SV(k)
(22) .  clear all SV(k)
(23) .  for each NewVector(k)
(24) .  .  SV(k) = NewVector(k)
(25) .  next NewVector(k)
(26) next BlkSz
In lines (1) through (3), the program identifies the existing motion vectors EMV for a destination frame FD and a source frame FS and saves each of the existing motion vectors EMV(k) as a respective search vector SV(k). Lines (4) through (26) of the program implement a loop that derives new motion vectors for progressively smaller block sizes BlkSz. The initial derivation is performed for a block size equal to some maximum value Smax and the loop iterates until the block size reaches a minimum value Smin.
Lines (5) to (8) of the program initialize two variables, LowScore(d) and NewVector(d), for each destination block FD.Block(d) in the destination frame FD.
Lines (9) through (21) of the program implement a loop that derives one or more new motion vectors for each SV(k). In line (10), the program identifies in the destination frame FD a candidate region CR including the search vector's destination, which is denoted as SV(k).Dest. Preferably, the size of the candidate region CR is determined as a function of the block size BlkSz.
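A hedged sketch of how CandidateRegion might be realized follows; the text states only that the region's size is a function of the block size, so the scale factor below is an assumption:

def candidate_region(dest_x, dest_y, blk_sz, scale=2):
    """Return (left, top, right, bottom) of a square candidate region
    centered on the search vector's destination SV(k).Dest.  The sizing
    factor 'scale' is assumed, not specified by the text."""
    half = scale * blk_sz
    return (dest_x - half, dest_y - half, dest_x + half, dest_y + half)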
Lines (11) through (20) of the program implement a loop that derives a new motion vector for each block of pixels FD.Block(d) in the destination frame FD that is within the candidate region CR. Within this loop, line (12) defines a search region SR in the source frame FS. The search region SR is defined to include an initial search block in the source frame that is displaced from the position of the block FD.Block(d) by an amount equal but opposite to the displacement of the search vector SV(k). The size of the search region is determined as a function of the block size BlkSz. Lines (13) to (19) of the program consider each block FS.Block(s) in the source frame FS that is within the search region SR and calculate a measure of block difference Diff between pixels in the block FD.Block(d) and the block FS.Block(s). In one implementation, the difference is calculated from a sum of absolute differences between values of corresponding pixels in and around the two blocks that are weighted by arrays of coefficients as discussed above. Line (15) determines whether the measure of block difference Diff is less than the lowest score LowScore(d). If it is, then the block FS.Block(s) is a better match with the block FD.Block(d) than any other block considered thus far in this iteration, and lines (16) and (17) save the new low score and calculate a vector NewVector(d) between these two blocks. The loop in lines (13) to (19) continues until all blocks FS.Block(s) in the source frame FS that are within the search region SR have been considered. At the termination of this loop, the variable NewVector(d) contains a new motion vector for the block FD.Block(d) that was derived from the search vector SV(k).
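The placement of the search region can be sketched as follows. The search vector components (sv_dx, sv_dy) point from source to destination, so the region is centered at the destination block's position minus that displacement; the sizing factor is again an assumption:

def search_region(block_x, block_y, sv_dx, sv_dy, blk_sz, scale=1):
    """Return (left, top, right, bottom) of a search region in the source
    frame, centered at the destination block's position displaced by an
    amount equal but opposite to the search vector."""
    cx, cy = block_x - sv_dx, block_y - sv_dy
    half = scale * blk_sz
    return (cx - half, cy - half, cx + half, cy + half)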
The loop in lines (11) to (20) terminates after all blocks FD.Block(d) that are within the candidate region CR have been considered. At the conclusion of this loop, the set of variables NewVector( ) contains all of the new motion vectors that were derived from the search vector SV(k).
The loop in lines (9) to (21) terminates after all search vectors SV(k) have been considered. At the conclusion of this loop, the set of variables NewVector( ) contains all of the new motion vectors that were derived from the set of search vectors SV( ) that existed when the loop began in line (9).
Lines (22) through (25) replace all of the search vectors SV( ) with the set of new motion vectors NewVector( ). The loop in lines (4) to (26) iterates until the block size BlkSz reaches its smallest value Smin. At the termination of this loop, NewVector( ) represents the set of new motion vectors that were derived for the initial set of existing motion vectors. In this particular implementation, the new motion vectors represent motion between blocks of pixels that have a uniform size equal to Smin.
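For readers who prefer executable code, the following is a hedged Python translation of the pseudo-code fragment in lines (1) through (26). The region sizes, the grid-free exhaustive block positions, and the plain sum-of-absolute-differences score are illustrative assumptions; the pseudo code leaves these details unspecified, and a practical implementation would use the coefficient-weighted measure discussed above.

import numpy as np

def derive_motion_vectors(fd, fs, existing_vectors, s_max=16, s_min=1):
    """Derive new motion vectors for destination frame fd (2-D array)
    from source frame fs, seeded by existing_vectors: a list of
    ((src_x, src_y), (dst_x, dst_y)) pairs."""
    h, w = fd.shape
    search_vectors = list(existing_vectors)          # lines (1)-(3)
    blk = s_max
    while blk >= s_min:                              # lines (4)-(26)
        low_score, new_vector = {}, {}               # lines (5)-(8)
        for (sx, sy), (dx, dy) in search_vectors:    # line (9)
            vx, vy = dx - sx, dy - sy                # source-to-destination displacement
            # Line (10): candidate region around the vector's destination.
            for by in range(max(0, dy - 2 * blk), min(h - blk, dy + 2 * blk) + 1):
                for bx in range(max(0, dx - 2 * blk), min(w - blk, dx + 2 * blk) + 1):
                    d_block = fd[by:by + blk, bx:bx + blk]
                    # Line (12): search region displaced opposite to the vector.
                    cx, cy = bx - vx, by - vy
                    for ty in range(max(0, cy - blk), min(h - blk, cy + blk) + 1):
                        for tx in range(max(0, cx - blk), min(w - blk, cx + blk) + 1):
                            s_block = fs[ty:ty + blk, tx:tx + blk]
                            # Line (14): unweighted SAD stands in for the
                            # coefficient-weighted score discussed earlier.
                            diff = int(np.abs(d_block.astype(int)
                                              - s_block.astype(int)).sum())
                            if diff < low_score.get((bx, by), float("inf")):
                                low_score[(bx, by)] = diff           # lines (15)-(18)
                                new_vector[(bx, by)] = ((tx, ty), (bx, by))
        search_vectors = list(new_vector.values())   # lines (22)-(25)
        blk //= 2                                    # line (26)
    return search_vectors

Called on two grayscale frames with a handful of seed vectors, this returns per-block vectors at the minimum block size; restricting the candidate blocks to a grid, as discussed above, would reduce the amount of computation considerably.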
The implementation discussed above may be modified in a number of ways. For example, the order of iteration can be changed so that the search vectors SV(k) loop is nested inside the destination blocks FD.Block(d) loop. The portions of the frames referred to above as blocks need not be square or rectangular.

C. Implementation
Devices that incorporate various aspects of the present invention may be implemented in a variety of ways including software for execution by a computer or some other device that includes more specialized components such as digital signal processor (DSP) circuitry coupled to components similar to those found in a general-purpose computer. Fig. 6 is a schematic block diagram of a device 70 that may be used to implement aspects of the present invention. The processor 72 provides computing resources. RAM 73 is system random access memory (RAM) used by the processor 72 for processing. ROM 74 represents some form of persistent storage such as read only memory (ROM) for storing programs needed to operate the device 70 and possibly for carrying out various aspects of the present invention. I/O control 75 represents interface circuitry to receive and transmit signals by way of the communication channels 76, 77. In the embodiment shown, all major system components connect to the bus 71, which may represent more than one physical or logical bus; however, a bus architecture is not required to implement the present invention. In embodiments implemented by a general purpose computer system, additional components may be included for interfacing to devices such as a keyboard or mouse and a display, and for controlling a storage device 78 having a storage medium such as magnetic tape or disk, or an optical medium. The storage medium may be used to record programs of instructions for operating systems, utilities and applications, and may include programs that implement various aspects of the present invention.
The functions required to practice various aspects of the present invention can be performed by components that are implemented in a wide variety of ways including discrete logic components, integrated circuits, one or more ASICs and/or program-controlled processors. The manner in which these components are implemented is not important to the present invention.
Software implementations of the present invention may be conveyed by a variety of machine-readable media such as baseband or modulated communication paths throughout the spectrum including from supersonic to ultraviolet frequencies, or storage media such as tape, card, disk or electronic circuitry that records information using essentially any recording technology including magnetic, electrostatic and optical technologies.

Claims

1. A method for deriving new motion vectors from existing motion vectors, wherein the new motion vectors and the existing motion vectors represent movement in images represented by frames of video information in a sequence of frames of video information, where the method comprises:
obtaining a set of one or more existing motion vectors that represent movement from a source image represented by a source frame in the sequence of frames of video information to a destination image represented by a destination frame in the sequence of frames of video information;
identifying magnitude and direction of displacement of movement from a respective first area in the destination image to a respective second area in the source image for each existing motion vector in the set of existing motion vectors;
defining a candidate region in the destination image that includes the respective first area;
identifying first portions of video information within the candidate region that are smaller than the respective first area;
for a respective first portion within the candidate region, defining a search region in the source image that has a center that is displaced from a center of the respective first portion by the magnitude and direction of displacement of movement from the respective first area to the respective second area;
calculating measures of difference between the respective first portion of video information and second portions of video information within the search region and using the measures of difference to identify the second portion of video information that is a best match for the respective first portion of video information;
generating one or more new motion vectors not included in the set of one or more existing motion vectors that represent magnitude and direction of displacement of movement between the respective first portion of video information and the second portion of video information that is the best match for the respective first portion of video information; and
applying signal processing to at least some of the video information in the source frame or the destination frame to generate a processed signal representing a modified form of at least a portion of the sequence of images, wherein the signal processing adapts its operation in response to the one or more new motion vectors.
2. The method of claim 1 that iteratively generates new motion vectors, wherein for a given iteration the method comprises:
generating a plurality of new motion vectors from a respective existing motion vector in the set of existing motion vectors using first and second portions that have a size;
replacing the respective existing motion vector in the set with the plurality of new motion vectors generated therefrom; and
reducing the size of the first and second portions for a subsequent iteration.
3. The method of claim 1 or 2 that calculates the measures of difference by computing differences between values of corresponding pixels in the respective first portion and the second portions of video information and weighting the differences more heavily for pixels closer to centers of the respective first portion and the second portions.
4. The method of any one of claims 1 through 3 that calculates the measures of difference by computing differences between values of pixels that are inside and outside the respective first portion and the second portions.
5. The method of any one of claims 1 through 4 that weights the measures of difference by bias factors for the second portions of video information that increase for larger distances between centers of the second portions of video information and the center of the search region.
6. An apparatus for deriving new motion vectors from existing motion vectors, wherein the new motion vectors and the existing motion vectors represent movement in images represented by frames of video information in a sequence of frames of video information, where the apparatus comprises:
means for obtaining a set of one or more existing motion vectors that represent movement from a source image represented by a source frame in the sequence of frames of video information to a destination image represented by a destination frame in the sequence of frames of video information;
means for identifying magnitude and direction of displacement of movement from a respective first area in the destination image to a respective second area in the source image for each existing motion vector in the set of existing motion vectors;
means for defining a candidate region in the destination image that includes the respective first area;
means for identifying first portions of video information within the candidate region that are smaller than the respective first area;
means for defining a search region in the source image for a respective first portion within the candidate region, the search region having a center that is displaced from a center of the respective first portion by the magnitude and direction of displacement of movement from the respective first area to the respective second area;
means for calculating measures of difference between the respective first portion of video information and second portions of video information within the search region and using the measures of difference to identify the second portion of video information that is a best match for the respective first portion of video information;
means for generating one or more new motion vectors not included in the set of one or more existing motion vectors that represent magnitude and direction of displacement of movement between the respective first portion of video information and the second portion of video information that is the best match for the respective first portion of video information; and
means for applying signal processing to at least some of the video information in the source frame or the destination frame to generate a processed signal representing a modified form of at least a portion of the sequence of images, wherein the signal processing adapts its operation in response to the one or more new motion vectors.
7. The apparatus of claim 6 that iteratively generates new motion vectors, wherein for a given iteration the apparatus comprises:
means for generating a plurality of new motion vectors from a respective existing motion vector in the set of existing motion vectors using first and second portions that have a size;
means for replacing the respective existing motion vector in the set with the plurality of new motion vectors generated therefrom; and
means for reducing the size of the first and second portions for a subsequent iteration.
8. The apparatus of claim 6 or 7 that comprises means for calculating the measures of difference by computing differences between values of corresponding pixels in the respective first portion and the second portions of video information and weighting the differences more heavily for pixels closer to centers of the respective first portion and the second portions.
9. The apparatus of any one of claims 6 through 8 that comprises means for calculating the measures of difference by computing differences between values of pixels that are inside and outside the respective first portion and the second portions.
10. The apparatus of any one of claims 6 through 9 that comprises means for weighting the measures of difference by bias factors for the second portions of video information that increase for larger distances between centers of the second portions of video information and the center of the search region.
11. A storage medium recording a program of instructions that is executable by a device to perform a method for deriving new motion vectors from existing motion vectors, wherein the new motion vectors and the existing motion vectors represent movement in images represented by frames of video information in a sequence of frames of video information, where the method comprises:
obtaining a set of one or more existing motion vectors that represent movement from a source image represented by a source frame in the sequence of frames of video information to a destination image represented by a destination frame in the sequence of frames of video information;
identifying magnitude and direction of displacement of movement from a respective first area in the destination image to a respective second area in the source image for each existing motion vector in the set of existing motion vectors;
defining a candidate region in the destination image that includes the respective first area;
identifying first portions of video information within the candidate region that are smaller than the respective first area;
for a respective first portion within the candidate region, defining a search region in the source image that has a center that is displaced from a center of the respective first portion by the magnitude and direction of displacement of movement from the respective first area to the respective second area;
calculating measures of difference between the respective first portion of video information and second portions of video information within the search region and using the measures of difference to identify the second portion of video information that is a best match for the respective first portion of video information;
generating one or more new motion vectors not included in the set of one or more existing motion vectors that represent magnitude and direction of displacement of movement between the respective first portion of video information and the second portion of video information that is the best match for the respective first portion of video information; and
applying signal processing to at least some of the video information in the source frame or the destination frame to generate a processed signal representing a modified form of at least a portion of the sequence of images, wherein the signal processing adapts its operation in response to the one or more new motion vectors.
12. The medium of claim 11, wherein the method iteratively generates new motion vectors and for a given iteration the method comprises:
generating a plurality of new motion vectors from a respective existing motion vector in the set of existing motion vectors using first and second portions that have a size;
replacing the respective existing motion vector in the set with the plurality of new motion vectors generated therefrom; and
reducing the size of the first and second portions for a subsequent iteration.
13. The medium of claim 11 or 12, wherein the method calculates the measures of difference by computing differences between values of corresponding pixels in the respective first portion and the second portions of video information and weighting the differences more heavily for pixels closer to centers of the respective first portion and the second portions.
14. The medium of any one of claims 11 through 13, wherein the method calculates the measures of difference by computing differences between values of pixels that are inside and outside the respective first portion and the second portions.
15. The medium of any one of claims 11 through 14, wherein the method weights the measures of difference by bias factors for the second portions of video information that increase for larger distances between centers of the second portions of video information and the center of the search region.