WO2001074082A1 - Temporal interpolation of interlaced or progressive video images - Google Patents

Temporal interpolation of interlaced or progressive video images

Info

Publication number
WO2001074082A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
motion
information
motion vector
correlation
Application number
PCT/US2001/008789
Other languages
French (fr)
Inventor
Barry Kahn
Original Assignee
Teranex, Inc.
Application filed by Teranex, Inc. filed Critical Teranex, Inc.
Priority to AU2001247574A priority Critical patent/AU2001247574A1/en
Publication of WO2001074082A1 publication Critical patent/WO2001074082A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0135 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes
    • H04N7/014 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving interpolation processes involving the use of motion vectors
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/01 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level
    • H04N7/0117 Conversion of standards, e.g. involving analogue television standards or digital television standards processed at pixel level involving conversion of the spatial resolution of the incoming video signal
    • H04N7/012 Conversion between an interlaced and a progressive signal
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09G ARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2310/00 Command of the display device
    • G09G2310/02 Addressing, scanning or driving the display screen or processing steps related thereto
    • G09G2310/0229 De-interlacing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N11/00 Colour television systems
    • H04N11/06 Transmission systems characterised by the manner in which the individual colour picture signal components are combined
    • H04N11/20 Conversion of the manner in which the individual colour picture signal components are combined, e.g. conversion of colour television standards
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/14 Picture signal circuitry for video frequency region
    • H04N5/144 Movement detection
    • H04N5/145 Movement estimation

Definitions

  • The vertical interpolation of, for example, the reference field F_t can be performed by simply averaging the pixel values immediately above and below a pixel of interest.
  • Any desired interpolation technique, vertical or otherwise, can be used to fill in pixels of the scan lines needed to establish spatial alignment with the other field or fields to be used in the correlation process.
  • The vertically interpolated (VI) field is designated F_t^VI.
  • motion estimation can be performed in motion estimation unit 802 using a method and apparatus as described in the copending application.
  • The motion estimation unit performs the intraframe correlation and the interframe correlation. For each pixel of the reference field F_t, the interframe correlation and the intraframe correlation produce correlation surfaces C^I and C, respectively.
  • the correlation surfaces of the interframe correlation and the intraframe correlation are combined for each pixel into a composite correlation surface.
  • An unfiltered motion vector V1 and associated confidence metric are extracted using the correlation data C1, C2, C̄ for that pixel of the reference field F_t. This process is repeated until an unfiltered motion vector and associated confidence metric have been produced for all pixels.
  • the unfiltered motion vectors for all pixels of the reference frame are supplied to a filter to produce filtered motion vectors in a manner already described with respect to progressive images.
  • Correlation data extracted from the correlation surfaces are used by the motion estimation unit 802 to produce a confidence metric M for each pixel in a manner already described with respect to progressive images.
  • The Figure 8 motion estimation unit 802 outputs a filtered motion vector V(x,y) and a motion vector confidence metric M_XY for every pixel X,Y in the reference field F_t.
  a. Intraframe Correlation
  • The intraframe correlation utilizes fields F_t−1 and F_t, temporally spaced by one field, to produce a first correlation surface for each pixel of the reference field F_t.
  • The intraframe correlation process is implemented using pixel correspondence based on a search area ±Sx, ±Sy and a block ±Bx, ±By movable within the search area, as was described with respect to the pixel correspondence approach to correlating progressive frames.
  • The intraframe correlation results in a correlation point for each location of the block center within the search area. Each correlation point is determined as follows:

$$C_{XY}(x,y) = \sum_{i=-B_x}^{+B_x} \sum_{j=-B_y}^{+B_y} \left| F_t^{VI}(X+i,\, Y+j) - F_{t-1}(X+x+i,\, Y+y+j) \right|$$

  • The mapping of all correlation points for each location of the block center in the search area constitutes a correlation surface C for a given pixel in the reference field F_t.
  • Each correlation point on the correlation surface C represents a pixel location in the search area of the field F_t−1. This process is repeated for all pixels in the field of interest.
  b. Interframe Correlation
  • The interframe correlation utilizes spatially aligned fields F_t−1 and F_t+1 that are temporally separated by two fields (i.e., one frame, with a temporal separation of 2Δt).
  • the interframe correlation process is implemented using pixel correspondence based on a search area ⁇ Sx, ⁇ Sy and a block ⁇ Bx, ⁇ By movable within the search area, as was described with respect to the pixel correspondence approach to correlating progressive frames.
  • The interframe correlation results in a correlation point for each location of the block center within the search area. Each correlation point is determined as follows:

$$C_{XY}^{I}(x,y) = \sum_{i=-B_x}^{+B_x} \sum_{j=-B_y}^{+B_y} \left| F_{t+1}(X+i,\, Y+j) - F_{t-1}(X+x+i,\, Y+y+j) \right|$$
  • The mapping of all correlation points for each location of the block in the search area constitutes a correlation surface C^I for a given pixel in the reference field F_t.
  • Each correlation point on the correlation surface C^I represents a pixel location in the search area of the field F_t−1.
  • The correlation surface of the intraframe correlation implies the motion vector over a one-Δt time increment, whereas the interframe correlation surface implies motion over 2Δt.
  • A two-pixel shift between fields with a 2Δt separation has the same rate (assuming constant velocity) as a one-pixel shift between fields with a Δt separation.
  • The image motion implied by the correlation surface C^I is therefore normalized to the same rate (pixels per Δt) as the correlation surface C, such that these surfaces can be composited.
  • The composite surface can be used to extract the correlation data C1, C2, C̄, and to derive motion vector information for each pixel of the interlaced reference field F_t.
  c. Correlation Compositing
  • the outputs of the interframe and intraframe correlation are their respective correlation surfaces.
  • The surfaces are combined into a composite surface C′ = f(C, C^I), where f is a function such as simple summation, weighted summation, or multiplication, or a combination thereof (see the sketch following this list).
  • Intraframe correlation uses two inputs that are temporally one field apart, thus minimizing the effect of acceleration of objects within the image.
  • Interframe correlation uses two unmodified (non-interpolated) inputs, which provides a more accurate correlation.
  • Results of the correlation compositing for each pixel of the reference field F_t can be used to extract correlation data that, in turn, is used to derive a motion vector and confidence metric for each pixel of the reference field, in a manner as described with respect to processing of progressive images.
  • Filtering of the motion vectors can be performed in a manner as described with respect to that of the Figure 1 motion estimator.
  d. Temporal Interpolation For Motion Compensated De-Interlace
  • the Figure 8 system includes a vector scaling unit 804 and a motion compensation unit 806 to perform temporal interpolation for the de-interlace process.
  • a progressive (non-interlaced) motion picture sequence can be created from an interlaced sequence by synthesizing fields synchronized in time but opposite in phase for each existing field. The process is similar to progressive temporal interpolation: two motion compensated fields are generated.
  • The previous field F_t is motion compensated in motion compensation unit 806, using a temporal offset Δt of 1/2 from vector scaling unit 804, to interpolate the forward motion compensated field:

$$\hat{F}_t^{MCF}(x,y) = F_t\left(x + \tfrac{V_x}{2},\; y + \tfrac{V_y}{2}\right)$$

  • The following field F_t+1 is motion compensated in motion compensation unit 806 using a temporal offset Δt of 1/2 to interpolate the backward motion compensated field:

$$\hat{F}_t^{MCB}(x,y) = F_{t+1}\left(x - \tfrac{V_x}{2},\; y - \tfrac{V_y}{2}\right)$$

  • The motion compensated field can be blended with the vertically interpolated field in a quality metric blending unit, such as the blending unit 204 of Figure 2, using the confidence metric M and the function:

$$\hat{F}_t^{C}(x,y) = \hat{F}_t^{MC}(x,y) \cdot M(x,y) + F_t^{VI}(x,y) \cdot (1 - M(x,y))$$

where the quality metric blending unit has been configured to receive F̂_t^MC and F_t^VI.
  • Alternatively, a single two-dimensional vertical-temporal filter can be employed using at least two of the fields (including F_t).
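To make the compositing step concrete, the following Python sketch (not from the patent; the names, the equal weights, and the offset-doubling normalization are assumptions, since the exact normalization is not reproduced in this text) combines an intraframe surface with an interframe surface using a weighted summation as the composite function f:

```python
import numpy as np

# Illustrative sketch of correlation compositing. The interframe surface
# c_inter measures displacement over 2*dt, so a displacement (x, y) per dt
# corresponds to point (2x, 2y) on it; sampling at doubled offsets is one
# plausible way to normalize it to pixels per dt before combining.
def composite_surface(c_intra, c_inter, w_intra=0.5, w_inter=0.5):
    s = c_intra.shape[0] // 2          # surfaces span offsets -S..+S
    ys, xs = np.indices(c_intra.shape)
    yi = np.clip((ys - s) * 2 + s, 0, c_inter.shape[0] - 1)
    xi = np.clip((xs - s) * 2 + s, 0, c_inter.shape[1] - 1)
    c_inter_norm = c_inter[yi, xi]     # interframe surface in pixels per dt
    return w_intra * c_intra + w_inter * c_inter_norm  # weighted summation f
```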

Abstract

The present invention is directed to improving the accuracy of interpolated images to convert interlaced images into non-interlaced progressive images, and to convert the frame rate of either progressive non-interlaced images or interlaced fields (Fig. 7). The method comprises the steps of comparing first information obtained from pixels used to represent a first image with second information obtained from pixels used to represent a second image; extracting a motion vector from image motion among the at least two images; producing a measure of confidence of accuracy with which the motion vector is generated; synthesizing a first synthesized image using the motion vector; synthesizing a second synthesized image using at least one of the first information and the second information; and interpolating an image between the first image and the second image by combining the first synthesized image with the second synthesized image using the measure of confidence as a weighting factor.

Description

TEMPORAL INTERPOLATION OF INTERLACED OR PROGRESSIVE
VIDEO IMAGES
BACKGROUND OF THE INVENTION
Field of the Invention:
The present invention is generally related to image processing, and more particularly, to interpolating video images from interlaced or progressive video images.
Background Information:
One aspect of image processing involves interpolating synthetic images to allow for conversion from one frame rate to another. Such techniques are applied to both progressive and interlaced image sequences (e.g., progressive or interlaced television images). In a typical video sequence (e.g., NTSC or PAL), an image is portrayed by alternately displaying the odd and even lines (also referred to as "fields"). The time between the display of these fields is sufficiently short (1/60 second for NTSC, 1/50 second for PAL) that the fields appear as a complete interlaced image. Since time elapses between the display of the two fields, there is both a temporal and a spatial nonalignment of the odd and even lines. In contrast, progressive images contain both odd and even fields that were captured at the same instant in time.

Motion pictures (e.g., film or video) are composed of image or frame sequences that represent samples taken at regular intervals in time. If the frames are sampled at a sufficiently high rate, the appearance of smooth motion is achieved. Common sampling rates include 24 frames per second for film, 60 fields per second for NTSC standard video in the United States and Canada, and 50 fields per second for PAL standard video in Europe and elsewhere. To convert a motion picture sequence to a different sampling rate, new frames must be created which appear to be intermediate in time between frames sampled at the source frame rate; this process is called temporal interpolation.
The process of temporal interpolation is one of predicting the contents of an image frame that is temporally between available image frames. Where objects are in motion within the sequence of image frames, the interpolated frame must position those objects spatially between the object positions in the surrounding available image frames. In order to do this, a process of motion estimation is performed to determine the motion between available frames. The estimated motion is represented by a spatial offset known as a motion vector. Depending on the motion estimation technique employed, motion vectors may be computed with respect to an entire image frame (one vector per frame), with respect to pixel blocks of varying sizes (e.g., 16x16, 8x8, etc.), or with respect to individual pixels (one vector per pixel). The vector(s) are scaled in accordance with the proportion of the temporal offset of the interpolated frame with respect to the surrounding available frames. The vector(s) are applied to one or both of the surrounding frames in a process known as motion compensation to generate the interpolated image frame.
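As a minimal sketch of the scaling step (illustrative Python, not taken from the patent; the function name and calling convention are assumptions):

```python
# Scale a motion vector measured between frames at times t0 and t1 so that
# it maps the t0 frame to an interpolated frame at time tx (t0 <= tx <= t1).
def scale_motion_vector(vx, vy, t0, t1, tx):
    dt = (tx - t0) / (t1 - t0)   # fractional temporal offset in 0..1
    return vx * dt, vy * dt

# Example: a (+6, -2) pixel displacement over one frame pitch contributes
# (+1.5, -0.5) to a frame interpolated 25% of the way between the sources.
print(scale_motion_vector(6.0, -2.0, t0=0.0, t1=1.0, tx=0.25))
```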
The motion of an image object is not always at a constant rate; therefore, a motion vector generated by linear interpolation may not be strictly accurate. However, if a set of motion vectors is determined for each frame in sequence, the sequence of sets of motion vectors can be considered a piecewise linear approximation of nonlinear motion over the sequence. Given that a set of motion vectors represents a linear function mapping a frame from time t0 to t1, the function can be scaled linearly to create a frame at any time between t0 and t1. In practice, the process of temporal interpolation by motion compensation is more complex than a simple mapping. Several factors contribute to this complexity, including ambiguities in the motion estimation process and the covering and uncovering of image regions from one frame to the next due to objects in motion. With regard to interlaced images, an additional problem exists due to the spatial and temporal offset which exists from one image to the next. For example, in a single video frame constituted by two interlaced fields of information separated in space (e.g., by one line) and separated in time (e.g., by one half of a frame time), one field includes the odd numbered scan lines of an image, while the other includes the spatially offset even numbered scan lines. This complicates the task of motion estimation, since the spatial offset between successive fields introduces uncertainty into the process of comparing objects to determine their relative positions. The generation of progressive frames from interlaced fields involves the temporal interpolation of fields which are opposite in even/odd polarity to the available fields. Progressive frames are formed by joining the interpolated fields with the available fields. A conversion of interlaced fields to progressive frames at a different frame rate may also be achieved by generating and joining both odd and even fields by temporal interpolation.
Accordingly, it would be desirable to provide a method of temporal interpolation which avoids the inaccuracies found in existing techniques of motion compensation so that more reliable image conversion can be achieved.
SUMMARY OF THE INVENTION
The present invention is directed to improving the accuracy with which video images are interpolated to convert interlaced video images into non-interlaced progressive video images (i.e., de-interlacing), and to alter the frame rate of either progressive non-interlaced video images or interlaced video fields (i.e., frame rate conversion). Exemplary embodiments are directed to a method and apparatus for synthesizing an image using at least two images in a sequence of video images, such as two non-interlaced progressive video images or two interlaced images. In accordance with exemplary embodiments, the method comprises the steps of: comparing first information obtained from pixels used to represent a first image with second information obtained from pixels used to represent a second image; extracting a motion vector from image motion among the at least two images; producing a measure of confidence of the accuracy with which the motion vector is generated; synthesizing a first synthesized image using the motion vector; synthesizing a second synthesized image using at least one of the first information and the second information; and interpolating an image between the first image and the second image by combining the first synthesized image with the second synthesized image using the measure of confidence as a weighting factor.
BRIEF DESCRIPTION OF THE DRAWINGS
Other objects and advantages of the present invention will become more apparent to those skilled in the art upon reading the detailed description of the preferred embodiments, wherein like elements have been designated by like numerals, and wherein:
Figure 1 is a block diagram of an exemplary apparatus which uses motion estimation and motion compensation for frame interpolation in a progressive frame sequence according to the present invention;
Figure 2 is a block diagram of an exemplary temporal interpolation process according to the present invention;
Figures 3A and 3B show two different exemplary correlation surfaces;
Figure 4 illustrates an example of a single motion vector;
Figure 5 illustrates an exemplary use of motion vectors to create a motion compensated frame;
Figure 6 illustrates an exemplary process of temporal interpolation from 60 frames per second to 50 frames per second, in accordance with the present invention;
Figure 7 illustrates motion estimation and motion compensation for interpolation in an interlaced field sequence; and
Figure 8 illustrates use of motion vectors with interlaced images to create a motion compensated frame.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
1. Temporal Interpolation For Frame Rate Conversion
a. Frame Rate Conversion Of Non-interlaced Images
(1.) Motion Estimation
Figures 1 and 2 show portions of an exemplary frame rate conversion apparatus 100, configured as a functional block diagram, for frame interpolation in a progressive frame sequence. In the following discussion, "P" denotes progressive frames, "F" denotes interlaced fields, and synthesized frames or fields are identified by the symbols P̂ or F̂, respectively.
Motion estimation is the first part of the temporal interpolation process. In Figure 1, at least two images, such as two sequential frames of non-interlaced image data in a video signal, or any two frames, are processed to detect image motion. The two frames are sampled at a sample rate and are labeled P_t and P_t+1. For purposes of calculating a motion vector, one of the two frames (e.g., P_t) is identified as the frame of interest, and the other frame (e.g., P_t+1) is the next sequential image frame that is used to identify motion with respect to the image of interest. Those skilled in the art will appreciate that in this case (i.e., identifying motion from P_t to P_t+1, when P_t and P_t+1 are images in a temporally ordered sequence) motion will be determined in a forward direction from P_t to P_t+1. However, motion can alternately be assessed in a backward direction from P_t+1 to P_t, in which case P_t+1 is the image of interest. In the following discussion, P_t will be considered the frame of interest for simplicity. The two video frames P_t and P_t+1 are supplied to a motion estimation unit 102 to produce a motion vector and associated confidence metric for each pixel of either P_t or P_t+1. The motion estimation can be performed according to that described in U.S. Patent No. 5,016,102, the contents of which are hereby incorporated by reference in their entirety. Alternately, the motion estimation unit can be configured in accordance with a motion estimator as described in commonly assigned, co-pending U.S. Application Serial No. (Attorney
Docket No. 032797-062), entitled "PROCESSING SEQUENTIAL VIDEO IMAGES TO DETECT IMAGE MOTION AMONG INTERLACED OR PROGRESSIVE VIDEO IMAGES" , filed on even date herewith, the contents of which are incorporated herein by reference in their entirety.
In the copending application, the motion estimation unit 102 is described as producing a correlation surface for each pixel in the frame of interest, from which correlation data (C1_XY, C2_XY, C̄_XY) and a motion vector (V_XY) are extracted for each pixel. The correlation data is used to produce a confidence metric M(x,y) for each pixel. The frame correlation uses frames P_t and P_t+1 that are temporally separated by one frame, although any desired temporal separation can be used. Each motion vector is determined by spatially correlating a pixel in P_t with a pixel in P_t+1. Correlation is performed by comparing a region of pixels that surround and include a target pixel in the frame P_t with spatially shifted regions in the other frame P_t+1. The confidence metric is a measure of correlation confidence (i.e., a measure of correlation accuracy quantified, for example, as a value from 0 to 1) of the motion vector. The confidence is variable because some portions of the source frame may not be visible in the adjacent frame, or the correlation is ambiguous. In the motion estimation unit, frame correlation is performed by defining a search area within the frame P_t+1. For example, a search area ±Sx, ±Sy is defined with respect to the frame P_t+1 for a given pixel in the reference frame P_t. A block is defined which extends over image pixels from -Bx to +Bx and from -By to +By. The block center is located within the search area and used to calculate a correlation point C_XY on a correlation surface. This process is performed by repeatedly moving the block center to a new location within the search area to generate a set of correlation points (i.e., one correlation point is calculated for each location of the block). This process is repeated for each possible location of the block center within the search area, which extends from -Sx to +Sx and from -Sy to +Sy.
The set of correlation points is mapped into a correlation surface for the target pixel. The correlation surface will correspond in size to the search area ±Sx, ±Sy. The progressive frame correlation process by which each correlation point of the correlation surface for a given target pixel is calculated is defined by:
$$C_{XY}(x,y) = \sum_{i=-B_x}^{+B_x} \sum_{j=-B_y}^{+B_y} \left| P_t(X+i,\, Y+j) - P_{t+1}(X+x+i,\, Y+y+j) \right|$$
Alternately, where the motion estimation is performed in the direction from P_t+1 to P_t (with P_t+1 being the frame of interest), the progressive frame correlation process is defined by:

$$C_{XY}(x,y) = \sum_{i=-B_x}^{+B_x} \sum_{j=-B_y}^{+B_y} \left| P_{t+1}(X+i,\, Y+j) - P_t(X+x+i,\, Y+y+j) \right|$$
In these equations, "i" and "j" are integers incremented in accordance with the two summations shown. The values "x" and "y" account for the spatial and temporal offsets of the pixels in the second image P_t+1 with respect to the first image P_t. Thus, for each pixel in the frame P_t, a correlation surface is produced which comprises a SAD (sum of the absolute differences) for each location of the block center within the search area. Each SAD represents a correlation point C_XY on the correlation surface, the SAD being recomputed each time the block is moved within the search region. The mapping of all SADs for a given search area constitutes the correlation surface for a given pixel.
Because each SAD provides one correlation point C_XY on the correlation surface, the correlation surface is a two-dimensional array wherein each point is mapped to a pixel location in the search area of the frame P_t+1. Using the correlation surface, the pixel location to which image data of a given pixel in frame P_t has moved in frame P_t+1 can be determined. The lower the SAD associated with a given point on the correlation surface, the better the correlation.
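A minimal Python sketch of this per-pixel correlation surface follows (function and parameter names are illustrative, not from the patent; border handling is ignored by assuming the target pixel lies far enough from the frame edges):

```python
import numpy as np

def correlation_surface(p_t, p_t1, X, Y, Bx=3, By=3, Sx=7, Sy=7):
    """SAD correlation surface for the target pixel (X, Y) of frame p_t,
    searched over offsets -Sx..+Sx, -Sy..+Sy in frame p_t1."""
    surface = np.empty((2 * Sy + 1, 2 * Sx + 1))
    for y in range(-Sy, Sy + 1):
        for x in range(-Sx, Sx + 1):
            sad = 0.0
            for j in range(-By, By + 1):
                for i in range(-Bx, Bx + 1):
                    sad += abs(float(p_t[Y + j, X + i]) -
                               float(p_t1[Y + y + j, X + x + i]))
            surface[y + Sy, x + Sx] = sad   # correlation point C_XY(x, y)
    return surface   # lower SAD indicates better correlation
```

In practice the quadruple loop would be vectorized and the frames padded; the loop form is kept here only to mirror the summations above.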
Those skilled in the art will appreciate that any block size suitable for a particular application and the computation capability of the system can be used to generate a correlation surface for a given pixel, from which a motion vector for the given pixel can be derived. The correlation block size (±Bx pixels horizontally by ±By pixels vertically) can be set to a size large enough such that the SAD is statistically valid, yet small enough to be responsive to movement of small structures. In accordance with exemplary embodiments of the present invention, the motion estimation unit can be implemented using a parallel processor as described in commonly assigned, copending U.S. Application Serial No. 09/057,482, entitled MESH CONNECTED COMPUTER, the disclosure of which is hereby incorporated by reference in its entirety.
Two examples of correlation surfaces are shown in Figures 3A and 3B. In Figure 3A, there is a predominant "well" 302 representing a strong correlation surrounded by poor correlations. This correlation surface, labeled 304, provides a good indication that image data associated with the target pixel of interest in frame P_t has moved to the pixel location in frame P_t+1 which corresponds to the location of the "well". In Figure 3B, there is a "valley" 306 in the correlation surface 308 which is indicative of ambiguous correlations along a linear structure.
The correlation surface determined for a given pixel can be analyzed to extract the best (C1_XY) and second-best (C2_XY) correlation points in frame P_t+1 for the pixel of interest in frame P_t. That is, these points represent the best matches of image data to that of the given pixel in frame P_t for which the correlation surface was produced. The motion vector V1_XY associated with C1_XY (i.e., for a given (x,y) pixel coordinate) is selected as the most likely candidate for specifying the direction and the magnitude of motion the image data of the given pixel in frame P_t underwent between frames P_t and P_t+1. That is, the best correlation value C1_XY is the minimum value within the correlation surface for the given pixel and is used to extract a motion vector which represents the motion of the pixel's image data between frames P_t and P_t+1. The value C1_XY is defined by:
$$C1_{XY} = \min_{(x,y)} \, C_{XY}(x,y)$$

The geometry for computing a motion vector V_XY of a small correlation surface using the correlation data C1_XY is illustrated in Figure 4. Figure 4 illustrates a motion vector 402 associated with two images 404 and 406 (represented as the two frames P_t and P_t+1). The motion vector corresponds to the distance and direction which image data associated with a given pixel 408 has moved in transitioning from the pixel location in frame P_t to the pixel location in frame P_t+1. The position of C1_XY on the correlation surface associated with pixel 408 implies the motion that the image data associated with the pixel of interest has undergone. The motion vector associated with that correlation is:
$$V1_{XY} = (x,\,y) \quad \text{such that} \quad C_{XY}(x,y) = C1_{XY}$$
Only the motion vector associated with the best correlation (for each pixel in the image) is retained for subsequent filtering.
The second-best correlation value C2_XY and the average correlation value C̄_XY for a given pixel are provided to enable the computation of the correlation confidence metric. The second-best correlation value C2_XY is the next-ranked minimum located beyond a predetermined distance (e.g., a specified radius β) from the best value C1_XY for the given pixel. The use of a minimum radius increases the likelihood that the second-best correlation is not a false second-best correlation point associated with the best correlation point. The average correlation value (C̄_XY) for the surface is computed as follows:

$$\bar{C}_{XY} = \frac{\displaystyle\sum_{x=-S_x}^{+S_x} \sum_{y=-S_y}^{+S_y} C_{XY}(x,y)}{(2S_x+1)(2S_y+1)}$$
The foregoing process of determining the correlation data C1_XY, C2_XY and C̄_XY is repeated for each pixel in frame P_t, so that a motion vector and associated correlation data can be determined for every pixel.
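The extraction of C1, C2 and C̄ from a single surface can be sketched as follows (illustrative Python; the exclusion radius β and all names are assumptions consistent with the description above):

```python
import numpy as np

def correlation_data(surface, beta=2.0):
    """Best value C1, second-best C2 beyond radius beta, and average C-bar.
    The offset of C1 from the surface center gives the motion vector."""
    c1_idx = np.unravel_index(np.argmin(surface), surface.shape)
    c1 = surface[c1_idx]
    yy, xx = np.indices(surface.shape)
    dist = np.hypot(yy - c1_idx[0], xx - c1_idx[1])
    c2 = np.where(dist > beta, surface, np.inf).min()  # next-ranked minimum
    c_bar = surface.mean()
    sy, sx = surface.shape[0] // 2, surface.shape[1] // 2
    v1 = (c1_idx[1] - sx, c1_idx[0] - sy)   # motion vector (Vx, Vy)
    return c1, c2, c_bar, v1
```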
After generating motion vectors and confidence metrics for each pixel of frame P_t, the motion vectors can be optionally filtered in unit 102. The filter of the motion estimation unit 102 can be configured in accordance with any known motion vector filtering algorithm, including, but not limited to, those described in U.S. Patent No. 5,016,102. Alternately, the filtering can be performed according to that of the aforementioned U.S. Application. That is, the motion vectors can be processed to identify and replace anomalous vectors. Only those vectors deemed "bad" are replaced with "filtered" vectors, the remaining vectors being left unchanged. Although any known filtering technique can be used, in one exemplary embodiment, a vector can be flagged as bad if there are not at least two adjacent pixels in the frame P_t with the same motion vector components (x1_XY, y1_XY) as the center position. If the vector is flagged as bad, it can be replaced with a filtered vector. A filtered motion vector output of unit 102 is labeled V(x,y).
In addition to generating filtered motion vectors, exemplary embodiments generate a confidence metric for each motion vector as a measure of the accuracy with which the motion vector for the pixel has been generated. Where the motion estimation unit 102 is configured in accordance with the aforementioned copending U.S. application, a confidence metric computation uses the best correlation point C1_XY, the second-best correlation point C2_XY, and the average correlation value C̄_XY for computing the confidence metric of a given pixel.
Two confidence metrics are defined which are indicative of the accuracy of the best motion vector V_XY. The absolute confidence metric (M_XY^ABS) computes a ratio of the best correlation value with respect to the average correlation value of the surface. This confidence metric quantifies the correlation "strength" and is defined as:

$$M_{XY}^{ABS} = 1 - \frac{C1_{XY}}{\bar{C}_{XY}}$$

The relative confidence metric (M_XY^REL) computes a ratio of the difference between the correlation values of the best and second-best correlation points with respect to (1 − C1_XY). This confidence metric, which is a function of the difference between the correlation values C2_XY and C1_XY, quantifies the correlation "ambiguity" and is defined as:

$$M_{XY}^{REL} = \frac{C2_{XY} - C1_{XY}}{1 - C1_{XY}}$$

These can be further combined into a single confidence metric by, for example, a simple multiplication:

$$M_{XY} = M_{XY}^{ABS} \cdot M_{XY}^{REL}$$

where M_XY can, for example, be within a range of 0 and 1.
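A sketch of the two metrics and their combination (illustrative Python; it assumes the correlation values have been normalized to the range 0 to 1, which the surrounding text implies but the reproduced equations do not spell out):

```python
def confidence_metric(c1, c2, c_bar):
    """Combined confidence M = M_abs * M_rel, clamped to 0..1."""
    m_abs = 1.0 - c1 / c_bar if c_bar > 0 else 0.0       # correlation strength
    m_rel = (c2 - c1) / (1.0 - c1) if c1 < 1.0 else 0.0  # correlation ambiguity
    return max(0.0, min(1.0, m_abs * m_rel))
```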
(2.) Motion Compensation For Temporal Interpolation
Using the information generated in the Figure 1 motion estimation unit
102, a high quality synthetic frame can be generated for a temporal offset Δt = t_x of the synthetic frame with respect to the frames P_t and P_t+1, using a vector scaling unit 104 and a motion compensation unit 106.
A motion compensated frame can be interpolated at a time t_x between two frames P_t and P_t+1 by first determining an interpolated frame P̂_tx^MCF using the forward motion compensation function:
$$\hat{P}_{t_x}^{MCF}(x,y) = P_t\left(x + V_x \cdot \Delta t,\; y + V_y \cdot \Delta t\right)$$
where "MCF" denotes forward motion compensation; t < x < t+1 ~ x ~ ; and Yx, Yy are the x and y components of the motion vector.
An exemplary use of motion vectors to create a motion compensated frame is shown in Figure 5 with respect to pixel image information which transitions from the location of pixel 502 to pixel 504 over a frame pitch (temporal offset between successive frames) T_f of Δt = 1. Figure 5 also illustrates interpolated frames 506, 508 and 510 formed with temporal offsets of 50% (Δt = 0.5), 20% (Δt = 0.2) and 80% (Δt = 0.8), respectively. Optionally, a backward motion compensation function can be computed in the Figure 1 motion compensation unit 106:
$$\hat{P}_{t_x}^{MCB}(x,y) = P_{t+1}\left(x - V_x \cdot (1-\Delta t),\; y - V_y \cdot (1-\Delta t)\right)$$
where "MCB" denotes backward motion compensation. The forward and backward motion compensated frames can then be combined in motion compensation unit 106 to create a first synthesized image as an interpolated, motion compensated frame tt (x,y) using a blending function:
* ' (1 - Δt) Δt
The foregoing is one of many blending possibilities.
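The compensation and blend can be sketched in Python as follows (illustrative; the per-pixel vector fields, the nearest-neighbor warp, and edge clamping are simplifying assumptions not taken from the patent):

```python
import numpy as np

# Forward/backward per-pixel motion compensation and the Delta-t blend,
# following the equations above. vx, vy are per-pixel motion vector fields.
def motion_compensated_frame(p_t, p_t1, vx, vy, dt):
    h, w = p_t.shape
    ys, xs = np.indices((h, w))
    # Forward compensation: sample P_t displaced by +V*dt.
    fy = np.clip(np.round(ys + vy * dt).astype(int), 0, h - 1)
    fx = np.clip(np.round(xs + vx * dt).astype(int), 0, w - 1)
    p_mcf = p_t[fy, fx]
    # Backward compensation: sample P_t+1 displaced by -V*(1-dt).
    by = np.clip(np.round(ys - vy * (1 - dt)).astype(int), 0, h - 1)
    bx = np.clip(np.round(xs - vx * (1 - dt)).astype(int), 0, w - 1)
    p_mcb = p_t1[by, bx]
    # Blend, weighting the temporally nearer source more heavily.
    return p_mcf * (1 - dt) + p_mcb * dt
```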
In practice, the motion vectors are not all equally valid, and the above function will generate a number of anomalous pixels. To prevent the synthesis of erroneous pixels, the motion vector confidence metric can be used as a weighting value in the generation of motion compensated pixels. That is, the confidence metric can be used as a weighting factor in combining the motion compensated frame P̂_tx^MC with a frame synthesized using an alternate technique.
Figure 2 uses the inputs and outputs of the Figure 1 portion of the frame rate conversion apparatus 100 to produce an alternate synthesized frame that is combined with the motion compensated frame P̂_tx^MC. For example, in addition to the motion compensated frame P̂_tx^MC, a second synthesized image can be generated as a frame P̂_tx^TI. This frame can be generated using an alternate technique, such as simple temporal interpolation (TI), implemented using an interpolated frame unit 202 of Figure 2, where:

$$\hat{P}_{t_x}^{TI}(x,y) = P_t(x,y) \cdot (1-\Delta t) + P_{t+1}(x,y) \cdot \Delta t$$
A final temporally interpolated frame P̂_tx can then be generated in a quality metric blending unit 204 from the motion compensated frame and the simply interpolated frame, using the confidence metric M(x,y) as a weighting function:

$$\hat{P}_{t_x}(x,y) = \hat{P}_{t_x}^{MC}(x,y) \cdot M(x,y) + \hat{P}_{t_x}^{TI}(x,y) \cdot (1 - M(x,y))$$

Figure 6 shows an exemplary 60 fps sequence which has been converted to an interpolated 50 fps sequence using the frame rate converter of Figures 1 and 2. The first and second synthesized images are not limited to being synthesized in the manner described above. For example, the first synthesized image P̂_tx^MC can be generated with any combination of forward or backward motion estimation, and can be generated using only forward motion compensation or using only backward motion compensation. The second image can be generated using any alternate synthesizing technique.
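A short sketch of this quality-metric blend (illustrative Python; m is assumed to be a per-pixel confidence array in 0..1):

```python
def blend_interpolated_frame(p_mc, p_ti, m):
    """Final frame: confidence-weighted mix of the motion compensated
    frame and the simple temporal interpolation (the fallback)."""
    return p_mc * m + p_ti * (1.0 - m)
```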
b. Frame Rate Conversion Of Interlaced Images
(1.) Motion Estimation
Frame rate conversion with interlaced frames can alternately, or additionally, be performed by the frame rate conversion apparatus of Figures 1 and 2 by synthesizing an even field at a time t_x and by synthesizing an odd field temporally aligned with the synthesized even field. Motion vectors between even fields and motion vectors between odd fields can be produced in the same manner that motion vectors are produced for non-interlaced images. That is, assuming that the sequence of images includes interlaced even and odd fields, two consecutive even fields can be analyzed in exactly the same manner described with respect to two consecutive non-interlaced frames to calculate motion vectors and confidence metrics for each pixel. Similarly, two consecutive odd fields can be analyzed to determine motion vectors and confidence metrics.
(2.) Motion Compensation
Having defined motion vectors between two consecutive even fields, and/or motion vectors between two consecutive odd fields, motion compensation can be used to produce a synthesized image at any location between the even and odd fields, using the motion vectors between the even fields and/or the odd fields.
For example, a synthesized even field can be produced by first generating a forward motion compensated field with the function:
$$\hat{F}_{t_x}^{MCF}(x,y) = F_t\left(x + V_x \cdot \Delta t,\; y + V_y \cdot \Delta t\right)$$
and then generating a backward motion compensated field with the function:
$$\hat{F}_{t_x}^{MCB}(x,y) = F_{t+1}\left(x - V_x \cdot (1-\Delta t),\; y - V_y \cdot (1-\Delta t)\right)$$
where Δt is any fraction. The two fields can then be blended:

$$\hat{F}_{t_x}^{MC}(x,y) = \hat{F}_{t_x}^{MCF}(x,y) \cdot (1-\Delta t) + \hat{F}_{t_x}^{MCB}(x,y) \cdot \Delta t$$

The motion compensated field can then be blended with an alternatively generated field F̂_tx^TI, using the motion vector confidence metric M, to create the synthesized (i.e., temporally interpolated) even field:

$$\hat{F}_{t_x}(x,y) = \hat{F}_{t_x}^{MC}(x,y) \cdot M(x,y) + \hat{F}_{t_x}^{TI}(x,y) \cdot (1 - M(x,y))$$
This process can then be repeated to synthesize an odd field temporally aligned with the synthesized even field.
Referring to Figure 7, the preceding process generates a synthesized even field 706 at the time tx, where t < tx < t+1, with Δt equal to the fractional position of tx between the fields used for motion estimation, which are at the same phase (e.g., even field), i.e., Ft and Ft+1, represented as fields 702 and 704 in Figure 7. By repeating the process with respect to two sequential odd fields, an opposite phase field (e.g., odd field 708) can be generated at the same time tx by performing the motion estimation on fields Ft and Ft+2 (labeled 710 and 712) with Δt = (tx − t)/2. The two fields collectively constitute a progressive frame 714 at time tx, as illustrated in Figure 7. The progressive frame can then be further processed using standard image processing techniques, such as spatial filtering and resampling for conversion between NTSC and PAL video standards, for example.
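The final assembly step can be sketched as follows (illustrative only), assuming the two synthesized fields are NumPy arrays of equal size; interleaving their scan lines yields the progressive frame:

    import numpy as np

    def weave(even_field, odd_field):
        # interleave an even field and a temporally aligned odd field
        h, w = even_field.shape
        frame = np.empty((2 * h, w), dtype=even_field.dtype)
        frame[0::2, :] = even_field  # even scan lines
        frame[1::2, :] = odd_field   # odd scan lines
        return frame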
Thus, an interlaced sequence of images can be converted to a non-interlaced sequence of any desired frame rate. By multiplying the motion vectors by any integer or fraction, any desired temporal and spatial shift of the image information included in the pixels of the reference fields or frames can be synthesized.
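For example, reusing the warp() helper sketched earlier (an assumed helper, not part of the original disclosure), scaling the vectors by an arbitrary fraction before sampling places the synthesized pixels at the corresponding point along the motion path:

    def synthesize_at(field, vx, vy, fraction):
        # scale the motion vectors, then sample the field at the shifted positions
        return warp(field, vx * fraction, vy * fraction, 1.0)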
2. Temporal Interpolation For De-interlacing
Figure 8 shows an apparatus 800 for de-interlacing interlaced video images, such as sequential interlaced fields of a television video signal. Of course, a process as described above can be used to synthesize a missing even or odd field. The Figure 8 apparatus can also be used when it is desired to de-interlace a sequence of images without changing the frame rate of those images. In other words, Δt will always be 1/2. The Figure 8 motion estimation unit 802 processes three source fields labeled Ft−1, Ft and Ft+1, representing three consecutive source fields (e.g., two fields of even numbered scan lines of a video frame and one field of odd numbered scan lines of a video frame, or vice versa). As with the frame rate conversion described above, image motion among interlaced video images can be detected in any known fashion. Where consecutive fields provide spatially interlaced pixels, the aforementioned copending U.S. application describes using two different correlation techniques to produce two different correlation surfaces for each pixel in a field of interest: an interframe correlation implemented using an interframe correlation unit, and an intraframe correlation implemented using an intraframe correlation unit. The resultant sets of correlation surfaces are then combined to extract a motion vector and confidence metric for each pixel of the reference field.
The interframe correlation involves two fields which are temporally one frame apart, and therefore spatially aligned (e.g., two successive even fields, or two successive odd fields). The intraframe correlation involves two fields which are temporally one field apart, and therefore spatially nonaligned (e.g., an even field and a successive odd field).
Because the intraframe correlation is performed using two spatially nonaligned fields (e.g., one which includes the even numbered scan lines of the video frame, and another which includes the odd numbered scan lines of the video frame), a vertical interpolation is performed on one of the two fields using a vertical interpolation unit 801. The vertical interpolation unit spatially aligns the scan lines of the two fields, thereby permitting a correlation between the two fields in the intraframe correlation unit.
The vertical interpolation of, for example, the reference field Ft can be performed by simply averaging the pixel values immediately above and below a pixel of interest. Of course, any desired interpolation technique, vertical or otherwise, can be used to fill in pixels of the scan lines needed to establish spatial alignment with the other field or fields to be used in the correlation process. The vertically interpolated (VI) field is designated Ft^VI.
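A minimal sketch of this averaging (illustrative only), assuming the field is a NumPy array of its existing scan lines, estimates each missing line from the pair of lines that straddle it; the edge handling by replication is an assumption:

    import numpy as np

    def vertical_interpolate(field):
        # average each pair of vertically adjacent existing lines to estimate
        # the missing opposite-parity lines; replicate the last line at the edge
        interp = 0.5 * (field[:-1, :] + field[1:, :])
        return np.vstack([interp, field[-1:, :]])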
After calculating the interpolated field, motion estimation can be performed in motion estimation unit 802 using a method and apparatus as described in the copending application. The motion estimation unit performs the intraframe correlation and the interframe correlation. For each pixel of the field Ft, the interframe correlation and the intraframe correlation produce correlation surfaces C^I and C, respectively. The correlation surfaces of the interframe correlation and the intraframe correlation are combined for each pixel into a composite correlation surface. Using the composite correlation surface for a given pixel, an unfiltered motion vector V and an associated confidence metric are extracted using the correlation data for that pixel of the reference field Ft. This process is repeated until an unfiltered motion vector and associated confidence metric have been produced for all pixels. The unfiltered motion vectors for all pixels of the reference field are supplied to a filter to produce filtered motion vectors in a manner already described with respect to progressive images.
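A hedged sketch of the extraction step: take the displacement at the minimum of a composite (2S+1)×(2S+1) surface and rate how distinct that minimum is. The confidence formula below is an illustrative assumption, not the metric defined in the copending application:

    import numpy as np

    def extract_vector(surface, search):
        # the surface is indexed so that its center corresponds to zero motion
        idx = np.argmin(surface)
        j, i = np.unravel_index(idx, surface.shape)
        vx, vy = i - search, j - search
        best = surface.flat[idx]
        second = np.delete(surface.ravel(), idx).min()
        # a distinct minimum (best << second) yields confidence near 1
        confidence = 1.0 - best / (second + 1e-6)
        return vx, vy, float(np.clip(confidence, 0.0, 1.0))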
Correlation data extracted from the correlation surfaces are used by the motion estimation unit 802 to produce a confidence metric M for each pixel in a manner already described with respect to progressive images. Thus, when processing interlaced images, the Figure 8 motion estimation unit 802 outputs a filtered motion vector V(x,y) and a motion vector confidence metric M(x,y) for every pixel (x,y) in the reference field Ft.

a. Intraframe Correlation

The intraframe correlation utilizes the fields Ft^VI and Ft−1, temporally spaced by one field, to produce a first correlation surface for each pixel of the reference field Ft. In an exemplary embodiment, the intraframe correlation process is implemented using pixel correspondence based on a search area ±Sx, ±Sy and a block ±Bx, ±By movable within the search area, as was described with respect to the pixel correspondence approach to correlating progressive frames. The intraframe correlation results in a correlation point for each location of the block center within the search area. Each correlation point is determined as follows:
C(i,j) = Σ over m = −Bx..Bx and n = −By..By of | Ft^VI(x+m, y+n) − Ft−1(x+i+m, y+j+n) |, for each (i,j) within the search area ±Sx, ±Sy
The mapping of all correlation points for each location of the block center in the search area constitutes a correlation surface C for a given pixel in the reference field Ft. Each correlation point on the correlation surface C represents a pixel location in the search area of the field Ft−1. This process is repeated for all pixels in the field of interest.

b. Interframe Correlation

The interframe correlation utilizes spatially aligned fields Ft−1 and Ft+1 that are temporally separated by two fields (i.e., one frame, with a temporal separation of 2Δt). As with the intraframe correlation, the interframe correlation process is implemented using pixel correspondence based on a search area ±Sx, ±Sy and a block ±Bx, ±By movable within the search area, as was described with respect to the pixel correspondence approach to correlating progressive frames. The interframe correlation results in a correlation point for each location of the block center within the search area. Each correlation point is determined as follows:
C^I(i,j) = Σ over m = −Bx..Bx and n = −By..By of | Ft−1(x+m, y+n) − Ft+1(x+2i+m, y+2j+n) |, for each (i,j) within the search area ±Sx, ±Sy
The mapping of all correlation points for each location of the block in the search area constitutes a correlation surface C^I for a given pixel in the reference field Ft. Each correlation point on the correlation surface C^I represents a pixel location in the search area of the field Ft+1. Thus, this process is identical to the intraframe correlation, with the exception that Ft+1 is shifted by twice the vector magnitude (i.e., 2x and 2y) due to the two field temporal separation.
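The two correlations can be sketched with one block-matching routine, assuming a sum-of-absolute-differences measure (an assumption; the copending application defines the actual correlation). Doubling the candidate shift for the interframe case reflects the two-field separation. Pure-Python loops are used for clarity, not speed:

    import numpy as np

    def sad_surface(ref, other, x, y, S, B, shift_scale=1):
        # one (2S+1)x(2S+1) correlation surface for the pixel (x, y) of `ref`
        h, w = ref.shape
        surf = np.empty((2 * S + 1, 2 * S + 1))
        for j in range(-S, S + 1):
            for i in range(-S, S + 1):
                total = 0.0
                for n in range(-B, B + 1):
                    for m in range(-B, B + 1):
                        rx = np.clip(x + m, 0, w - 1)
                        ry = np.clip(y + n, 0, h - 1)
                        ox = np.clip(x + shift_scale * i + m, 0, w - 1)
                        oy = np.clip(y + shift_scale * j + n, 0, h - 1)
                        total += abs(float(ref[ry, rx]) - float(other[oy, ox]))
                surf[j + S, i + S] = total
        return surf

    # intraframe: vertically interpolated reference vs. the adjacent field
    # c_intra = sad_surface(f_t_vi, f_prev, x, y, S, B)
    # interframe: fields one frame apart, candidate shift doubled
    # c_inter = sad_surface(f_prev, f_next, x, y, S, B, shift_scale=2)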
The correlation surface of the intraframe correlation implies the motion vector over a one Δt time increment. A two-pixel shift between fields with a 2Δt separation has the same rate (assuming constant velocity) as a one-pixel shift between fields with a Δt separation. Thus, the image motion implied by the correlation surface C^I has been normalized to the same rate (pixels per Δt) as the correlation surface C, such that these surfaces can be composited. The composite surface can be used to extract the correlation data and derive motion vector information for each pixel of the interlaced reference field Ft.

c. Correlation Compositing

The outputs of the interframe and intraframe correlation are their respective correlation surfaces. The two surfaces Cxy and C^Ixy for every pixel in the reference image can be composited as follows:

C^Cxy = f(Cxy, C^Ixy)
where f is a function such as simple summation, weighted summation, or multiplication, or a combination thereof.
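A one-line sketch of the compositing, using the weighted-summation choice of f (the weight value is illustrative):

    def composite_surface(c_intra, c_inter, weight=1.0):
        # weighted summation form of f; weight = 1.0 reduces to simple summation
        return c_intra + weight * c_inter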
Combining the intraframe and interframe correlations takes advantage of the strengths of each. Intraframe correlation uses two inputs that are temporally one field apart, thus minimizing the effect of acceleration of objects within the image. Interframe correlation uses two unmodified (non-interpolated) inputs, which provides a more accurate correlation.
The results of the correlation compositing for each pixel of the reference field Ft can be used to extract correlation data that, in turn, is used to derive a motion vector and confidence metric for each pixel of the reference field in a manner as described with respect to the processing of progressive images. In addition, filtering of the motion vectors can be performed in a manner as described with respect to that of the Figure 1 motion estimator.

d. Temporal Interpolation For Motion Compensated De-Interlace
The Figure 8 system includes a vector scaling unit 804 and a motion compensation unit 806 to perform temporal interpolation for the de-interlace process. A progressive (non-interlaced) motion picture sequence can be created from an interlaced sequence by synthesizing fields synchronized in time but opposite in phase for each existing field. The process is similar to progressive temporal interpolation: two motion compensated fields are generated.
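A minimal sketch of that field pair, reusing the warp() helper from the earlier frame rate conversion sketch (again an assumed helper): with Δt fixed at 1/2, the previous and following opposite-phase fields are each shifted by half the motion vector and averaged.

    def deinterlace_mc_field(f_prev, f_next, vx, vy):
        mcf = warp(f_prev, vx, vy, 0.5)   # forward by +V/2
        mcb = warp(f_next, vx, vy, -0.5)  # backward by -V/2
        return 0.5 * (mcf + mcb)          # equal blend, since dt = 1/2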
Using the system of Figure 8 to convert an interlaced sequence of images into a non-interlaced sequence, the previous field Ft−1 is motion compensated in motion compensation unit 806, using a temporal offset Δt of 1/2 from vector scaling unit 804, to interpolate the forward motion compensated field:
Ft^MCF(x,y) = Ft−1(x + Vx·(1/2), y + Vy·(1/2))
The following field Ft+1 is motion compensated in motion compensation unit 806 using a temporal offset Δt of 1/2 to interpolate the backward motion compensated field:
Ft^MCB(x,y) = Ft+1(x − Vx·(1/2), y − Vy·(1/2))
These two interpolated fields are blended equally in motion compensation unit 806 to generate the motion compensated field:
Ft^MC(x,y) = (Ft^MCF(x,y) + Ft^MCB(x,y)) / 2
Because the motion compensated pixels are not all of equal confidence, the motion compensated field can be blended with the vertically interpolated field in a quality metric blending unit, such as the blending unit 204 of Figure 2, using the confidence metric M and the function:

Ft^C(x,y) = Ft^MC(x,y)·M(x,y) + Ft^VI(x,y)·(1 − M(x,y))

where the quality metric blending unit has been configured to receive Ft^MC and Ft^VI from Figure 2, rather than the frame inputs from Figure 1. Note that the above function is basically identical to that used to produce the temporally interpolated frame Ptx(x,y), with the exception that a vertically interpolated image Ft^VI is used as the second image, rather than the temporally interpolated image Ptx^TI used previously. However, those skilled in the art will appreciate that a temporally interpolated image, generated in a manner as described with respect to Ptx^TI, but using the image fields F (e.g., fields Ft and Ft+1) as opposed to image frames P, could be used in place of Ft^VI. In addition, instead of using a 1-dimensional vertical interpolation filter followed by a 1-dimensional temporal interpolation filter, a single 2-dimensional vertical-temporal filter can be employed using at least two of the fields (including Ft).
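Stringing the earlier sketches together gives an illustrative per-field de-interlace, with the confidence metric selecting per pixel between the motion compensated estimate and the vertically interpolated fallback; all helper names are assumptions carried over from the previous sketches, and all fields are assumed to be arrays of equal size:

    def deinterlace_field(f_prev, f_t, f_next, vx, vy, m):
        f_mc = deinterlace_mc_field(f_prev, f_next, vx, vy)  # motion compensated estimate
        f_vi = vertical_interpolate(f_t)                     # spatial fallback
        # per-pixel quality-metric blend
        return f_mc * m + f_vi * (1.0 - m)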
It will be appreciated by those skilled in the art that the present invention can be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The presently disclosed embodiments are therefore considered in all respects to be illustrative and not restrictive. The scope of the invention is indicated by the appended claims rather than the foregoing description, and all changes that come within the meaning and range of equivalence thereof are intended to be embraced therein.

Claims

What Is Claimed Is:
1. Method for synthesizing a video image from at least two images in a sequence of video images, the method comprising the steps of: comparing first information obtained from pixels used to represent a first image with second information obtained from pixels used to represent a second image; extracting a motion vector from image motion among the at least two images; producing a measure of confidence of accuracy with which the motion vector is generated; synthesizing a first synthesized image using the motion vector; synthesizing a second synthesized image using at least one of the first information and the second information; and interpolating an image between the first image and the second image by combining the first synthesized image with the second synthesized image using the measure of confidence as a weighting factor.
2. A method according to claim 1, wherein the step of synthesizing a first synthesized image includes a step of: spatially offsetting the first information by the motion vector multiplied by a desired temporal offset.
3. A method according to claim 2, wherein said step of comparing includes a step of: producing a correlation surface representative of image motion between the first image and second image.
4. A method according to claim 3, wherein the step of extracting includes a step of: deriving the motion vector from correlation data included in the correlation surface as a measure of the motion among the at least two images.
5. A method according to claim 1, wherein the motion vector is extracted by comparing the second information with the first information, and the step of synthesizing the first synthesized image includes a step of: spatially offsetting the first information by the motion vector multiplied by a desired temporal offset.
6. A method according to claim 1, wherein the motion vector is extracted by comparing the second information with the first information, and the step of synthesizing the first synthesized image includes a step of: offsetting the second information by the motion vector multiplied by a desired temporal offset.
7. A method according to claim 1, wherein the motion vector is extracted by comparing the first information with the second information, and the step of synthesizing the first synthesized image includes a step of: spatially offsetting the first information by the motion vector multiplied by a desired temporal offset.
8. A method according to claim 1, wherein the motion vector is extracted by comparing the first information with the second information, and the step of synthesizing the first synthesized image includes a step of: offsetting the second information by the motion vector multiplied by a desired temporal offset.
9. Method according to claim 1, wherein the motion vector is extracted by comparing the second information with the first information, and said step of synthesizing a first synthesized image includes a step of: compensating image motion using the first image and the motion vector to produce a forward motion compensated image; compensating image motion using the second image and the motion vector to produce a backward motion compensated image; and blending the forward motion compensated image and the backward motion compensated image to produce a motion compensated image as said first synthesized image (blending of forward and backward motion estimation).
10. Method according to claim 1, wherein the step of synthesizing a second synthesized image includes a step of: combining the first information and the second information to produce a temporally interpolated image.
11. A method according to claim 10, wherein the first synthesized image is a synthesized motion compensated image, the method comprising the step of: combining the synthesized motion compensated image and the interpolated image to produce a synthesized video image (Ptx).
12. A method according to claim 11, wherein the first and second images are non-interlaced images.
13. Method according to claim 11, wherein the first and second images are interlaced images.
14. Method according to claim 1, wherein the images are interlaced images, the method comprising the steps of: producing a vertically interpolated image from the second image; comparing the vertically interpolated image with the first image to produce a first correlation surface; comparing the first image with a third image to produce a second correlation surface; combining the first and second correlation surfaces into a composite correlation surface; and extracting the motion vector from the composite correlation surface to produce a de-interlaced image.
15. Method according to claim 14, comprising the step of: producing said measure of confidence using the composite correlation surface.
16. Method according to claim 14, wherein the motion vector is extracted by comparing the second information with the first information, and said step of synthesizing a first synthesized image includes a step of: compensating image motion using the first image and the motion vector to produce a forward motion compensated image; compensating image motion using the second image and the motion vector to produce a backward motion compensated image; and blending the forward motion compensated image and the backward motion compensated image to produce a motion compensated image as said first synthesized image.
17. Method according to claim 16, wherein the step of synthesizing a second synthesized image includes a step of: combining the first information and the second information to produce a temporally interpolated image.
18. A method according to claim 17, wherein the first synthesized image is a synthesized motion compensated image, the method comprising the step of: combining the synthesized motion compensated image and the interpolated image to produce a synthesized video image.