WO2004013810A1 - System and method for segmenting - Google Patents

System and method for segmenting

Info

Publication number
WO2004013810A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixels
blocks
image
group
motion
Prior art date
Application number
PCT/IB2003/003185
Other languages
French (fr)
Inventor
Christiaan Varekamp
Matthijs C. Piek
Ralph A. C. Braspenning
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Priority to US10/522,469 (patent US7382899B2)
Priority to AU2003247051A (patent AU2003247051A1)
Priority to JP2004525661A (patent JP2005535028A)
Priority to EP03766538A (patent EP1527416A1)
Publication of WO2004013810A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20021 Dividing image into blocks, subimages or windows

Definitions

  • the invention relates to a method of segmenting a first image feature in a first video image from an adjacent second image feature in the first video image.
  • the invention further relates to a segmentation system for segmenting a first image feature in a first video image from an adjacent second image feature.
  • the invention further relates to a motion estimator at pixel resolution for estimating a motion vector field, comprising such a segmentation system.
  • the invention further relates to an image processing apparatus comprising:
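  • receiving means for receiving a signal representing a series of video images;
  • such a motion estimator at pixel resolution for estimating a motion vector field from the video images; and
  • a motion compensated image processing unit for determining processed images on basis of the video images and the motion vector field.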
  • An embodiment of the method of the kind described in the opening paragraph is known from US patent 6,075,875.
  • This method includes obtaining a motion representation, e.g. motion vectors, of corresponding pixels in a selected video image and a preceding video image to form motion-segmented video image features.
  • Video image features are also segmented according to their spatial image characteristics to form spatially-segmented video image features.
  • the video image features are jointly segmented as a weighted combination of the motion-segmented video image features and the spatially-segmented video image features.
  • a disadvantage of the method according to the prior art is the relatively complex method of combining the motion-segmented video image features and the spatially-segmented video image features. Notice that two different quantities, i.e. a motion based quantity and a spatial image characteristic, are weighted. Weighting different quantities is unusual. It is an object of the invention to provide a method of the kind described in the opening paragraph which is pixel accurate and relatively easy.
  • This object of the invention is achieved in that the method of segmenting a first image feature in a first video image from an adjacent second image feature in the first video image, the first image feature having plural pixels with respective values of an image property substantially being in a first range of values and having motion relative to the second image feature between the first video image and a second video image, and the second image feature having plural pixels with respective values of the image property substantially being in a second range of values being different from the first range of values, comprises;
  • An advantage of the method according to the invention is that the two segmentation operations are applied on the appropriate locations.
  • the result of the block-based motion segmentation is used as input for the pixel-based segmentation on basis of the image property.
  • the pixel-based segmentation is applied to recover object boundaries, i.e. image feature boundaries on a small subset of the video image, i.e. in the blocks at the border between the first and second group of connected blocks of pixels.
  • This is an efficient strategy resulting in efficient computing resource usage, since the detailed pixel-based segmentation process is only applied to the blocks at the boundaries and not to all blocks.
  • the block-based motion segmentation and the pixel-based segmentation are applied sequentially: No complex weighting factors are required to tune between the results of the segmentation approaches as in the prior art.
  • segmenting the first video image into the first group of connected blocks of pixels and the second group of connected blocks of pixels is based on a motion model.
  • the advantage of this embodiment is that it is a robust method. Relatively much data, i.e. most or all motion vectors of test groups of connected blocks for the first group of connected blocks of pixels are applied to estimate the appropriate configuration of blocks which belong to the first image feature. In other words it is a type of region fitting. Many segmentation methods only take blocks at the border of a segment into account to decide whether the blocks correspond to the particular segment or not.
  • the motion model might be based on rotation and/or translation. However preferably the segmenting is based on an affine motion model.
  • segmenting the first video image into the first group of connected blocks of pixels and the second group of connected blocks of pixels comprises:
  • An advantage of this embodiment according to the invention is that it allows for a scanning approach. That means that block by block, e.g. starting at the top left and moving towards the bottom right, each block is tested iteratively. Testing means that for each block it is evaluated with which other blocks it should be merged to eventually form the group of blocks corresponding to one of the image features.
  • the advantage of the scanning approach is that the method can be implemented relatively easily.
  • the pixel-based segmentation is based on a spatial color model.
  • the image property on which the pixel-based segmentation is based is color. Color is a relatively good cue for segmentation.
  • the two segmentation approaches are applied at the appropriate scale resolution: color is applied for high frequencies and motion segmentation for low frequencies, since color differences are caused by texture whereas motion differences are caused by different velocities of objects in the scene being captured.
  • the pixel-based segmentation is based on a spatial luminance model.
  • the image property on which the pixel-based segmentation is based is luminance.
  • Luminance is a relatively good cue for segmentation.
  • a step in values of the image property is detected in a first block of the portion of the blocks of pixels of the first group of connected blocks of pixels, which have been determined to be positioned at the border between the first and second group of connected blocks of pixels.
  • a step in values corresponds to the edge of the first image feature.
  • the step is detected by means of:
  • the advantage of this embodiment is that it is a robust method. Relatively much data, i.e. most or all pixels of the first and second block of pixels are applied to estimate the appropriate configuration of pixels which belong to the first image feature. This is a so-called region fitting approach.
  • the segmentation system for segmenting a first image feature in a first video image from an adjacent second image feature in the first video image, the first image feature having plural pixels with respective values of an image property substantially being in a first range of values and having motion relative to the second image feature between the first video image and a second video image, and the second image feature having plural pixels with respective values of the image property substantially being in a second range of values being different from the first range of values, comprises:
  • dividing means for dividing the first video image into blocks of pixels;
  • a motion segmentation unit for segmenting the first video image into a first group of connected blocks of pixels and a second group of connected blocks of pixels by classifying the blocks of pixels on basis of the motion vectors of the respective blocks of pixels;
  • a pixel-based segmentation unit for segmenting the first image feature from the second image feature by means of a pixel-based segmentation of a portion of the blocks of pixels of the first and second group of connected blocks of pixels, which have been determined to be positioned at a border between the first and second group of connected blocks of pixels, on basis of the respective values of the image property.
  • the pixel accurate motion estimator comprises the segmentation system as claimed in claim 9.
  • the pixel accurate motion estimator is provided with results being calculated by the segmentation system. It is advantageous to apply an embodiment of the pixel accurate motion estimator according to the invention in an image processing apparatus as described in the opening paragraph.
  • the image processing apparatus may comprise additional components, e.g. a display device for displaying the processed images or storage means for storage of the processed images.
  • the motion compensated image processing unit might support one or more of the following types of image processing:
  • De-interlacing: interlacing is the common video broadcast procedure for transmitting the odd or even numbered image lines alternately. De-interlacing attempts to restore the full vertical resolution, i.e. make odd and even lines available simultaneously for each image;
  • Up-conversion: from a series of original input images a larger series of output images is calculated. Output images are temporally located between two original input images;
  • Temporal noise reduction: this can also involve spatial processing, resulting in spatial-temporal noise reduction; and
  • Video compression, i.e. encoding or decoding, e.g. according to the MPEG standard or H26L standard.
  • Modifications of the method and variations thereof may correspond to modifications and variations thereof of the segmentation system, the pixel accurate motion estimator and of the image processing apparatus described.
  • Fig. 1 schematically shows an embodiment of the segmentation system according to the invention
  • Fig. 2A schematically shows an initial configuration of blocks for the block-based motion segmentation
  • Fig. 2B schematically shows an updated configuration of blocks of Fig. 2A after one modification
  • Fig. 2C schematically shows a result configuration of blocks of the block-based motion segmentation
  • Fig. 3A schematically shows input blocks for the pixel-based segmentation
  • Fig. 3B schematically shows an initial edge for a horizontal block pair for the pixel-based segmentation
  • Fig. 3C schematically shows the detected edge for the horizontal block pair of Fig. 3B
  • Fig. 3D schematically shows an initial edge for a vertical block pair for the pixel-based segmentation
  • Fig. 3E schematically shows the finally detected edge
  • Fig. 4A schematically shows the inputs and output of a motion estimator for estimating a pixel accurate motion vector field
  • Fig. 4B schematically shows the inputs and output of an alternative motion estimator for estimating a pixel accurate motion vector field
  • Fig. 5 schematically shows elements of an image processing apparatus according to the invention.
  • Fig. 1 schematically shows an embodiment of the segmentation system 100 according to the invention.
  • the segmentation system 100 is arranged to segment a first image feature 214 in a first video image from an adjacent second image feature 216 in the first video image.
  • the first image feature 214 has plural pixels with respective color values substantially being in a first range of values and the second image feature 216 has plural pixels with respective color values substantially being in a second range of values being different from the first range of values.
  • the first image feature 214 has motion relative to the second image feature 216 between the first video image and a second video image.
  • the second video image might be succeeding or preceding the first video image.
  • the segmentation system 100 receives a signal representing video images at the input connector 108 and provides a pixel accurate segmentation result S p at the output connector 110.
  • the segmentation system 100 comprises;
  • a block-based motion estimator 102 for estimating motion vectors.
  • This motion estimator 102 is arranged to estimate motion vectors 218-230 for the respective blocks in which the first video image has been divided by the splitting unit 103.
  • This dividing means that the pixels of the first video image are clustered into blocks of pixels.
  • a block of pixels comprises 8*8 pixels.
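The dividing step can be illustrated with a minimal Python sketch. The function name `split_into_blocks` and the clipping of partial blocks at the right and bottom image borders are assumptions made for illustration; the text only states that a block comprises 8*8 pixels.

```python
def split_into_blocks(height, width, block_size=8):
    """Cluster the pixels of an image of the given size into blocks.

    Returns a dict mapping (block_row, block_col) to the pixel rectangle
    (top, left, bottom, right), with bottom and right exclusive.
    Assumption: blocks that overlap the image border are clipped.
    """
    blocks = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            bottom = min(top + block_size, height)
            right = min(left + block_size, width)
            blocks[(top // block_size, left // block_size)] = (top, left, bottom, right)
    return blocks


# A 16x24 image yields 2 rows and 3 columns of 8*8 blocks.
blocks = split_into_blocks(16, 24)
```

Each block would then receive one motion vector from the block-based motion estimator 102.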
  • the motion estimator 102 is as described in the article "True-Motion Estimation with 3-D Recursive Search Block Matching" by G. de Haan et al. in IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 5, October 1993, pages 368-379.
  • the motion estimator 102 provides a block-based motion vector field M B to;
  • a motion segmentation unit 104 which is arranged to segment the first video image into a first group of connected blocks of pixels 204C (see Fig. 2C) and a second group of connected blocks of pixels 206C (see Fig. 2C) by classifying the blocks of pixels on basis of the motion vectors 218-230 of the respective blocks of pixels.
  • the segmentation unit 104 provides a block-based segmentation result S s to;
  • a pixel-based segmentation unit 106 which is arranged to segment the first image feature 214 from the second image feature 216 by means of a pixel-based segmentation of a portion of the blocks 302-306 of pixels of the first and second group of connected blocks of pixels 204, 206, which have been determined to be positioned at a border between the first and second group of connected blocks of pixels 204, 206.
  • the pixel-based segmentation unit 106 is designed to perform the segmentation on basis of color values of pixels.
  • the behavior of the motion segmentation unit 104 will be described in more detail in connection with Figs. 2A-2C and the behavior of the pixel-based segmentation unit 106 will be described in more detail in connection with Figs. 3A-3E.
  • the block-based motion estimator 102, the motion segmentation unit 104 and the pixel-based segmentation unit 106 of the segmentation system 100 may be implemented using one processor. Normally, these functions are performed under control of a software program product. During execution, normally the software program product is loaded into a memory, like a RAM, and executed from there. The program may be loaded from a background memory, like a ROM, hard disk, or magnetic and/or optical storage, or may be loaded via a network like the Internet. Alternatively an application specific integrated circuit provides the disclosed functionality.
  • In Fig. 2A a motion vector field 200 is depicted.
  • the motion vector field 200 comprises a number of motion vectors 218-230 for the respective blocks of pixels.
  • Fig. 2A schematically shows an initial configuration of blocks for the block-based motion segmentation.
  • the number of initial groups of connected blocks corresponds to the maximum number of image features, i.e. objects, which one wants to segment in the video image. In this exemplary case each initial group of connected blocks of pixels 202A-212A comprises 5*5 blocks. All blocks of pixels of a particular initial group of connected blocks have the same classification label. By means of a scanning approach all blocks will be evaluated. That means that appropriate clusters or groups of connected blocks of pixels 202C-206C (see Fig. 2C) are determined.
  • Such a group of connected blocks of pixels 202C-206C corresponds to one of the image features 214, 216 in the first video image. Finding the appropriate groups is achieved in a number of iterations wherein for each block it is tested to which group of connected blocks of pixels 202C-206C it belongs. That might be in a fixed number of iterations. Alternatively the process of evaluating the various intermediate groups of connected blocks 202B-212B is stopped if modifications of intermediate groups of connected blocks 202B-212B do not have any positive effect. The process of evaluating is bounded by a topological constraint: only blocks at the border of a connected group of blocks 202A-212A, 202B-212B are tested.
  • the evaluation of a particular block of pixels 232 comprises a number of steps. This is described for the particular block of pixels 232 during the first scan (see Fig. 2A):
  • creating a first initial group of connected blocks of pixels 204A comprising the particular block of pixels 232;
  • the match error is based on the Euclidean distance between the motion vectors;
  • calculating a second motion model for a test group of connected blocks of pixels 204B, based on the first initial group of connected blocks of pixels 204A, but excluding the particular block of pixels 232 (see 204B in Fig. 2B);
  • the first initial group of blocks 204A is adapted into an intermediate group of blocks 204B excluding the particular block 232 (See Fig. 2B).
  • additional match errors are calculated and taken into account to decide to which of the initial groups of blocks 202A-212A the particular block of pixels 232 belongs to.
  • a third match error is calculated for a second initial group of connected blocks of pixels 202A as depicted in Fig. 2A and a fourth match error for a second test group of connected blocks of pixels 202B as depicted in Fig. 2B.
  • the test whether the particular block of pixels 232 matches better with the first initial group of connected blocks of pixels 204A or with the neighboring group of connected blocks of pixels 202A can also be performed by evaluating it against the first initial group of connected blocks of pixels 204A and against the neighboring group of connected blocks of pixels 202A as they are, i.e. without adapting these groups by moving the particular block of pixels 232 from the first initial group of connected blocks of pixels 204A to the neighboring group of connected blocks of pixels 202A. In that case motion models are calculated for both groups of connected blocks.
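The test described above can be sketched in Python under stated assumptions. The sketch fits an affine motion model to each group by least squares and takes the match error as the Euclidean distance between a block's estimated motion vector and the vector predicted by the model. The function names, the data layout (a dict from block position to motion vector) and the rule "assign to the group with the smaller match error" are illustrative choices, not taken from the text.

```python
import math


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate block layout: cannot fit an affine model")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                f = a[row][col] / a[col][col]
                a[row] = [x - f * y for x, y in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(group):
    """Least-squares affine model for a group {(x, y): (vx, vy)}.

    Returns ((a0, a1, a2), (b0, b1, b2)) with vx = a0 + a1*x + a2*y and
    vy = b0 + b1*x + b2*y. Needs at least three non-collinear block positions.
    """
    basis = [(1.0, float(x), float(y)) for (x, y) in group]
    normal = [[sum(b[i] * b[j] for b in basis) for j in range(3)] for i in range(3)]
    params = []
    for comp in range(2):
        rhs = [sum(b[i] * v[comp] for b, v in zip(basis, group.values())) for i in range(3)]
        params.append(tuple(solve3(normal, rhs)))
    return tuple(params)


def match_error(model, position, vector):
    """Euclidean distance between an estimated vector and the model's prediction."""
    (a0, a1, a2), (b0, b1, b2) = model
    x, y = position
    return math.hypot(vector[0] - (a0 + a1 * x + a2 * y),
                      vector[1] - (b0 + b1 * x + b2 * y))


def better_group(position, vector, group_a, group_b):
    """Assumed decision rule: pick the group whose model predicts the block best."""
    err_a = match_error(fit_affine(group_a), position, vector)
    err_b = match_error(fit_affine(group_b), position, vector)
    return "a" if err_a <= err_b else "b"


# Hypothetical data: group_a moves right, group_b moves up-left.
group_a = {(0, 0): (2.0, 0.0), (1, 0): (2.0, 0.0), (0, 1): (2.0, 0.0), (1, 1): (2.0, 0.0)}
group_b = {(3, 0): (-1.0, 1.0), (4, 0): (-1.0, 1.0), (3, 1): (-1.0, 1.0), (4, 1): (-1.0, 1.0)}
choice = better_group((2, 0), (-1.0, 1.0), group_a, group_b)
```

Because each model is fitted to all motion vectors of a group rather than only to those at its border, a single outlier vector has limited influence on the decision, which is the robustness argument made for region fitting.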
  • Fig. 2B schematically shows an updated configuration of blocks for the block-based motion segmentation after one modification. This modification can be executed directly or after all relevant blocks of the motion vector field have been evaluated.
  • the blocks are positioned at the border of a group of connected blocks of pixels. It is shown that a particular block of pixels 232 is moved from a first initial group of connected blocks of pixels 204A to a horizontally positioned neighboring group of blocks 202A. However a movement in vertical direction is also possible, e.g. a movement of another block of pixels from the first initial group of blocks 204A to the group of blocks of pixels 210A beneath.
  • a number of final groups of blocks of pixels 202C-206C is found. These final groups of blocks of pixels 202C-206C are the output of the block-based motion segmentation. Fig. 2C schematically shows this output, i.e. the result configuration of the block-based motion segmentation. This output is provided to the pixel-based segmentation unit 106 for further processing: detecting the actual borders of the first image feature 214 and the second image feature 216.
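Selecting the blocks that are handed to the pixel-based segmentation can be sketched as follows. The function name `border_blocks` and the use of 4-connected neighbors are assumptions for illustration; the text only requires blocks positioned at a border between two groups.

```python
def border_blocks(labels):
    """Return the (row, col) positions of blocks lying at a border between groups.

    labels is a 2-D list holding the group label of every block.
    Assumption: two blocks are neighbors when they share an edge.
    """
    rows, cols = len(labels), len(labels[0])
    border = set()
    for r in range(rows):
        for c in range(cols):
            # Checking only the right and lower neighbor visits each pair once.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols and labels[nr][nc] != labels[r][c]:
                    border.add((r, c))
                    border.add((nr, nc))
    return border


# Two groups separated by a vertical border between block columns 1 and 2.
labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
]
selected = border_blocks(labels)
```

Only the selected blocks need the detailed pixel-based processing, which is where the claimed saving in computing resources comes from.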
  • the block-based motion segmentation result for one image pair e.g. the first image and the second image is applied for initialization of the block-based motion segmentation of a succeeding image pair.
  • Fig. 3A schematically shows input blocks 302-306 for the pixel-based segmentation.
  • the motion segmentation result indicates that one of the blocks belongs to a first image feature 214 and is labeled as such with 1, and that two of the blocks 304 and 306 belong to a second image feature 216 and are labeled as such with 2. Pairs of neighboring blocks are evaluated to find the actual edge of the first image feature 214.
  • Fig. 3B schematically shows an initial edge 308 for a horizontal pair of neighboring blocks 302 and 304.
  • the evaluation is done by means of an iterative process. That might be in a fixed number of iterations. Alternatively the process of evaluating the various intermediate groups of connected pixels is stopped if modifications of intermediate groups of connected pixels do not have any positive effect.
  • the evaluation for a particular pixel comprises the following steps:
  • the eventual first group of pixels comprises pixels which correspond to the first image feature and the eventual second group of pixels comprises pixels which correspond to the second image feature.
  • Fig. 3C schematically shows the detected edge 308 for the horizontal block pair of Fig. 3B.
  • Fig. 3D schematically shows an initial edge 310 for this vertically positioned pair of neighboring blocks 302, 306.
  • Fig. 3E schematically shows the finally detected edge 310. Notice that the evaluation of the horizontally positioned pair of neighboring blocks 302, 304 and the vertically positioned pair of neighboring blocks 302, 306 might be performed simultaneously. Although only color is discussed, it should be noted that other image properties can be applied for the pixel-based segmentation, e.g. luminance, or a combination of color and luminance, or derived properties such as differences in color/luminance values between neighboring pixels.
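The individual steps of the pixel evaluation are not reproduced in this text, so the following Python sketch is only an illustration of the region fitting idea for a horizontal block pair, not the procedure of the invention. It assumes scalar luminance values, one edge position per pixel row, a mean-value model per region, and an exhaustive search instead of the iterative update; all names are hypothetical.

```python
def fit_step(values):
    """Find the split index k that best separates a pixel row into two regions.

    values[:k] is assigned to the first image feature and values[k:] to the
    second. The cost is the summed squared deviation of all pixels from the
    mean of their own region, so every pixel of the row contributes.
    """
    best_k, best_cost = None, float("inf")
    for k in range(1, len(values)):
        left, right = values[:k], values[k:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        cost = (sum((v - mean_l) ** 2 for v in left)
                + sum((v - mean_r) ** 2 for v in right))
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k


def refine_edge(rows):
    """Edge position for every pixel row of a horizontal pair of blocks."""
    return [fit_step(row) for row in rows]


# One row across two 8-pixel-wide blocks: the block border is at index 8,
# but the luminance step, i.e. the actual edge, lies at index 5.
row = [10, 12, 11, 10, 11, 200, 198, 201, 199, 200, 202, 200, 199, 201, 200, 198]
edge = fit_step(row)
```

A vertical block pair could be handled the same way by applying `fit_step` to pixel columns instead of rows.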
  • Fig. 4A schematically shows the inputs and output of a motion estimator 400 for estimating a pixel accurate motion vector field.
  • a motion vector field M B as being calculated by the block-based motion estimator 102 as described in connection with Fig. 1 is provided at input connector 404.
  • the motion models as being calculated by the motion segmentation unit 104 are also provided. Because of the block resolution of the motion vector field M B the motion vectors 218-230 are not correct for all pixels of the first video image.
  • a pixel accurate segmentation result S p is provided by the segmentation system 100 as described in connection with Fig. 1.
  • motion vectors are assigned to these respective pixels which correspond to motion vectors being estimated for neighboring blocks.
  • the selection of the appropriate neighboring block and thus motion vector is determined by the segmentation result S p.
  • to a portion of the pixels a motion vector is assigned which is equal to a motion vector of a block corresponding to a first image feature 214, and to another portion of pixels another motion vector is assigned which is equal to a motion vector of a block corresponding to a second image feature 216 being adjacent to the first image feature 214.
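A minimal Python sketch of this assignment is given below. The function name, the list-based data layout, the candidate order (own block first, then 4-connected neighboring blocks) and the fallback to the block's own vector are assumptions made for illustration.

```python
def pixel_vector_field(pixel_labels, block_labels, block_vectors, block_size=8):
    """Assign to every pixel the motion vector of a block with a matching label.

    pixel_labels: pixel accurate segmentation result, one label per pixel.
    block_labels / block_vectors: block-based segmentation result and
    block-based motion vector field, one entry per block.
    """
    height, width = len(pixel_labels), len(pixel_labels[0])
    rows, cols = len(block_labels), len(block_labels[0])
    field = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            br, bc = y // block_size, x // block_size
            # Assumed fallback: keep the vector of the pixel's own block.
            field[y][x] = block_vectors[br][bc]
            candidates = [(br, bc), (br, bc - 1), (br, bc + 1), (br - 1, bc), (br + 1, bc)]
            for r, c in candidates:
                if 0 <= r < rows and 0 <= c < cols and block_labels[r][c] == pixel_labels[y][x]:
                    field[y][x] = block_vectors[r][c]
                    break
    return field


# Two 2x2 blocks side by side; the true edge lies one pixel right of the block border.
pixel_labels = [[1, 1, 1, 2],
                [1, 1, 1, 2]]
block_labels = [[1, 2]]
block_vectors = [[(3, 0), (0, 0)]]
field = pixel_vector_field(pixel_labels, block_labels, block_vectors, block_size=2)
```

In this example the pixels in column 2 lie in the block labeled 2 but belong to feature 1, so they take the vector of the neighboring block labeled 1.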
  • Fig. 4B schematically shows the inputs and output of an alternative motion estimator 401 for estimating a pixel accurate motion vector field.
  • Video images are provided at the input connector 402 and at another input connector 406 of the motion estimator 401 also a pixel accurate segmentation result S p is provided by the segmentation system 100 as described in connection with Fig. 1.
  • the pixel accurate motion estimator 401 is arranged to calculate motion vectors for the groups of pixels corresponding to the respective image features 214, 216 as segmented by the segmentation system 100 by means of comparing the pixels of these groups of pixels with corresponding pixels of preceding or succeeding video images. Preferably the comparing is based on match errors corresponding to the sum of absolute pixel value differences.
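The match error based on the sum of absolute pixel value differences can be sketched as follows. The function names, the convention that the vector points from a pixel in the current image to its counterpart in the reference image, the clamping at image borders and the candidate-list search are assumptions for illustration.

```python
def sad_match_error(current, reference, pixels, vector):
    """Sum of absolute differences for a group of pixels displaced by vector.

    current, reference: 2-D lists of pixel values (e.g. luminance).
    pixels: iterable of (x, y) positions of the group in the current image.
    vector: (dx, dy) displacement towards the reference image.
    """
    dx, dy = vector
    height, width = len(reference), len(reference[0])
    total = 0
    for x, y in pixels:
        # Assumption: positions outside the reference image are clamped.
        rx = min(max(x + dx, 0), width - 1)
        ry = min(max(y + dy, 0), height - 1)
        total += abs(current[y][x] - reference[ry][rx])
    return total


def best_vector(current, reference, pixels, candidates):
    """Pick the candidate vector with the lowest match error for the group."""
    return min(candidates, key=lambda v: sad_match_error(current, reference, pixels, v))


# The bright feature sits at x = 1..2 in the current image and at x = 2..3
# in the reference image.
current = [[0, 9, 9, 0, 0, 0]]
reference = [[0, 0, 9, 9, 0, 0]]
group = [(1, 0), (2, 0)]
vector = best_vector(current, reference, group, [(-1, 0), (0, 0), (1, 0)])
```

Since the error is accumulated only over the pixels of one segmented group, pixels of the adjacent image feature do not disturb the estimate.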
  • Fig. 5 schematically shows elements of an image processing apparatus 500 according to the invention.
  • the image processing apparatus 500 comprises:
  • the signal may be a broadcast signal received via an antenna or cable but may also be a signal from a storage device like a VCR (Video Cassette Recorder) or Digital Versatile Disk (DVD).
  • the signal is provided at the input connector 510.
  • a processing unit 504 comprising a segmentation system 100 and a motion estimator 401 as described in connection with Fig. 1 and Fig. 4A, respectively;
  • a motion compensated image processing unit 506; and
  • This display device 508 is optional.
  • the motion compensated image processing unit 506 requires images and motion vectors as its input.
  • the motion compensated image processing unit 506 might support one or more of the following types of image processing: de-interlacing; up-conversion; temporal noise reduction; and video compression.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Analysis (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

A segmentation system (100) for segmenting a first image feature (214) in a first video image from an adjacent second image feature (216) in the first video image on basis of motion and of an image property like color or luminance. The segmentation system (100) comprises: a block-based motion estimator (102) for estimating motion vectors (218-230) for blocks of pixels; a motion segmentation unit (104) for segmenting the first video image into a first group of connected blocks of pixels (204) and a second group of connected blocks of pixels (206) on basis of the motion vectors (218-230) of the respective blocks of pixels; and a pixel-based segmentation unit (106) for segmenting the first image feature (214) from the second image feature (216) by means of a pixel-based segmentation of a portion of the blocks of pixels of the first and second group of connected blocks of pixels (204 and 206), on basis of the respective values of the image property.

Description

SYSTEM AND METHOD FOR SEGMENTING
The invention relates to a method of segmenting a first image feature in a first video image from an adjacent second image feature in the first video image.
The invention further relates to a segmentation system for segmenting a first image feature in a first video image from an adjacent second image feature. The invention further relates to a motion estimator at pixel resolution for estimating a motion vector field, comprising such a segmentation system.
The invention further relates to an image processing apparatus comprising:
- receiving means for receiving a signal representing a series of video images;
- such a motion estimator at pixel resolution for estimating a motion vector field from the video images; and
- a motion compensated image processing unit for determining processed images on basis of the video images and the motion vector field.
An embodiment of the method of the kind described in the opening paragraph is known from US patent 6,075,875. This method includes obtaining a motion representation, e.g. motion vectors, of corresponding pixels in a selected video image and a preceding video image to form motion-segmented video image features. Video image features are also segmented according to their spatial image characteristics to form spatially-segmented video image features. Finally the video image features are jointly segmented as a weighted combination of the motion-segmented video image features and the spatially-segmented video image features.
A disadvantage of the method according to the prior art is the relatively complex method of combining the motion-segmented video image features and the spatially-segmented video image features. Notice that two different quantities, i.e. a motion based quantity and a spatial image characteristic, are weighted. Weighting different quantities is unusual.
It is an object of the invention to provide a method of the kind described in the opening paragraph which is pixel accurate and relatively easy.
This object of the invention is achieved in that the method of segmenting a first image feature in a first video image from an adjacent second image feature in the first video image, the first image feature having plural pixels with respective values of an image property substantially being in a first range of values and having motion relative to the second image feature between the first video image and a second video image, and the second image feature having plural pixels with respective values of the image property substantially being in a second range of values being different from the first range of values, comprises;
- dividing the first video image into blocks of pixels;
- estimating motion vectors for the respective blocks of pixels;
- segmenting the first video image into a first group of connected blocks of pixels and a second group of connected blocks of pixels by classifying the blocks of pixels on basis of the motion vectors of the respective blocks of pixels; and succeeded by
- segmenting the first image feature from the second image feature by means of a pixel-based segmentation of a portion of the blocks of pixels of the first and second group of connected blocks of pixels, which have been determined to be positioned at a border between the first and second group of connected blocks of pixels, on basis of the respective values of the image property.
An advantage of the method according to the invention is that the two segmentation operations are applied on the appropriate locations. The result of the block-based motion segmentation is used as input for the pixel-based segmentation on basis of the image property. The pixel-based segmentation is applied to recover object boundaries, i.e. image feature boundaries on a small subset of the video image, i.e. in the blocks at the border between the first and second group of connected blocks of pixels. This is an efficient strategy resulting in efficient computing resource usage, since the detailed pixel-based segmentation process is only applied to the blocks at the boundaries and not to all blocks. The block-based motion segmentation and the pixel-based segmentation are applied sequentially: no complex weighting factors are required to tune between the results of the segmentation approaches as in the prior art.
In an embodiment of the method according to the invention, segmenting the first video image into the first group of connected blocks of pixels and the second group of connected blocks of pixels is based on a motion model. The advantage of this embodiment is that it is a robust method. A relatively large amount of data, i.e. most or all motion vectors of test groups of connected blocks for the first group of connected blocks of pixels, is applied to estimate the appropriate configuration of blocks which belong to the first image feature. In other words, it is a type of region fitting. By contrast, many segmentation methods only take blocks at the border of a segment into account to decide whether the blocks correspond to the particular segment or not. The motion model might be based on rotation and/or translation. Preferably, however, the segmenting is based on an affine motion model.
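As an illustration of what fitting an affine motion model to a group of blocks may look like, the following minimal sketch fits the six affine parameters to the block motion vectors by ordinary least squares. The least-squares criterion, the use of block center coordinates and the names `fit_affine_motion` and `predict_motion` are assumptions made for this example; the text above does not prescribe how the model parameters are determined.

```python
import numpy as np

def fit_affine_motion(centers, vectors):
    """Least-squares fit of v(x, y) = [1, x, y] @ params, with params of shape (3, 2)."""
    centers = np.asarray(centers, dtype=float)   # (N, 2) block center coordinates
    vectors = np.asarray(vectors, dtype=float)   # (N, 2) estimated motion vectors
    design = np.column_stack([np.ones(len(centers)), centers])
    params, *_ = np.linalg.lstsq(design, vectors, rcond=None)
    return params

def predict_motion(params, center):
    """Motion vector that the affine model predicts at a block center."""
    x, y = center
    return np.array([1.0, x, y]) @ params

# Hypothetical group of 3x3 blocks whose vectors follow a known affine field.
xs, ys = np.meshgrid([4.0, 12.0, 20.0], [4.0, 12.0, 20.0])
centers = np.column_stack([xs.ravel(), ys.ravel()])
vectors = np.column_stack([2.0 + 0.1 * centers[:, 0], -1.0 + 0.05 * centers[:, 1]])

params = fit_affine_motion(centers, vectors)
predicted = predict_motion(params, (12.0, 20.0))
```

Because all vectors of the group contribute to the fit, a single outlying block has limited influence on the model, which is the robustness argument made above.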
In an embodiment of the method according to the invention, segmenting the first video image into the first group of connected blocks of pixels and the second group of connected blocks of pixels comprises:
- creating a first initial group of connected blocks of pixels for the first group of connected blocks of pixels, the first initial group of connected blocks of pixels comprising a particular block of pixels;
- determining a first motion model for the first initial group of connected block of pixels;
- calculating a first match error between the motion vector corresponding to the particular block of pixels being estimated during estimating motion vectors for the respective blocks of pixels of the first video image and the motion vector corresponding to the particular block of pixels on basis of the first motion model;
- calculating a second motion model for a test group of connected blocks of pixels, based on the first initial group of connected blocks of pixels, but excluding the particular block of pixels;
- calculating a second match error between the motion vector corresponding to the particular block of pixels being estimated during estimating motion vectors for the respective blocks of pixels of the first video image and the motion vector corresponding to the particular block of pixels on basis of the second motion model;
- deciding whether the particular block of pixels corresponds to the first group of connected blocks of pixels or not on basis of the first and second match error.
An advantage of this embodiment according to the invention is that it allows for a scanning approach. That means that block by block, e.g. starting at the top left and proceeding towards the bottom right, each block is tested iteratively. Testing means that for each block it is evaluated with which other blocks it should be merged to eventually form the group of blocks corresponding to one of the image features. The advantage of the scanning approach is that the method can be implemented relatively easily.
In an embodiment of the method according to the invention the pixel-based segmentation is based on a spatial color model. In other words, the image property on which the pixel-based segmentation is based is color. Color is a relatively good cue for segmentation. In this embodiment the two segmentation approaches are applied at the appropriate scale resolution: color is applied for high frequencies and motion segmentation for low frequencies, since color differences are caused by texture, whereas motion differences are caused by different velocities of objects in the scene being captured.
In an embodiment of the method according to the invention the pixel-based segmentation is based on a spatial luminance model. In other words the image property on which the pixel-based segmentation is based, is luminance. Luminance is a relatively good cue for segmentation.
In an embodiment of the method according to the invention a step in values of the image property is detected in a first block of the portion of the blocks of pixels of the first group of connected blocks of pixels, which have been determined to be positioned at the border between the first and second group of connected blocks of pixels. A step in values corresponds to the edge of the first image feature. Preferably the step is detected by means of:
- calculating for the pixels of the first block a first mean value of the image property;
- calculating a first difference measure on basis of the first mean value and the respective values of the pixels of the first block;
- calculating for the pixels of a second block of pixels a second mean value of the image property, the second block of pixels corresponding to the second group of pixels and being connected to the first block;
- calculating a second difference measure on basis of the second mean value and the respective values of the pixels of the second block;
- creating a first test group of pixels on basis of the first block but excluding a particular pixel and creating a second test group of pixels on basis of the second block and comprising the particular pixel;
- calculating for the pixels of the first test group a third mean value of the image property;
- calculating a third difference measure on basis of the third mean value and the respective values of the pixels of the first test group;
- calculating for the pixels of the second test group a fourth mean value of the image property;
- calculating a fourth difference measure on basis of the fourth mean value and the respective values of the pixels of the second test group;
- deciding whether the particular pixel belongs to the first image feature or the second image feature on basis of the first, second, third and fourth difference measure.
The advantage of this embodiment is that it is a robust method. A relatively large amount of data, i.e. most or all pixels of the first and second block of pixels, is applied to estimate the appropriate configuration of pixels which belong to the first image feature. This is a so-called region fitting approach.
It is another object of the invention to provide a segmentation system of the kind described in the opening paragraph which is arranged to provide pixel accurate segmentations and which is relatively easy.
This object of the invention is achieved in that the segmentation system for segmenting a first image feature in a first video image from an adjacent second image feature in the first video image, the first image feature having plural pixels with respective values of an image property substantially being in a first range of values and having motion relative to the second image feature between the first video image and a second video image, and the second image feature having plural pixels with respective values of the image property substantially being in a second range of values being different from the first range of values, comprises;
- dividing means for dividing the first video image into blocks of pixels;
- a block-based motion estimator for estimating motion vectors for the respective blocks of pixels;
- a motion segmentation unit for segmenting the first video image into a first group of connected blocks of pixels and a second group of connected blocks of pixels by classifying the blocks of pixels on basis of the motion vectors of the respective blocks of pixels; and
- a pixel-based segmentation unit for segmenting the first image feature from the second image feature by means of a pixel-based segmentation of a portion of the blocks of pixels of the first and second group of connected blocks of pixels, which have been determined to be positioned at a border between the first and second group of connected blocks of pixels, on basis of the respective values of the image property.
It is another object of the invention to provide a motion estimator at pixel resolution for estimating a pixel accurate motion vector field, which is relatively easy. This object of the invention is achieved in that the pixel accurate motion estimator comprises the segmentation system as claimed in claim 9. Alternatively, the pixel accurate motion estimator is provided with results being calculated by the segmentation system.
It is advantageous to apply an embodiment of the pixel accurate motion estimator according to the invention in an image processing apparatus as described in the opening paragraph. The image processing apparatus may comprise additional components, e.g. a display device for displaying the processed images or storage means for storage of the processed images. The motion compensated image processing unit might support one or more of the following types of image processing:
- De-interlacing: Interlacing is the common video broadcast procedure for transmitting the odd or even numbered image lines alternately. De-interlacing attempts to restore the full vertical resolution, i.e. make odd and even lines available simultaneously for each image;
- Up-conversion: From a series of original input images a larger series of output images is calculated. Output images are temporally located between two original input images;
- Temporal noise reduction. This can also involve spatial processing, resulting in spatial-temporal noise reduction; and
- Video compression, i.e. encoding or decoding, e.g. according to the MPEG standard or H26L standard.
Modifications of the method and variations thereof may correspond to modifications and variations thereof of the segmentation system, the pixel accurate motion estimator and of the image processing apparatus described.
These and other aspects of the method, the segmentation system, the pixel accurate motion estimator and of the image processing apparatus according to the invention will become apparent from and will be elucidated with respect to the implementations and embodiments described hereinafter and with reference to the accompanying drawings, wherein:
Fig. 1 schematically shows an embodiment of the segmentation system according to the invention;
Fig. 2A schematically shows an initial configuration of blocks for the block-based motion segmentation;
Fig. 2B schematically shows an updated configuration of blocks of Fig. 2A after one modification;
Fig. 2C schematically shows a result configuration of blocks of the block-based motion segmentation;
Fig. 3A schematically shows input blocks for the pixel-based segmentation;
Fig. 3B schematically shows an initial edge for a horizontal block pair for the pixel-based segmentation;
Fig. 3C schematically shows the detected edge for the horizontal block pair of Fig. 3B;
Fig. 3D schematically shows an initial edge for a vertical block pair for the pixel-based segmentation;
Fig. 3E schematically shows the finally detected edge;
Fig. 4A schematically shows the inputs and output of a motion estimator for estimating a pixel accurate motion vector field;
Fig. 4B schematically shows the inputs and output of an alternative motion estimator for estimating a pixel accurate motion vector field; and
Fig. 5 schematically shows elements of an image processing apparatus according to the invention.
Same reference numerals are used to denote similar parts throughout the figures.
Fig. 1 schematically shows an embodiment of the segmentation system 100 according to the invention. The segmentation system 100 is arranged to segment a first image feature 214 in a first video image from an adjacent second image feature 216 in the first video image. The first image feature 214 has plural pixels with respective color values substantially being in a first range of values and the second image feature 216 has plural pixels with respective color values substantially being in a second range of values being different from the first range of values. The first image feature 214 has motion relative to the second image feature 216 between the first video image and a second video image. The second video image might be succeeding or preceding the first video image. The segmentation system 100 receives a signal representing video images at the input connector 108 and provides a pixel accurate segmentation result Sp at the output connector 110. The segmentation system 100 comprises;
- a block-based motion estimator 102 for estimating motion vectors. This motion estimator 102 is arranged to estimate motion vectors 218-230 for the respective blocks into which the first video image has been divided by the splitting unit 103. This dividing means that the pixels of the first video image are clustered into blocks of pixels. Typically a block of pixels comprises 8*8 pixels. Preferably the motion estimator 102 is as described in the article "True-Motion Estimation with 3-D Recursive Search Block Matching" by G. de Haan et al. in IEEE Transactions on Circuits and Systems for Video Technology, vol. 3, no. 5, October 1993, pages 368-379. The motion estimator 102 provides a block-based motion vector field MB to;
- a motion segmentation unit 104 which is arranged to segment the first video image into a first group of connected blocks of pixels 204C (see Fig. 2C) and a second group of connected blocks of pixels 206C (see Fig. 2C) by classifying the blocks of pixels on basis of the motion vectors 218-230 of the respective blocks of pixels. The segmentation unit 104 provides a block-based segmentation result Ss to;
- a pixel-based segmentation unit 106 which is arranged to segment the first image feature 214 from the second image feature 216 by means of a pixel-based segmentation of a portion of the blocks 302-306 of pixels of the first and second group of connected blocks of pixels 204, 206, which have been determined to be positioned at a border between the first and second group of connected blocks of pixels 204, 206. The pixel-based segmentation unit 106 is designed to perform the segmentation on basis of color values of pixels.
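The dividing of the first video image into blocks of 8*8 pixels, as performed by the splitting unit 103, can be sketched as follows. This is a minimal illustration that assumes a single-channel image whose height and width are multiples of the block size; the function name `split_into_blocks` is hypothetical.

```python
import numpy as np

def split_into_blocks(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Return an array of shape (rows, cols, block, block) holding the pixel blocks."""
    height, width = image.shape
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    # (H, W) -> (H/b, b, W/b, b) -> (H/b, W/b, b, b)
    return image.reshape(height // block, block, width // block, block).swapaxes(1, 2)

# Hypothetical 16x24 luminance image, giving 2x3 blocks of 8*8 pixels.
image = np.arange(16 * 24).reshape(16, 24)
blocks = split_into_blocks(image)
```

Each entry `blocks[r, c]` then holds the pixels for which one motion vector is estimated and one classification label is assigned.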
The behavior of the motion segmentation unit 104 will be described in more detail in connection with Figs. 2A-2C and the behavior of the pixel-based segmentation unit 106 will be described in more detail in connection with Figs. 3A-3E.
The block-based motion estimator 102, the motion segmentation unit 104 and the pixel-based segmentation unit 106 of the segmentation system 100 may be implemented using one processor. Normally, these functions are performed under control of a software program product. During execution, normally the software program product is loaded into a memory, like a RAM, and executed from there. The program may be loaded from a background memory, like a ROM, hard disk, or magnetic and/or optical storage, or may be loaded via a network like the Internet. Alternatively, an application specific integrated circuit provides the disclosed functionality.
In Fig. 2A a motion vector field 200 is depicted. The motion vector field 200 comprises a number of motion vectors 218-230 for the respective blocks of pixels. These blocks of pixels have been clustered into 6 initial groups of connected blocks of pixels 202A-212A. In other words, Fig. 2A schematically shows an initial configuration of blocks for the block-based motion segmentation. The number of initial groups of connected blocks corresponds to the maximum number of image features, i.e. objects, which one wants to segment in the video image. In this exemplary case each initial group of connected blocks of pixels 202A-212A comprises 5*5 blocks. All blocks of pixels of a particular initial group of connected blocks have the same classification label. By means of a scanning approach all blocks will be evaluated. That means that appropriate clusters or groups of connected blocks of pixels 202C-206C (see Fig. 2C) are determined. Such a group of connected blocks of pixels 202C-206C corresponds to one of the image features 214, 216 in the first video image. Finding the appropriate groups is achieved in a number of iterations in which it is tested for each block to which group of connected blocks of pixels 202-206C it belongs. That might be in a fixed number of iterations. Alternatively the process of evaluating the various intermediate groups of connected blocks 202B-212B is stopped if modifications of intermediate groups of connected blocks 202B-212B do not have any positive effect. The process of evaluating is bounded by a topological constraint: only blocks at the border of a connected group of blocks 202A-212A, 202B-212B are tested. Notice that a block might become located at the border of an intermediate group of blocks 202B-212B after a number of iterations. Testing only blocks at the border is different from the well-known k-means approach, which does not have this constraint.
The evaluation of a particular block of pixels 232 comprises a number of steps. This is described for the particular block of pixels 232 during the first scan (see Fig. 2A):
- creating a first initial group of connected blocks of pixels 204A comprising the particular block of pixels 232;
- determining a first motion model for the first initial group of connected block of pixels 204A;
- calculating a first match error between the motion vector corresponding to the particular block of pixels 232 being estimated during estimating motion vectors for the respective blocks of pixels of the first video image and the motion vector corresponding to the particular block of pixels 232 on basis of the first motion model. Preferably the match error is based on the Euclidean distance between the motion vectors;
- calculating a second motion model for a test group of connected blocks of pixels 204B, based on the first initial group of connected blocks of pixels 204A, but excluding the particular block of pixels 232. (See 204B in Fig. 2B);
- calculating a second match error between the motion vector corresponding to the particular block of pixels 232 being estimated during estimating motion vectors for the respective blocks of pixels of the first video image and the motion vector corresponding to the particular block of pixels 232 on basis of the second motion model;
- deciding whether the particular block of pixels 232 corresponds to the first group of connected blocks of pixels 204C or not on basis of the first and second match error. This means that it is tested whether the particular block of pixels 232 matches better with the first initial group of connected blocks of pixels 204A or with the neighboring group of connected blocks of pixels 202A. In this case it appeared that the particular block of pixels 232 better matches with the latter group of blocks 202A and hence the neighboring group of connected blocks of pixels 202A is extended with the particular block 232 resulting in the intermediate group of blocks 202B. The first initial group of blocks 204A is adapted into an intermediate group of blocks 204B excluding the particular block 232 (See Fig. 2B).
Alternatively, additional match errors are calculated and taken into account to decide to which of the initial groups of blocks 202A-212A the particular block of pixels 232 belongs. E.g. a third match error is calculated for a second initial group of connected blocks of pixels 202A as depicted in Fig. 2A and a fourth match error for a second test group of connected blocks of pixels 202B as depicted in Fig. 2B.
The test whether the particular block of pixels 232 matches better with the first initial group of connected blocks of pixels 204A or with the neighboring group of connected blocks of pixels 202A can also be performed by means of evaluating with the first initial group of connected blocks of pixels 204A and with the neighboring group of connected blocks of pixels 202A as they are. That means without adapting these groups by moving the particular group of pixels 232 from the first initial group of connected blocks of pixels 204A to the neighboring group of connected blocks of pixels 202A. In that case motion models are calculated for both groups of connected blocks.
Fig. 2B schematically shows an updated configuration of blocks for the block-based motion segmentation after one modification. This modification can be executed directly or after all relevant blocks of the motion vector field have been evaluated. Relevant means that the blocks are positioned at the border of a group of connected blocks of pixels. It is shown that a particular block of pixels 232 is moved from a first initial group of connected blocks of pixels 204A to a horizontally positioned neighboring group of blocks 202A. However a movement in vertical direction is also possible, e.g. a movement of another block of pixels from the first initial group of blocks 204A to the group of blocks of pixels 210A beneath.
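The test of whether a border block such as block 232 fits its own group or a neighboring group better can be sketched as follows. This minimal sketch makes several simplifying assumptions: it uses a purely translational motion model (the mean motion vector of a group) rather than the preferred affine model, it uses the Euclidean distance as match error, and it follows the variant in which one model is calculated per group without first moving the block. All names and the example data are hypothetical.

```python
import numpy as np

def group_model(vectors, labels, group):
    """Translational model: mean motion vector of all blocks in the group."""
    return vectors[labels == group].mean(axis=0)

def match_error(vector, model_vector):
    """Euclidean distance between the estimated and the modeled motion vector."""
    return float(np.linalg.norm(vector - model_vector))

def reassign_block(vectors, labels, pos, neighbor_group):
    """Return a new label map in which the block at pos has the better matching label."""
    own_group = labels[pos]
    err_own = match_error(vectors[pos], group_model(vectors, labels, own_group))
    err_neighbor = match_error(vectors[pos], group_model(vectors, labels, neighbor_group))
    new_labels = labels.copy()
    if err_neighbor < err_own:
        new_labels[pos] = neighbor_group
    return new_labels

# Hypothetical 2x4 field: the block at (0, 2) is labeled 2 but moves like group 1.
labels = np.array([[1, 1, 2, 2],
                   [1, 1, 2, 2]])
vectors = np.zeros((2, 4, 2))
vectors[:, :2] = (3.0, 0.0)   # group 1 moves 3 pixels to the right
vectors[0, 2] = (3.0, 0.0)    # the border block shares that motion

new_labels = reassign_block(vectors, labels, (0, 2), neighbor_group=1)
```

In a full scan this test would be repeated for every block at a group border, either for a fixed number of iterations or until no reassignment improves the result.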
After a number of iterations a number of final groups of blocks of pixels 202C-206C is found. These final groups of blocks of pixels 202C-206C are the output of the block-based motion segmentation. Fig. 2C schematically shows this output, i.e. the result configuration of the block-based motion segmentation. This output is provided to the pixel-based segmentation unit 106 for further processing: detecting the actual borders of the first image feature 214 and the second image feature 216.
Preferably the block-based motion segmentation result for one image pair, e.g. the first image and the second image, is applied for initialization of the block-based motion segmentation of a succeeding image pair.
Fig. 3A schematically shows input blocks 302-306 for the pixel-based segmentation. The motion segmentation result indicates that one of the blocks belongs to a first image feature 214 and is labeled accordingly with 1, and that two of the blocks 304 and 306 belong to a second image feature 216 and are labeled accordingly with 2. Pairs of neighboring blocks are evaluated to find the actual edge of the first image feature 214. Fig. 3B schematically shows an initial edge 308 for a horizontal pair of neighboring blocks 302 and 304. The pixel-based segmentation is performed in a similar way as described above for the block-based motion segmentation: similar means that for pixels it is evaluated to which group of connected pixels they correspond. This is done on basis of a spatial color model which is based on the following assumptions: the pair of neighboring blocks 302 and 304 comprises a first group of connected pixels with color values substantially being in a first range of values and a second group of pixels with color values substantially being in a second range of values. Besides that there is a transition or step in color values between the first group of connected pixels and the second group of connected pixels. That means that for pixels of the pair of neighboring blocks 302 and 304 it is evaluated to which of the two groups of connected pixels the respective pixels correspond. The evaluation is done by means of an iterative process. That might be in a fixed number of iterations. Alternatively the process of evaluating the various intermediate groups of connected pixels is stopped if modifications of intermediate groups of connected pixels do not have any positive effect. The evaluation for a particular pixel comprises the following steps:
- calculating for the pixels of the first block of pixels 302 of the pair of neighboring blocks 302, 304 a first mean color value;
- calculating a first difference measure on basis of the first mean color value and the respective values of the pixels of the first block 302;
- calculating for the pixels of the second block of pixels 304 of the pair of neighboring blocks 302, 304 a second mean color value;
- calculating a second difference measure on basis of the second mean color value and the respective values of the pixels of the second block of pixels 304;
- creating a first test group of pixels on basis of the first block 302 but excluding the particular pixel and creating a second test group of pixels on basis of the second block 304 and comprising the particular pixel;
- calculating for the pixels of the first test group a third mean color value;
- calculating a third difference measure on basis of the third mean color value and the respective values of the pixels of the first test group;
- calculating for the pixels of the second test group a fourth mean color value;
- calculating a fourth difference measure on basis of the fourth mean color value and the respective values of the pixels of the second test group;
- deciding whether the particular pixel belongs to the first group of connected pixels or the second group of connected pixels, on basis of the first, second, third and fourth difference measure.
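The evaluation steps listed above can be sketched for a single scalar image property as follows. The text does not specify the difference measure or the decision rule, so this sketch assumes the sum of squared deviations from the group mean as difference measure and moves the particular pixel when the third plus fourth measure is smaller than the first plus second. The connectivity of the groups is not checked here. The names `difference_measure` and `should_move_pixel` are hypothetical.

```python
import numpy as np

def difference_measure(values):
    """Sum of squared deviations from the group mean (an assumed measure)."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return 0.0
    return float(np.sum((values - values.mean()) ** 2))

def should_move_pixel(first, second, index):
    """Decide whether first[index] fits the second group better than the first."""
    first = np.asarray(first, dtype=float)
    second = np.asarray(second, dtype=float)
    d1 = difference_measure(first)                 # first difference measure
    d2 = difference_measure(second)                # second difference measure
    first_test = np.delete(first, index)           # first test group, pixel excluded
    second_test = np.append(second, first[index])  # second test group, pixel included
    d3 = difference_measure(first_test)            # third difference measure
    d4 = difference_measure(second_test)           # fourth difference measure
    return d3 + d4 < d1 + d2

# Hypothetical values: the last pixel of the first group resembles the second group.
first_group = [10.0, 11.0, 9.0, 50.0]
second_group = [49.0, 51.0, 50.0]
move_last = should_move_pixel(first_group, second_group, 3)
move_first = should_move_pixel(first_group, second_group, 0)
```

With these values the pixel with value 50 would be moved to the second group, whereas the pixel with value 10 would stay in the first group.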
After a number of iterations the eventual first group of pixels comprises pixels which correspond to the first image feature and the eventual second group of pixels comprises pixels which correspond to the second image feature. Fig. 3C schematically shows the detected edge 308 for the horizontal block pair of Fig 3B.
After the evaluation of the horizontally positioned pair of neighboring blocks 302, 304 has been completed a similar evaluation is started for a vertically positioned pair of neighboring blocks 302, 306. Fig. 3D schematically shows an initial edge 310 for this vertically positioned pair of neighboring blocks 302, 306. Fig. 3E schematically shows the finally detected edge 310. Notice that the evaluation of the horizontally positioned pair of neighboring blocks 302, 304 and the vertically positioned pair of neighboring blocks 302, 306 might be performed simultaneously. Although only color is discussed, it should be noted that other image properties can be applied for the pixel-based segmentation, e.g. luminance, or a combination of color and luminance, or derived properties such as differences in color/luminance values between neighboring pixels.
Fig. 4A schematically shows the inputs and output of a motion estimator 400 for estimating a pixel accurate motion vector field. A motion vector field MB as calculated by the block-based motion estimator 102 as described in connection with Fig. 1 is provided at input connector 404. Optionally the motion models as calculated by the motion segmentation unit 104 are also provided. Because of the block resolution of the motion vector field MB the motion vectors 218-230 are not correct for all pixels of the first video image. However at another input connector 406 of the motion estimator 400 a pixel accurate segmentation result Sp is also provided by the segmentation system 100 as described in connection with Fig. 1. By combining the information which is provided, a pixel accurate motion vector field can be determined. Especially the motion vectors of pixels of blocks located at the borders of the image features 214, 216 might be erroneous. Preferably motion vectors are assigned to these respective pixels which correspond to motion vectors being estimated for neighboring blocks. The selection of the appropriate neighboring block and thus motion vector is determined by the segmentation result Sp. In general, to a portion of the pixels of a block located at the border of an image feature 214, 216 a motion vector is assigned which is equal to a motion vector of a block corresponding to a first image feature 214, and to another portion of the pixels another motion vector is assigned which is equal to a motion vector of a block corresponding to a second image feature 216 adjacent to the first image feature 214.
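The assignment of motion vectors to individual pixels described for the motion estimator 400 can be sketched as follows. The sketch assumes that the block-based motion vector field MB, the block labels and the pixel accurate segmentation result Sp are available as arrays, and that a pixel whose label differs from the label of its own block simply takes the vector of the first horizontally or vertically adjacent block carrying the pixel's label. That neighbor selection rule and the name `pixel_motion_field` are assumptions for illustration only.

```python
import numpy as np

def pixel_motion_field(block_vectors, block_labels, pixel_labels, block=8):
    """Expand a block motion field to pixel resolution using pixel labels."""
    rows, cols = block_labels.shape
    # Start with every pixel carrying the vector and label of its own block.
    field = np.repeat(np.repeat(block_vectors, block, axis=0), block, axis=1).astype(float)
    up_labels = np.repeat(np.repeat(block_labels, block, axis=0), block, axis=1)
    # Correct only the pixels whose label disagrees with their block label.
    ys, xs = np.nonzero(pixel_labels != up_labels)
    for y, x in zip(ys, xs):
        r, c = y // block, x // block
        for dr, dc in ((0, -1), (0, 1), (-1, 0), (1, 0)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and block_labels[nr, nc] == pixel_labels[y, x]:
                field[y, x] = block_vectors[nr, nc]
                break
    return field

# Hypothetical example with 2*2 pixel blocks: one row of two blocks.
block_labels = np.array([[1, 2]])
block_vectors = np.array([[[3.0, 0.0], [0.0, 0.0]]])
pixel_labels = np.array([[1, 1, 1, 2],
                         [1, 1, 2, 2]])   # the true edge cuts through the right block

field = pixel_motion_field(block_vectors, block_labels, pixel_labels, block=2)
```

Here the pixel at row 0, column 2 lies in the block labeled 2 but belongs to feature 1 according to the pixel labels, so it receives the vector of the left block.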
Fig. 4B schematically shows the inputs and output of an alternative motion estimator 401 for estimating a pixel accurate motion vector field. Video images are provided at the input connector 402 and at another input connector 406 of the motion estimator 401 also a pixel accurate segmentation result Sp is provided by the segmentation system 100 as described in connection with Fig. 1. The pixel accurate motion estimator 401 is arranged to calculate motion vectors for the groups of pixels corresponding to the respective image features 214, 216 as segmented by the segmentation system 100 by means of comparing the pixels of these groups of pixels with corresponding pixels of preceding or succeeding video images. Preferably the comparing is based on match errors corresponding to the sum of absolute pixel value differences.
Fig. 5 schematically shows elements of an image processing apparatus 500 according to the invention. The image processing apparatus 500 comprises:
- a receiving unit 502 for receiving a signal representing video images to be displayed after some processing has been performed. The signal may be a broadcast signal received via an antenna or cable but may also be a signal from a storage device like a VCR (Video Cassette Recorder) or Digital Versatile Disk (DVD). The signal is provided at the input connector 510.
- a processing unit 504 comprising a segmentation system 100 and a motion estimator 401 as described in connection with Fig. 1 and Fig. 4A, respectively;
- a motion compensated image processing unit 506; and
- a display device 508 for displaying the processed images. This display device 508 is optional.
The motion compensated image processing unit 506 requires images and motion vectors as its input. The motion compensated image processing unit 506 might support one or more of the following types of image processing: de-interlacing; up-conversion; temporal noise reduction; and video compression.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word 'comprising' does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means can be embodied by one and the same item of hardware.

Claims

CLAIMS:
1. A method of segmenting a first image feature (214) in a first video image from an adjacent second image feature (216) in the first video image, the first image feature (214) having plural pixels with respective values of an image property substantially being in a first range of values and having motion relative to the second image feature (216) between the first video image and a second video image, and the second image feature (216) having plural pixels with respective values of the image property substantially being in a second range of values being different from the first range of values, the method comprising;
- dividing the first video image into blocks of pixels;
- estimating motion vectors (218-230) for the respective blocks of pixels;
- segmenting the first video image into a first group of connected blocks of pixels (204C) and a second group of connected blocks of pixels (206C) by classifying the blocks of pixels on basis of the motion vectors (218-230) of the respective blocks of pixels; and succeeded by
- segmenting the first image feature (214C) from the second image feature (216C) by means of a pixel-based segmentation of a portion of the blocks of pixels of the first and second group of connected blocks of pixels (204C, 206C), which have been determined to be positioned at a border between the first and second group of connected blocks of pixels (204C, 206C), on basis of the respective values of the image property.
2. A method as claimed in claim 1, whereby segmenting the first video image into the first group of connected blocks of pixels (204C) and the second group of connected blocks of pixels (206C) is based on a motion model.
3. A method as claimed in claim 2, whereby segmenting the first video image into the first group of connected blocks of pixels (204C) and the second group of connected blocks of pixels (206C) is based on an affine motion model.
4. A method as claimed in claim 2 or 3, whereby segmenting the first video image into the first group of connected blocks of pixels (204C) and the second group of connected blocks of pixels (206C) comprises:
- creating a first initial group of connected blocks of pixels (204A) for the first group of connected blocks of pixels (204C), the first initial group of connected blocks of pixels (204A) comprising a particular block of pixels (232);
- determining a first motion model for the first initial group of connected block of pixels (204A);
- calculating a first match error between the motion vector corresponding to the particular block of pixels (232) being estimated during estimating motion vectors (218-230) for the respective blocks of pixels of the first video image and the motion vector corresponding to the particular block of pixels (232) on basis of the first motion model;
- calculating a second motion model for a test group of connected blocks of pixels (204B), based on the first initial group of connected blocks of pixels (204A), but excluding the particular block of pixels (232);
- calculating a second match error between the motion vector corresponding to the particular block of pixels (232) being estimated during estimating motion vectors (218-230) for the respective blocks of pixels of the first video image and the motion vector corresponding to the particular block of pixels (232) on basis of the second motion model;
- deciding whether the particular block of pixels (232) corresponds to the first group of connected blocks of pixels (204C) or not on basis of the first and second match error.
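The leave-one-out test of claim 4 can be sketched as below. The claim only states that the decision is made on basis of the first and second match error; the concrete rule used here (keep the block when both errors stay within a tolerance `tol`), the affine model, and the helper names are assumptions for illustration.

```python
import numpy as np

def _fit(centers, vectors):
    # Assumed model: affine, v = [x, y, 1] @ P, fitted by least squares.
    design = np.column_stack([centers, np.ones(len(centers))])
    params, *_ = np.linalg.lstsq(design, vectors, rcond=None)
    return params

def _match_error(params, center, vector):
    # Distance between the estimated vector and the model's prediction.
    predicted = np.append(center, 1.0) @ params
    return float(np.linalg.norm(predicted - vector))

def keep_block(centers, vectors, index, tol=0.5):
    """Return (keep, first_error, second_error) for the block at `index`."""
    centers = np.asarray(centers, dtype=float)
    vectors = np.asarray(vectors, dtype=float)
    # First model: fitted on the initial group, including the block.
    first_model = _fit(centers, vectors)
    first_error = _match_error(first_model, centers[index], vectors[index])
    # Second model: fitted on the test group, excluding the block.
    mask = np.ones(len(centers), dtype=bool)
    mask[index] = False
    second_model = _fit(centers[mask], vectors[mask])
    second_error = _match_error(second_model, centers[index], vectors[index])
    # Hypothetical decision rule using both errors.
    keep = first_error <= tol and second_error <= tol
    return keep, first_error, second_error
```

For a least-squares fit the second error is never smaller than the first, so a block whose vector disagrees with its neighbours shows up mainly as a large second error.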
5. A method as claimed in claim 1, whereby the pixel-based segmentation is based on a spatial color model.
6. A method as claimed in claim 1, whereby the pixel-based segmentation is based on a spatial luminance model.
7. A method as claimed in claim 5 or 6, whereby a step in values of the image property is detected in a first block (302) of the portion of the blocks of pixels of the first group of connected blocks of pixels (204C), which have been determined to be positioned at the border between the first and second group of connected blocks of pixels (204C, 206C).
8. A method as claimed in claim 7, whereby the step is detected by means of:
- calculating for the pixels of the first block (302) a first mean value of the image property;
- calculating a first difference measure on basis of the first mean value and the respective values of the pixels of the first block (302);
- calculating for the pixels of a second block of pixels (304) a second mean value of the image property, the second block (304) of pixels corresponding to the second group of pixels and being connected to the first block (302);
- calculating a second difference measure on basis of the second mean value and the respective values of the pixels of the second block (304);
- creating a first test group of pixels on basis of the first block (302) but excluding a particular pixel and creating a second test group of pixels on basis of the second block (304) and comprising the particular pixel;
- calculating for the pixels of the first test group a third mean value of the image property;
- calculating a third difference measure on basis of the third mean value and the respective values of the pixels of the first test group;
- calculating for the pixels of the second test group a fourth mean value of the image property;
- calculating a fourth difference measure on basis of the fourth mean value and the respective values of the pixels of the second test group;
- deciding whether the particular pixel belongs to the first image feature (214) or the second image feature (216) on basis of the first, second, third and fourth difference measure.
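The pixel test of claim 8 can be sketched as follows. The claim does not define the difference measure or the decision rule, so this example assumes a sum of squared deviations from the mean and moves the pixel when the combined measure of the two test groups is lower than that of the two original blocks. The function names are hypothetical.

```python
import numpy as np

def difference_measure(values):
    """Assumed measure: sum of squared deviations from the group mean."""
    values = np.asarray(values, dtype=float)
    if values.size == 0:
        return 0.0
    return float(np.sum((values - values.mean()) ** 2))

def pixel_moves_to_second(first_block, second_block, pixel_index):
    """True if the pixel of the first block fits the second block better."""
    first = np.asarray(first_block, dtype=float).ravel()
    second = np.asarray(second_block, dtype=float).ravel()
    d1 = difference_measure(first)    # first mean and difference measure
    d2 = difference_measure(second)   # second mean and difference measure
    # Test groups: remove the pixel from the first block, add it to the second.
    first_test = np.delete(first, pixel_index)
    second_test = np.append(second, first[pixel_index])
    d3 = difference_measure(first_test)
    d4 = difference_measure(second_test)
    # Hypothetical rule: reassign when the total measure decreases.
    return d3 + d4 < d1 + d2
```

With luminance values `[10, 10, 10, 200]` in the first block and `[200, 200, 200, 200]` in the second, the pixel with value 200 would be reassigned, while a pixel with value 10 would stay.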
9. A segmentation system (100) for segmenting a first image feature (214) in a first video image from an adjacent second image feature (216) in the first video image, the first image feature (214) having plural pixels with respective values of an image property substantially being in a first range of values and having motion relative to the second image feature (216) between the first video image and a second video image, and the second image feature (216) having plural pixels with respective values of the image property substantially being in a second range of values being different from the first range of values, the segmentation system (100) comprising:
- dividing means (103) for dividing the first video image into blocks of pixels;
- a block-based motion estimator (102) for estimating motion vectors (218-230) for the respective blocks of pixels;
- a motion segmentation unit (104) for segmenting the first video image into a first group of connected blocks of pixels (204C) and a second group of connected blocks of pixels (206) by classifying the blocks of pixels on basis of the motion vectors (218-230) of the respective blocks of pixels; and
- a pixel-based segmentation unit (106) for segmenting the first image feature (214) from the second image feature (216) by means of a pixel-based segmentation of a portion of the blocks of pixels of the first and second group of connected blocks of pixels (204C, 206C), which have been determined to be positioned at a border between the first and second group of connected blocks of pixels (204C, 206C), on basis of the respective values of the image property.
10. A motion estimator at pixel resolution (400, 401) for estimating a motion vector field, comprising the segmentation system (100) as claimed in claim 9.
11. A motion estimator at pixel resolution (400) as claimed in claim 10 being arranged to assign the motion vectors (218-230) estimated for the respective blocks to the respective pixels of the first video image on basis of the segmenting the first image feature (214) from the second image feature (216).
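One way to picture the assignment in claim 11 is sketched below. Every pixel first inherits the vector of its own block; a pixel that the pixel-based segmentation moved to the other feature then borrows the mean vector of the 4-connected neighbouring blocks belonging to that feature. The borrowing rule, the label arrays, and the function name are assumptions for this example, not details taken from the claims.

```python
import numpy as np

def pixel_motion_field(block_vectors, block_labels, pixel_labels, block=8):
    """Per-pixel (dy, dx) field from block vectors and a pixel label map."""
    rows, cols = block_labels.shape
    # Block-resolution field: each pixel starts with its own block's vector.
    field = np.repeat(np.repeat(block_vectors, block, axis=0), block, axis=1)
    field = field.astype(float)
    for r in range(rows):
        for c in range(cols):
            ys = slice(r * block, (r + 1) * block)
            xs = slice(c * block, (c + 1) * block)
            tile = pixel_labels[ys, xs]
            moved = tile != block_labels[r, c]
            if not moved.any():
                continue
            for lab in np.unique(tile[moved]):
                # Assumed rule: borrow from neighbouring blocks of that group.
                donors = [block_vectors[rr, cc]
                          for rr, cc in ((r - 1, c), (r + 1, c),
                                         (r, c - 1), (r, c + 1))
                          if 0 <= rr < rows and 0 <= cc < cols
                          and block_labels[rr, cc] == lab]
                if donors:
                    field[ys, xs][tile == lab] = np.mean(donors, axis=0)
    return field
```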
12. A motion estimator at pixel resolution (401) as claimed in claim 10 being arranged to estimate a new motion vector for the first image feature (214) by means of comparing pixels of the first image feature (214) with corresponding pixels of the second video image.
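The re-estimation in claim 12 can be illustrated by matching only the pixels of the segmented feature against the second image. The sketch assumes an exhaustive search over integer displacements and a mean absolute difference as the comparison; `estimate_feature_motion` and its parameters are hypothetical names.

```python
import numpy as np

def estimate_feature_motion(img0, img1, mask, search=3):
    """Best integer (dy, dx) for the pixels selected by `mask` in img0."""
    h, w = img0.shape
    ys, xs = np.nonzero(mask)
    best_vec, best_err = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            yy, xx = ys + dy, xs + dx
            # Ignore feature pixels that would land outside the second image.
            valid = (yy >= 0) & (yy < h) & (xx >= 0) & (xx < w)
            if not valid.any():
                continue
            a = img0[ys[valid], xs[valid]].astype(float)
            b = img1[yy[valid], xx[valid]].astype(float)
            err = np.abs(a - b).mean()  # assumed comparison measure
            if err < best_err:
                best_err, best_vec = err, (dy, dx)
    return best_vec
```

Because the mask follows the feature outline rather than block boundaries, pixels of the adjacent feature do not contribute to the match.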
13. An image processing apparatus (500) comprising:
- receiving means (502) for receiving a signal representing a series of video images;
- a motion estimator at pixel resolution (504) as claimed in claim 10 for estimating a motion vector field from the video images; and
- a motion compensated image processing unit (506) for determining processed images on basis of the video images and the motion vector field.
14. An image processing apparatus (500) as claimed in claim 13, whereby the motion compensated image processing unit (506) is designed to perform video compression.
15. An image processing apparatus (500) as claimed in claim 13, whereby the motion compensated image processing unit (506) is designed to reduce noise in the series of images.
16. An image processing apparatus (500) as claimed in claim 13, whereby the motion compensated image processing unit (506) is designed to de-interlace the series of images.
17. An image processing apparatus (500) as claimed in claim 13, characterized in that the motion compensated image processing unit (506) is designed to perform an up-conversion.
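As one example of what a motion compensated image processing unit might do with the pixel-resolution motion vector field, the sketch below performs a simple temporal noise reduction in the spirit of claim 15. The blending weight `alpha`, the convention that `field[y, x]` holds the integer displacement (dy, dx) from the previous image to the current one, and the function name are assumptions for illustration.

```python
import numpy as np

def motion_compensated_average(prev, curr, field, alpha=0.5):
    """Blend each current pixel with its motion-compensated predecessor."""
    h, w = curr.shape
    ys, xs = np.mgrid[0:h, 0:w]
    field = np.asarray(field).astype(int)
    # Position in the previous image that moved to (y, x) in the current one;
    # positions outside the image are clamped to the nearest border pixel.
    src_y = np.clip(ys - field[..., 0], 0, h - 1)
    src_x = np.clip(xs - field[..., 1], 0, w - 1)
    compensated = prev[src_y, src_x]
    return alpha * curr + (1.0 - alpha) * compensated
```

Averaging along the motion trajectory reduces noise without blurring moving features, provided the vectors at feature borders are accurate, which is where a pixel-accurate segmentation helps.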
PCT/IB2003/003185 2002-07-31 2003-07-11 System and method for segmenting WO2004013810A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
US10/522,469 US7382899B2 (en) 2002-07-31 2003-07-11 System and method for segmenting
AU2003247051A AU2003247051A1 (en) 2002-07-31 2003-07-11 System and method for segmenting
JP2004525661A JP2005535028A (en) 2002-07-31 2003-07-11 System and segmentation method for segmentation
EP03766538A EP1527416A1 (en) 2002-07-31 2003-07-11 System and method for segmenting

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP02078116.7 2002-07-31
EP02078116 2002-07-31

Publications (1)

Publication Number Publication Date
WO2004013810A1 (en) 2004-02-12

Family

ID=31197904

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2003/003185 WO2004013810A1 (en) 2002-07-31 2003-07-11 System and method for segmenting

Country Status (6)

Country Link
US (1) US7382899B2 (en)
EP (1) EP1527416A1 (en)
JP (1) JP2005535028A (en)
CN (1) CN1311409C (en)
AU (1) AU2003247051A1 (en)
WO (1) WO2004013810A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060251337A1 (en) * 2003-08-07 2006-11-09 Redert Peter A Image object processing
US8243805B2 (en) * 2006-11-14 2012-08-14 Microsoft Corporation Video completion by motion field transfer
WO2009032255A2 (en) * 2007-09-04 2009-03-12 The Regents Of The University Of California Hierarchical motion vector processing method, software and devices
JP4952627B2 (en) * 2008-03-21 2012-06-13 富士通株式会社 Image processing apparatus, image processing method, and image processing program
US8532390B2 (en) * 2010-07-28 2013-09-10 International Business Machines Corporation Semantic parsing of objects in video
US8995755B2 (en) 2011-09-30 2015-03-31 Cyberlink Corp. Two-dimensional to stereoscopic conversion systems and methods
WO2019234606A1 (en) 2018-06-05 2019-12-12 Beijing Bytedance Network Technology Co., Ltd. Interaction between ibc and atmvp
US10796416B2 (en) * 2018-06-06 2020-10-06 Adobe Inc. Recolored collage generation based on color hue distances
WO2019244118A1 (en) 2018-06-21 2019-12-26 Beijing Bytedance Network Technology Co., Ltd. Component-dependent sub-block dividing
CN110636298B (en) 2018-06-21 2022-09-13 北京字节跳动网络技术有限公司 Unified constraints for Merge affine mode and non-Merge affine mode
CN117768651A (en) 2018-09-24 2024-03-26 北京字节跳动网络技术有限公司 Method, apparatus, medium, and bit stream storage method for processing video data
WO2020094150A1 (en) 2018-11-10 2020-05-14 Beijing Bytedance Network Technology Co., Ltd. Rounding in current picture referencing

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6075875A (en) * 1996-09-30 2000-06-13 Microsoft Corporation Segmentation of image features using hierarchical analysis of multi-valued image data and weighted averaging of segmentation results
US6173077B1 (en) * 1996-11-13 2001-01-09 U.S. Philips Corporation Image segmentation
US6233008B1 (en) * 1997-06-11 2001-05-15 Samsung Thomson-Csf Co., Ltd. Target tracking method and device therefor
US6337917B1 (en) * 1997-01-29 2002-01-08 Levent Onural Rule-based moving object segmentation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09185720A (en) * 1995-12-28 1997-07-15 Canon Inc Picture extraction device
WO1998044479A1 (en) * 1997-03-31 1998-10-08 Matsushita Electric Industrial Co., Ltd. Dynamic image display method and device therefor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
EREN P E ET AL: "Region-based affine motion segmentation using color information", ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 1997. ICASSP-97., 1997 IEEE INTERNATIONAL CONFERENCE ON MUNICH, GERMANY 21-24 APRIL 1997, LOS ALAMITOS, CA, USA,IEEE COMPUT. SOC, US, 21 April 1997 (1997-04-21), pages 3005 - 3008, XP010225789, ISBN: 0-8186-7919-0 *
KOMPATSIARIS L ET AL: "Spatiotemporal segmentation and tracking of objects in color image sequences", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, IEEE INC. GENEVA, SWITZERLAND, vol. 5, 28 May 2000 (2000-05-28), pages 29 - 32, XP010504125 *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1511322A1 (en) * 2003-08-26 2005-03-02 LG Electronics Inc. Method for segmenting moving object of compressed moving image
EP1988505A1 (en) 2007-05-03 2008-11-05 Sony Deutschland Gmbh Method and system for initializing templates of moving objects
EP3683768A1 (en) * 2007-05-03 2020-07-22 Sony Deutschland GmbH Method and system for initializing templates of moving objects
US8045812B2 (en) 2007-05-03 2011-10-25 Sony Deutschland Gmbh Method and system for initializing templates of moving objects
US8917945B2 (en) 2008-02-21 2014-12-23 Orange Encoding and decoding an image or image sequence divided into pixel blocks
US8787685B2 (en) 2008-02-21 2014-07-22 France Telecom Encoding and decoding an image or image sequence divided into pixel blocks
US8971648B2 (en) 2008-02-21 2015-03-03 Orange Encoding and decoding an image or image sequence divided into pixel blocks
WO2009112742A1 (en) * 2008-02-21 2009-09-17 France Telecom Encoding and decoding of an image or image sequence divided into pixel blocks
CN102007770B (en) * 2008-04-15 2013-07-31 法国电信公司 Coding and decoding of an image or of a sequence of images sliced into partitions of pixels of linear form
US8311384B2 (en) 2009-09-07 2012-11-13 Sony Computer Entertainment Europe Limited Image processing method, apparatus and system
EP2306399A3 (en) * 2009-09-07 2012-08-01 Sony Computer Entertainment Europe Limited Image processing method, apparatus and system
EP2698764A1 (en) * 2012-08-14 2014-02-19 Thomson Licensing Method of sampling colors of images of a video sequence, and application to color clustering
CN104584556A (en) * 2012-08-14 2015-04-29 汤姆逊许可公司 Method of sampling colors of images of a video sequence, and application to color clustering
US9911195B2 (en) 2012-08-14 2018-03-06 Thomson Licensing Method of sampling colors of images of a video sequence, and application to color clustering

Also Published As

Publication number Publication date
JP2005535028A (en) 2005-11-17
US20060078156A1 (en) 2006-04-13
AU2003247051A1 (en) 2004-02-23
US7382899B2 (en) 2008-06-03
CN1311409C (en) 2007-04-18
CN1672174A (en) 2005-09-21
EP1527416A1 (en) 2005-05-04

Similar Documents

Publication Publication Date Title
US7382899B2 (en) System and method for segmenting
KR100973429B1 (en) Background motion vector detection
KR101135454B1 (en) Temporal interpolation of a pixel on basis of occlusion detection
US20060098737A1 (en) Segment-based motion estimation
US20030007667A1 (en) Methods of and units for motion or depth estimation and image processing apparatus provided with such motion estimation unit
US20050180506A1 (en) Unit for and method of estimating a current motion vector
JP4213035B2 (en) Occlusion detector and method for detecting an occlusion region
KR20060083978A (en) Motion vector field re-timing
US20050226462A1 (en) Unit for and method of estimating a motion vector
JP2005505841A (en) Apparatus and method for motion estimation
US8102915B2 (en) Motion vector fields refinement to track small fast moving objects
US20060268181A1 (en) Shot-cut detection
KR20060063937A (en) Graphics overlay detection
KR20050023122A (en) System and method for segmenting

Legal Events

Date Code Title Description
AK Designated states. Kind code of ref document: A1; Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NI NO NZ OM PH PL PT RO RU SC SD SE SG SK SL TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW
AL Designated countries for regional patents. Kind code of ref document: A1; Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase. Ref document number: 2003766538; Country of ref document: EP
WWE Wipo information: entry into national phase. Ref document number: 2004525661; Country of ref document: JP
ENP Entry into the national phase. Ref document number: 2006078156; Country of ref document: US; Kind code of ref document: A1
WWE Wipo information: entry into national phase. Ref document number: 10522469; Country of ref document: US
WWE Wipo information: entry into national phase. Ref document number: 1020057001820; Country of ref document: KR. Ref document number: 20038184311; Country of ref document: CN
WWP Wipo information: published in national office. Ref document number: 1020057001820; Country of ref document: KR
WWP Wipo information: published in national office. Ref document number: 2003766538; Country of ref document: EP
WWP Wipo information: published in national office. Ref document number: 10522469; Country of ref document: US
WWW Wipo information: withdrawn in national office. Ref document number: 2003766538; Country of ref document: EP