AU763919B2 - Tracking objects from video sequences - Google Patents
Tracking objects from video sequences
- Publication number
- AU763919B2 AU28028/01A AU2802801A
- Authority
- AU
- Australia
- Prior art keywords
- regions
- pixels
- motion
- growing
- frame
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Landscapes
- Image Analysis (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Description
S&FRef: 536254
AUSTRALIA
PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT
ORIGINAL
Name and Address of Applicant: Canon Kabushiki Kaisha, 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo 146, Japan
Actual Inventor(s): Delphine Anh Dao Le, Mathieu Hitter, Anne Lauriou
Address for Service: Spruson Ferguson, St Martins Tower, Level 31 Market Street, Sydney NSW 2000
Invention Title: Tracking Objects From Video Sequences
ASSOCIATED PROVISIONAL APPLICATION DETAILS: [33] Country: AU; [31] Applic. No(s): PQ6281; [32] Application Date: 16 Mar 2000
The following statement is a full description of this invention, including the best method of performing it known to me/us:-
TRACKING OBJECTS FROM VIDEO SEQUENCES
Copyright Notice
This patent specification contains material that is subject to copyright protection.
The copyright owner has no objection to the reproduction of this patent specification or related materials from associated patent office files for the purposes of review, but otherwise reserves all copyright whatsoever.
Technical Field of the Invention
The present invention relates to a method and apparatus for tracking objects from video sequences. The invention also relates to a computer readable medium comprising a computer program for tracking objects from video sequences.
Background
Image motion plays an important role in computer vision and scene understanding. Image motion analysis has been applied to many fields over the last few decades, including object tracking, autonomous navigation, surveillance and virtual reality. More recently, motion information has played an important role in video indexing, contributing to video segmentation and shot classification.
Although the human visual system can easily distinguish moving objects, partitioning an image sequence into moving objects and tracking their evolution over time remains a challenging problem. Many applications related to video compression and object recognition rely on moving object segmentation. Content-based functionalities, such as those defined recently in the MPEG-4 and MPEG-7 context, require a description of image sequences in terms of moving objects.
The publication entitled "Unsupervised Video Segmentation Based on Watersheds •and Temporal Tracking" by Demin Wang, IEEE Transactions on Circuits and Systems for Video Technology, Vol. 8, No. 5, September 1998 discloses a method for video segmentation and object tracking. The method by Wang consists of an initial segmentation process and a temporal tracking process. The initial segmentation process consists of three steps: spatially segmentation for partitioning the first frame of a video sequence into homogeneous regions based on intensity, motion estimation to estimate motion parameters for each region, and motion-based region merging to group regions into moving objects. The temporal tracking is performed after the first frame of a video sequence has been segmented into moving objects. The temporal tracking segments the subsequent frames of the video sequence and establishes a correspondence of moving objects between frames. The temporal tracking is realized in four steps: motion 536254.doc 11. JUN. 2003 16:50 SPRUSON FERGUSON NO. 2592 P. 19 -2projection, marker extraction, a modified segmentation, and a region merging, In the first step, moving objects in one frame are projected into the next frame based on motion, which establishes the correspondence of moving objects between frames. In the next step, relaible parts of each projected object are extracted as markers of the corresponding moving object. Starting from these markers, a modified segmentation defines the complete segmentation of the next frame. Finally, motion based region mergin is applied to new regions for eventual grouping into moving objects. This method suffers from the disadvantage in that the watershed transformation segmentation is sensitive to noise.
The publication entitled "Seeded Region Growing" by Adams et al., IEEE Trans.
Pattern Anal. Machine Intell., vol. 16 pp- 64l-647, 1994 (hereinafter called Adams et al) discloses another method for segmentation of images. This method is based on a region growing principle of selecting a pixel adjacent to a region of pixels, which is most similar to the region of pixels. The method does not rely on the arbitrary selection of homogeneity thresholds, but is controlled by choosing a small number of pixels, called seeds. This seed selection may be either automatic or manual. Once the seeds have been selected, the segmented regions are grown in an iterative fashion. Each step of the process involves the addition of one of the neighboring pixels to one of the regions grown from the seeds. A measure 8(x) is defined how different each of the neighboring pixels is from that region. The neighboring pixel having the minimum measure 6(x) is added to *:'the rethe n. Adams et al make use of a sorted list in determining the relevant neighboring S pixel to be added. In Adams at al, once a pixel has been added to the list, the 8(x) measure is never updated. Moreover, the Adams at al method suffers from the disadvantage of slow image segmentation.
The reference to the aforementioned publications is not to be taken as an admission that they constitute common general knowledge.
Disclosure of the Invention
It is an object of the present invention to substantially overcome, or at least ameliorate, one or more disadvantages of existing arrangements.
According to a first aspect of the invention, there is provided a method of tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the method performing the following steps for each two adjacent frames of the video sequence: segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; estimating the motion of said segmented regions of the current frame; projecting the segmented regions of the current frame into said next frame using said estimated motion; segmenting a next frame of the video sequence into a number of homogeneous regions; and establishing a correspondence between the segmented regions of the current and next frame using the projected regions, wherein said segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said segmented regions their motion; wherein either one or both of said steps of segmenting said current and next frames comprises the following sub-steps: distributing seeds in said frame as a function of a first property of said pixels within the said frame, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; and growing a number of regions from said seeds, wherein a number of pixels that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and said growing step is repeated until no pixels bordering the growing regions are available.
According to a second aspect of the invention, there is provided a method of tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the method performing the following steps for each two adjacent frames of the video sequence: segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; estimating the motion of said segmented regions of the current frame; selecting any one or more of those said segmented regions of the current frame having motion; projecting the selected segmented regions of the current frame into said next frame using said estimated motion; distributing seeds in said projected regions of said next frame as a function of a first property of said pixels within those projected regions, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; growing a number of regions from said seeds, wherein a number of pixels having motion that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and said growing step is repeated until there are no pixels bordering the growing regions having motion; and establishing a correspondence between the selected segmented regions of the current frame and the segmented regions of the next frame using the projected regions, wherein said selected segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said selected segmented regions their motion.
According to a third aspect of the invention, there is provided an apparatus for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the apparatus comprising: means for segmenting each frame of the video sequence into a number of homogeneous regions of pixels, wherein said segmenting means comprises: means for distributing, for each said frame, seeds as a function of a first property of said pixels within the frame, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; and means for growing, for each said frame, a number of regions from said seeds, wherein a number of pixels that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until no pixels bordering the growing regions are available; means for estimating the motion of said segmented regions of a current frame; means for projecting the segmented regions of the current frame into a next adjacent frame using said estimated motion; and means for establishing a correspondence between the segmented regions of the current and the next adjacent frames using the projected regions, wherein said segmented regions of the current frame and corresponding said segmented regions of the next adjacent frame constitute said objects and the estimated motion of said segmented regions their motion.
According to a fourth aspect of the invention, there is provided an apparatus for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the apparatus comprising: means for segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; means for estimating the motion of said segmented regions of the current frame; means for selecting any one or more of those said segmented regions of the current frame having motion; means for projecting the selected segmented regions of the current frame into said next frame using said estimated motion; means for distributing seeds in said projected regions of said next frame as a function of a first property of said pixels within those projected regions, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; means for growing a number of regions from said seeds, wherein a number of pixels having motion that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until there are no pixels bordering the growing regions having motion; and means for establishing a correspondence between the selected segmented regions of the current frame and the segmented regions of the next frame using the projected regions, wherein said selected segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said selected segmented regions their motion.
According to a fifth aspect of the invention, there is provided a computer readable medium comprising a computer program for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the computer program comprising: means for segmenting each frame of the video sequence into a number of homogeneous regions of pixels, wherein said segmenting means comprises: means for distributing, for each said frame, seeds as a function of a first property of said pixels within the frame, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; and means for growing, for each said frame, a number of regions from said seeds, wherein a number of pixels that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until no pixels bordering the growing regions are available; means for estimating the motion of said segmented regions of a current frame; means for projecting the segmented regions of the current frame into a next adjacent frame using said estimated motion; and means for establishing a correspondence between the segmented regions of the current and the next adjacent frames using the projected regions, wherein said segmented regions of the current frame and corresponding said segmented regions of the next adjacent frame constitute said objects and the estimated motion of said segmented regions their motion.
According to a sixth aspect of the invention, there is provided a computer readable medium comprising a computer program for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the computer program comprising: means for segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; means for estimating the motion of said segmented regions of the current frame; means for selecting any one or more of those said segmented regions of the current frame having motion; means for projecting the selected segmented regions of the current frame into said next frame using said estimated motion; means for distributing seeds in said projected regions of said next frame as a function of a first property of said pixels within those projected regions, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; means for growing a number of regions from said seeds, wherein a number of pixels having motion that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until there are no pixels bordering the growing regions having motion; and means for establishing a correspondence between the selected segmented regions of the current frame and the segmented regions of the next frame using the projected regions, wherein said selected segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said selected segmented regions their motion.
Brief Description of the Drawings
A number of preferred embodiments of the present invention will now be described with reference to the drawings, in which:
Fig. 1 is a schematic block diagram of a general-purpose computer upon which the preferred embodiment of the present invention can be practiced;
Fig. 2 is a flow diagram of a method of segmenting and tracking moving objects from video sequences in accordance with a preferred embodiment;
Fig. 3 is a flow chart of a method of segmenting an image as used in Fig. 2;
Fig. 4A is a flow chart of a method of seeding an image as used in Fig. 3;
Fig. 4B illustrates an example of a seeded image seeded according to Fig. 4A;
Fig. 5A is a flow chart of a method of growing seeded regions as used in Fig. 3; and
Fig. 5B illustrates a simplified example of the preferred region growing process of Fig. 5A.
Detailed Description including Best Mode
Where reference is made in any one or more of the accompanying drawings to steps and/or features, which have the same reference numerals, those steps and/or features have for the purposes of this description the same function(s) and/or operation(s), unless the contrary intention appears.
Overview of Method
The principles of the preferred method have general applicability to spatio-temporal segmentation and tracking of objects in multiple coloured or black and white video images. It relies on a robust spatial segmentation method to get accurate boundaries for the regions and uses motion information for temporal tracking.
Fig. 2 is a flow diagram of a method of segmenting and tracking moving objects from video sequences in accordance with a preferred embodiment. Preferably, the video sequences consist of a plurality of frames each including a pixel-map representation of an original image. The pixel-map can include a numerical representation of the particular colour for each pixel location in a rectangular array. Any numerical representation of colour can be used and can be expressed as a sequence of one or more numbers. Pixel locations on the pixel grid are represented by an array of row (i) and column (j) specifications.
The method commences at step 200, where a video sequence is input and any necessary parameters are initialised. The video sequence preferably consists of a series of frames of images of moving objects. Where the video sequence contains dramatic scene changes containing different moving objects, then the video sequence can be divided into a number of video sequences. The method can then be called separately to operate on each of the divided video sequences in turn.
After step 200, the method then proceeds to step 202, where the first frame (t = 1) of the video sequence is accessed. The method then proceeds to a decision box 204, where a check is made whether the current frame is the first frame of the video sequence.
If the decision box returns true, then the method proceeds to step 206. Otherwise, the method proceeds to step 212.
In step 206, the method spatially segments the first frame into homogeneous regions, based on colour or intensity. The spatial segmentation preferably uses an automatic seeded region growing process as described herein in the section entitled "Section 2.0 Method of Segmenting an Image". Alternatively, the method may use any known robust segmentation process. After the first frame has been segmented into regions, the method proceeds to step 208.
In step 208, the method estimates the motion parameters of each of the homogeneous regions obtained by the seeded region growing step 206. Thus, spatial discontinuity information is taken into account when computing motion. The motion estimation step preferably estimates the motion parameters of each region using an affine motion model.
Preferably, the motion parameters are estimated from the image data directly. In the case of an affine motion model, if a pixel (x, y) is transformed into (x', y'), their coordinates satisfy the following equations:
x' = ax + by + c
y' = dx + ey + f    (1)
where a, b, c, d, e, f are the motion parameters of the model.
Where the motion model is fitted directly from the image data, the motion model parameters are estimated by minimising the displaced frame difference for the considered region. Let I_k and I_(k+1) be consecutive frames of the image sequence and R_k a region of the kth frame; the displaced frame difference can be written:
F = Σ_((x_i, y_i) ∈ R_k) || I_k(x_i, y_i) − I_(k+1)(x'_i, y'_i) ||²    (2)
I_k is a scalar in the case of grey level images, a vector in the case of colour images.
The displaced frame difference can be minimised to obtain the motion parameters a, b, c, d, e, f using a non-linear least squares algorithm such as the Levenberg-Marquardt iterative method.
Alternatively, the method may use a motion field or optical flow field in estimating the motion parameters of each region. In this case, a vector (u_i, v_i) describing each pixel's motion is available. Preferably, the optical flow method would take into account the boundary information of the regions obtained by the seeded growing step 206. The parameter values of a linear motion model, which best describe the motion field in the region, are then calculated. The least squares criterion to minimise is:
E = Σ_i [ ((x_i + u_i) − (ax_i + by_i + c))² + ((y_i + v_i) − (dx_i + ey_i + f))² ]    (3)
A linear least-squares method can be applied to find the motion parameters a, b, c, d, e, f which minimise this function.
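By way of illustration only (this sketch is not part of the patent specification), the linear least-squares fit of equation (3) can be carried out with an ordinary solver once an optical flow field is available. The function name, array layout and use of NumPy are assumptions made for the example; a non-linear solver such as Levenberg-Marquardt would instead be required when minimising the displaced frame difference of equation (2) directly.

import numpy as np

def fit_affine_motion(u, v, region_mask):
    # Least-squares fit of x' = a*x + b*y + c, y' = d*x + e*y + f (equation (1))
    # to a per-pixel optical flow field (u, v) over one region, as in equation (3).
    rows, cols = np.nonzero(region_mask)            # pixel coordinates inside the region
    x, y = cols.astype(float), rows.astype(float)
    A = np.stack([x, y, np.ones_like(x)], axis=1)   # design matrix rows [x_i, y_i, 1]
    tx = x + u[rows, cols]                          # observed displaced positions x'_i = x_i + u_i
    ty = y + v[rows, cols]                          # and y'_i = y_i + v_i
    (a, b, c), *_ = np.linalg.lstsq(A, tx, rcond=None)
    (d, e, f), *_ = np.linalg.lstsq(A, ty, rcond=None)
    return a, b, c, d, e, f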
After the motion parameters for each region have been estimated, the method proceeds to step 210. In step 210, a number of regions of objects are selected for tracking. The temporal tracking can be applied to all the regions of the image or preferably to selected regions. In the former case, this step 210 may be omitted and all regions of the image are by default selected.
Preferably, however, any one or more of the objects having a motion within the image can be selected. These moving objects in the image can be selected by motion thresholding or motion segmentation. For instance, those segmented regions having an estimated motion greater than a predetermined motion can be selected. Preferably, those objects or regions of interest to a user having an estimated motion greater than the predetermined motion can be selected. Automatic object definition can be a difficult problem because interesting video objects (eg people) are often composed of different regions, which are not homogeneous with respect to spatial characteristics and motion.
Thus, a semi-automatic object definition is preferably used where, for example, an operator manually traces a bounding box around the object of interest. Then, only those regions having an estimated motion greater than a predetermined motion that are included in the bounding box are selected for tracking.
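As a rough illustration of step 210 only (not from the patent; the data layout, the threshold value and the function name are all assumptions), regions could be filtered by the displacement their estimated motion predicts at the region centroid and by a user-drawn bounding box:

import numpy as np

def select_moving_regions(regions, motion_params, bbox, min_motion=1.0):
    # regions: label -> (N, 2) array of (row, col) pixels of a segmented region.
    # motion_params: label -> (a, b, c, d, e, f) from the motion estimation step.
    # bbox: user-drawn box (col_min, row_min, col_max, row_max).
    col_min, row_min, col_max, row_max = bbox
    selected = []
    for label, pixels in regions.items():
        a, b, c, d, e, f = motion_params[label]
        cy, cx = pixels[:, 0].mean(), pixels[:, 1].mean()   # region centroid
        du = (a * cx + b * cy + c) - cx                      # displacement predicted by the
        dv = (d * cx + e * cy + f) - cy                      # affine model at the centroid
        inside = (pixels[:, 1] >= col_min).all() and (pixels[:, 1] <= col_max).all() \
            and (pixels[:, 0] >= row_min).all() and (pixels[:, 0] <= row_max).all()
        if np.hypot(du, dv) > min_motion and inside:
            selected.append(label)
    return selected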
After step 210, the method then proceeds to step 214, where the selected regions of the objects are projected onto the next image frame t+1. The projection of the moving regions of one frame onto the next frame establishes a correspondence between moving regions in consecutive frames. The selected regions of step 210 are projected according to the estimated motion parameters. The projection of R_i into the next frame is described by:
P_i = { (x'_i, y'_i) : (x_i, y_i) ∈ R_i }    (4)
where x'_i = ax_i + by_i + c and y'_i = dx_i + ey_i + f. Since the coordinates of the transformed pixel calculated from (4) are real values, an interpolation is necessary to predict the position of R_i in the next frame. This projection allows us to establish a correspondence between selected regions of consecutive frames. It is not necessary to project non-selected regions.
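A minimal sketch of this projection (not part of the patent text; nearest-neighbour rounding stands in here for the interpolation mentioned above, and the names are illustrative):

import numpy as np

def project_region(pixels, params, frame_shape):
    # Project the (row, col) pixels of region R_i into the next frame using the
    # affine transform of equation (1), producing the mask P_i of equation (4).
    a, b, c, d, e, f = params
    x = pixels[:, 1].astype(float)                       # columns
    y = pixels[:, 0].astype(float)                       # rows
    x_new = a * x + b * y + c
    y_new = d * x + e * y + f
    cols = np.clip(np.rint(x_new), 0, frame_shape[1] - 1).astype(int)
    rows = np.clip(np.rint(y_new), 0, frame_shape[0] - 1).astype(int)
    mask = np.zeros(frame_shape[:2], dtype=bool)         # projected region P_i
    mask[rows, cols] = True
    return mask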
After step 214, the method proceeds to step 216, where the projected regions of the next frame are seeded. After step 216, the method continues to step 218, where a region growing process is applied to the seeded image. Because the estimated motion of the objects is not accurate enough, the method refines the segmentation on the next frame using spatial information. This is done by seeding in step 216 the projected regions of the next image frame, and re-applying in step 218 the seeded region growing method. Seeds are distributed as described herein in the section entitled "2.0.1 Process for Selecting Seeds", but retaining only those seeds which belong to the selected projected regions. As the boundaries of the projected regions are likely to be inaccurate, this simple way of seeding the next image could be further refined by not keeping the seeds which are too close to the region boundaries. The distance of a seed to the region's boundaries can be computed using a distance transform on the edge map of the projected regions. Those seeds that fall within a predetermined distance of the region's boundaries are then discarded. The seeds within each projected region of the next frame are then labelled with the name of its projected region. Once the seeds are distributed, the regions are grown as described herein in the section entitled "2.0.2 Process for Growing Seeded Regions". After step 218, the method proceeds to step 220, where the re-segmented regions are merged taking into account information from the previous projected regions. For instance, those re-segmented regions grown from seeds which have the same projection region labels are automatically merged unless their mean colour difference is bigger than a predetermined threshold. In this way, a correspondence is established between the projected regions and the grown regions of the next frame, thereby establishing a correspondence between the segmented regions of the current frame and the grown regions of the next frame. Those regions that are greater than the threshold are labelled as new regions. The estimated motion parameters previously calculated can then be used as the motion parameters of the segmented regions of the current frame and grown regions of the next frame. The method can thus track moving objects (viz. regions) and estimate their motion in a robust and accurate manner.
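The label-based merging of step 220 might be sketched as follows (again not part of the patent text; the data layout, the Euclidean colour difference and the threshold value are assumptions for the example):

import numpy as np

def merge_by_projection_label(grown, seed_labels, frame, colour_threshold=25.0):
    # grown: grown-region id -> (N, 2) array of (row, col) pixels.
    # seed_labels: grown-region id -> label of the projected region its seed came
    #              from (None for seeds that fell outside any projected region).
    merged = {}                         # projected label -> list of merged pixel arrays
    reference_colour = {}               # mean colour of the first part merged per label
    new_regions = []
    for rid, pixels in grown.items():
        colours = frame[pixels[:, 0], pixels[:, 1]].reshape(len(pixels), -1)
        mean_colour = colours.mean(axis=0)
        label = seed_labels.get(rid)
        if label is None:
            new_regions.append(pixels)
        elif label not in merged:
            merged[label] = [pixels]
            reference_colour[label] = mean_colour
        elif np.linalg.norm(mean_colour - reference_colour[label]) <= colour_threshold:
            merged[label].append(pixels)          # same projected object, similar colour
        else:
            new_regions.append(pixels)            # too different: labelled as a new region
    merged = {lab: np.vstack(parts) for lab, parts in merged.items()}
    return merged, new_regions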
After step 220, the method proceeds to step 222, where the next frame of the video sequence is accessed. The method proceeds to decision block 224, where a check is made whether the current image frame is the last frame of the video sequence.
If the decision block 224 returns false, the method returns to decision block 204. If the decision block 204 then returns false, the method directly proceeds to step 212, where the method estimates the motion parameters between the current frame and the next frame of each of the homogeneous regions determined during step 220. The method estimates these motion parameters in like manner to that of step 208. The method then proceeds to step 214 for the next pass of the loop 214 to 224.
On the other hand, if the decision block 224 returns true, the method terminates at step 226.
2.0 Method of Segmenting an Image
Fig. 3 is a flow chart of a method of segmenting an image as used in step 206 of Fig. 2. As mentioned above, the image data is a pixel-map representation of an original image. In step 304, a list of pixels is generated which are to be used as seeds for region growing. The automatic selection of an appropriate set of pixels or set of small connected regions, called seeds, controls the method of initially segmenting the image.
The selection of the set of seeds or small regions is critical for the success of the image segmentation.
This can be done by using a quad-tree approach in order to distribute seeds according to a homogeneity criterion based on the contrast of the pixels. The preferred process for generating these seeds is described in more detail in the next section, herein entitled "2.0.1 Process for Selecting Seeds". As mentioned above, the latter seed generation process is also used in step 216.
However, in this step 216, only those seeds which belong to the selected projected regions are retained. Preferably, this is achieved by discarding any distributed seeds that do not fall within any selected projected region. As the boundaries of the projected regions are likely to be inaccurate, the seeding of the next image could be further refined by not keeping the seeds which are too close to the region boundaries. The distance of a seed to the region's boundaries can be computed using a distance transform on the edge map of the projected regions.
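One way this refinement could be realised is sketched below (illustrative only; it assumes SciPy's Euclidean distance transform and a per-pixel label map of the projected regions, and the names and the distance value are not taken from the patent):

import numpy as np
from scipy.ndimage import distance_transform_edt

def filter_seeds_near_boundaries(seeds, projected_label_map, min_distance=3):
    # Discard seeds that lie outside any selected projected region (label 0 is
    # treated as background) or within min_distance pixels of a region boundary.
    edges = np.zeros(projected_label_map.shape, dtype=bool)
    edges[:-1, :] |= projected_label_map[:-1, :] != projected_label_map[1:, :]
    edges[:, :-1] |= projected_label_map[:, :-1] != projected_label_map[:, 1:]
    distance_to_edge = distance_transform_edt(~edges)    # distance to the nearest edge pixel
    kept = []
    for r, c in seeds:
        if projected_label_map[r, c] != 0 and distance_to_edge[r, c] > min_distance:
            kept.append((r, c))
    return kept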
In the next step 306, the process takes the generated seeds and grows seeded regions in an iterative fashion. At each iteration, all those pixels that border the growing regions are considered. The pixel that is most similar to a region that it borders is appended to that region. Alternatively, the similarity of a limited number of these pixels may be considered at each iteration, thus speeding up the processing. The preferred process for growing the seeded regions is described in more detail in the section herein entitled "2.0.2 Process for Growing Seeded Regions". This process continues until all pixels have been allocated to an associated region, resulting in a segmented image. The output of the seeded region growing is a set of homogeneous regions, wherein the number of regions obtained is equal to the number of seeds. During this step, the regions will continue to grow until they are bounded on all sides by other growing/grown regions. Also, some regions will grow more at the expense of others. For instance, there will tend to be large regions in the homogeneous areas and small regions in the non-homogeneous areas.
Furthermore, the contrast for each region is re-evaluated while the region grows. In this way, the preferred method is able to segment the image. After completion of this step 306, the processing terminates and continues to step 208.
As mentioned above, the latter growing seeded region process is also used in step 218. This region growing process can be applied to the embodiment where all regions of the image are selected for tracking. However, in the embodiments where only some regions are selected for tracking, this region growing process should be used with one modification. As mentioned above, the region growing process will continue to grow until the regions are bounded on all sides by other growing/grown regions. In the embodiment where only some regions are selected for tracking, the region growing process will continue to grow beyond the selected regions. Thus the modified region growing process requires means for stopping the process growing beyond the selected regions. In this modified step 218, only those pixels having some motion that border the growing regions are considered for adding to the growing regions. The region growing process will stop when there are no more pixels having motion that border the growing regions.
In this step 218, only those pixels having some motion that border the growing regions are considered. In this way, regions will continue to grow until they are bounded on all sides by other growing/grown regions or until they reach a region having little or no motion.
2.0.1 Process for Selecting Seeds
The seed selection process 304 is a simple and fast quad-tree approach, which distributes the seeds over the image, but it allocates fewer seeds in homogeneous areas of the image. The seed selection process addresses colour data by preferably processing the luminance image, i.e. a grey-scale image. The homogeneity is measured by a simple contrast criterion: the difference between the minimum and maximum luminance.
The following pseudocode is illustrative of the method of seeding an image for use in Fig 3.
Pseudocode SEED
RECTANGLE        A rectangle, given by (x, y) and (width, height)
RECTANGLE_LIST   FIFO list of rectangles
SEED_LIST        List of pixels (seeds)
CONTRAST         Difference between min and max luminance
HI_MIN_SIZE      Maximum block size for contrast assessment
LO_MIN_SIZE      Minimum block size for block splitting (LO_MIN_SIZE < HI_MIN_SIZE)
HI_DENSITY, LO_DENSITY       Densities for pixel spreading
HI_THRESHOLD, LO_THRESHOLD   Contrast thresholds

Initialise RECTANGLE_LIST with the rectangle corresponding to the whole image.
while RECTANGLE_LIST is not empty
    remove first element from RECTANGLE_LIST and keep it in RECTANGLE
    assess CONTRAST for area of the image corresponding to RECTANGLE
    if CONTRAST < LO_THRESHOLD
        add the pixel corresponding to the center of RECTANGLE to SEED_LIST
        continue loop
    if RECTANGLE size > HI_MIN_SIZE
        split RECTANGLE into four and add the new rectangles in RECTANGLE_LIST
        continue loop
    if CONTRAST < HI_THRESHOLD
        spread pixels over RECTANGLE with LO_DENSITY and add them to SEED_LIST
        continue loop
    if RECTANGLE size > LO_MIN_SIZE
        split RECTANGLE into four and add the new rectangles in RECTANGLE_LIST
        continue loop
    spread pixels over RECTANGLE with HI_DENSITY and add them to SEED_LIST
endwhile

Turning now to Fig. 4A, there is shown a flow chart of the last mentioned pseudocode called SEED. The seed selection processing commences at step 402, where rectangle co-ordinates corresponding to the entire image are stored in a FIFO buffer called RECTANGLE_LIST. After step 402, the processing continues at decision block 404, where a check is made whether the RECTANGLE_LIST is empty. If the decision block returns true (yes), then the seed selection process terminates and the method proceeds to step 306. Otherwise, the processing continues at step 408, where the first element in RECTANGLE_LIST is removed and stored in the variable RECTANGLE.
In the decision block 420, a check is made whether the determined contrast is less I than a predetermined high threshold value called HITHRESHOLD. If the decision block 420 returns true, then processing continues at step 422, where a number of pixels from the rectangle are added to the SEED_LIST as seeds. These newly added seed pixels are evenly distributed throughout the current rectangle in such a manner that there is a low density of such seed pixels in the rectangle. In this way, a low density seeding is achieved for sub-images of a small and medium size (See Table After step 422, the processing continues at step 404. If, however, the decision block 420 returns false then the processing continues at decision block 424.
In the decision block 424, a check is made whether the size of the rectangle is greater than a predetermined minimum size called LOMIN_SIZE. If the decision block returns false, the processing continues at step 426, where a number of pixels from the 536254.doc rectangle are added to the SEED_LIST as seeds. These newly added seed pixels are evenly distributed throughout the current rectangle in such a manner that there is a high density of such seed pixels in the rectangle. In this way, a high density seeding is achieved for sub-images of a small size (See Table If, however, the decision block 424 returns true then the processing continues at step 428. In step 428, the rectangle is divided into four sub-rectangles in the manner of a quadtree approach. In this way, corresponding sub-images of a medium size and high contrast are split (see Table A).
TABLE A Allocation of seeds as a function of: the contrast of current sub-image, and the size of the rectangle corresponding to the current sub-image as compared to the size of the rectangle corresponding to the entire image r .r Small size Medium size Large size Low contrast Center of rectangle is Center of rectangle Center of rectangle a seed is a seed is a seed Med contrast Low density seeding Low density seeding Split rectangle High contrast High density seeding Split rectangle Split rectangle Turning now to Table A, it can be seen that the split rectangular regions of the image of any size whose pixels have small variance in luminance (low contrast) are seeded in their center. In addition, split rectangular regions of a small or medium size whose pixels have a medium variance in luminance (medium contrast) are seeded evenly throughout these regions in a low-density manner. Furthermore, rectangular regions of a small size whose pixels have a high variance in luminance (high contrast) are seeded evenly throughout the region in a high-density manner. On the other hand, rectangular regions of medium size and high contrast are split into four rectangular sub-regions. In addition, rectangular regions of a large size and of a medium or high contrast are also split into rectangular sub-regions. This splitting continues in a quadtree manner until the split sub-region(s) meets the abovementioned relevant size and contrast requirements for seeding. As will be apparent, these rectangular sub-images can be in some circumstances equilateral rectangular sub-images.
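For illustration (this sketch is not from the patent; the thresholds, block sizes and seeding densities are arbitrary example values), the SEED pseudocode and Table A could be realised along the following lines:

import numpy as np
from collections import deque

def quadtree_seeds(luminance, lo_threshold=20, hi_threshold=60,
                   lo_min_size=8, hi_min_size=64, lo_step=8, hi_step=3):
    h, w = luminance.shape
    seeds = []
    rectangles = deque([(0, 0, w, h)])      # (x, y, width, height), FIFO as in the pseudocode

    def split(x, y, rw, rh):
        hw, hh = max(1, rw // 2), max(1, rh // 2)
        for nx, ny, nw, nh in ((x, y, hw, hh), (x + hw, y, rw - hw, hh),
                               (x, y + hh, hw, rh - hh), (x + hw, y + hh, rw - hw, rh - hh)):
            if nw > 0 and nh > 0:
                rectangles.append((nx, ny, nw, nh))

    def spread(x, y, rw, rh, step):
        for r in range(y + step // 2, y + rh, step):
            for c in range(x + step // 2, x + rw, step):
                seeds.append((r, c))

    while rectangles:
        x, y, rw, rh = rectangles.popleft()
        block = luminance[y:y + rh, x:x + rw]
        contrast = int(block.max()) - int(block.min())   # min/max luminance difference
        if contrast < lo_threshold:                      # low contrast: one centre seed
            seeds.append((y + rh // 2, x + rw // 2))
        elif max(rw, rh) > hi_min_size:                  # large block: split in four
            split(x, y, rw, rh)
        elif contrast < hi_threshold:                    # medium contrast: sparse seeding
            spread(x, y, rw, rh, lo_step)
        elif max(rw, rh) > lo_min_size:                  # high contrast, medium block: split
            split(x, y, rw, rh)
        else:                                            # high contrast, small block: dense seeding
            spread(x, y, rw, rh, hi_step)
    return seeds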
Turning now to Fig. 4B, there is shown an example of a seeded image 452 seeded according to the preferred process 304. For simplicity's sake, the image itself is not shown. Initially, during the seeding process, the original image is input and its contrast is determined. As the original image in this example has a medium contrast and is of a large size (as compared to itself), the image is split into four rectangles 454, 456, 458, and 460.
The process then considers each of these rectangles 454, 456, 458, and 460. As the images, in this example, within rectangles 454, 456, and 458 are of low contrast and of a large size as compared to the original image, the centers of these rectangles are taken as seeds 462. However, as the image in rectangle 460 is of a high contrast and large size, the rectangle is split further into four sub-rectangles 464, 466, 468, and 470. The process then considers each sub-rectangle 464, 466, 468, and 470. Rectangles 464 and 466 are each further split into four sub-rectangles, as they are both of a high contrast and medium size. As rectangle 468 is of a medium contrast and size, the rectangle is allocated seeds 472 in a low-density manner. In addition, as rectangle 470 is of a low contrast and medium size, the center of this rectangle is taken as a seed 474. The seeding processing continues in a similar manner, until all split rectangles have been seeded. In this particular example, the split rectangles are center seeded 476, 478, 480, 482, 484, and 486 and the remaining split rectangles are allocated seeds 488 and 490 in a high-density manner. At the completion of the seeding process, a list of all the pixel locations of seeds 462, 472, 474, 476, 478, 480, 482, 484, 486, 488, and 490 is established.
The preferred seeding process is a simple and fast approach, which distributes the seeds over the entire image, while allocating fewer seeds in homogeneous areas of the image. Furthermore, there is a high probability that at least one seed will be allocated to each homogeneous region of the image.
2.0.2 Process for Growing Seeded Regions
The seeded region growing process 306 takes a set of seeds, individual pixels or small groups of connected pixels, generated by step 304, as input. The preferred process 306 grows the seed regions in an iterative fashion. At each iteration, all those pixels that border the growing regions are considered. The pixel which is most similar to a region that it borders is appended to that region. In the preferred process, all the regions can be grown simultaneously.
The process evolves inductively from the seeds, namely, the initial state of the sets or regions A_1, A_2, ..., A_n. Each step of the iterative process involves the addition of one pixel to one of the above sets. We now consider the state of the sets after m steps.
Let T be the set of all as-yet unallocated pixels which border at least one of the regions.
T = { x ∉ ∪_(i=1..n) A_i : N(x) ∩ ( ∪_(i=1..n) A_i ) ≠ ∅ }    (5)
where N(x) is the set of immediate neighbours of the pixel x. For each candidate pixel x, an index i(x) is found, which corresponds to the adjacent region where x is most likely to be included, and a criterion δ(x) is computed; δ(x) measures how good a candidate x is for region A_i. If, for x ∈ T, we have that N(x) meets just one of the A_i, then we define i(x) ∈ {1, ..., n} to be that index such that N(x) ∩ A_i(x) ≠ ∅ and define δ(x) to be a measure of how different x is from the region it adjoins. The simplest definition for δ(x) is
δ(x) = | g(x) − mean_(y ∈ A_i(x)) [ g(y) ] |    (6)
where g(x) is the grey value of the image point x adjoining region A_i(x) and g(y) is the grey value of the image point y within region A_i(x). The extension of this criterion to colour images requires the choice of a suitable metric in colour space. For example, the absolute value of the Euclidean distance between the colour of pixel x and the mean colour of region A_i(x) could be used. Alternatively, the segmentation could be performed on the luminance image as for grey-scale images.
If N(x) meets two or more of the A_i, we take i(x) to be a value of i such that N(x) meets A_i and δ(x) is minimised.
Then, a candidate pixel z ∈ T is chosen such that
δ(z) = min_(x ∈ T) δ(x)    (7)
and z is appended to A_i(z).
This completes step m+1. The process is repeated until all pixels have been allocated. The process commences with each A_i being just one of the seeds. Equations (6) and (7) ensure that the final segmentation is into regions as homogeneous as possible given the connectivity constraint.
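The growing rule of equations (5) to (7) can be illustrated with the following minimal sketch for a grey-level image (not part of the patent text; names are illustrative). It grows all regions simultaneously, one pixel per step, and re-evaluates the running mean of the appended region, but it omits the candidate-list speed-ups described next.

import numpy as np

def seeded_region_growing(grey, seeds):
    # grey: 2-D grey-level image; seeds: list of (row, col) seed positions.
    h, w = grey.shape
    labels = np.full((h, w), -1, dtype=int)         # -1 marks unallocated pixels
    sums = np.zeros(len(seeds))
    counts = np.zeros(len(seeds))
    candidates = {}                                 # (row, col) -> adjoining region index

    def neighbours(r, c):
        return [(r + dr, c + dc) for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                if 0 <= r + dr < h and 0 <= c + dc < w]

    for i, (r, c) in enumerate(seeds):              # each region starts as one seed
        labels[r, c] = i
        sums[i] += grey[r, c]
        counts[i] += 1
    for i, (r, c) in enumerate(seeds):
        for nr, nc in neighbours(r, c):
            if labels[nr, nc] == -1:
                candidates[(nr, nc)] = i
    while candidates:
        # delta(x): difference to the mean of the adjoining region (equation (6));
        # pick the candidate with the minimum delta (equation (7)).
        (r, c), region = min(
            candidates.items(),
            key=lambda kv: abs(float(grey[kv[0]]) - sums[kv[1]] / counts[kv[1]]))
        del candidates[(r, c)]
        labels[r, c] = region
        sums[region] += grey[r, c]                  # update the region mean
        counts[region] += 1
        for nr, nc in neighbours(r, c):
            if labels[nr, nc] == -1 and (nr, nc) not in candidates:
                candidates[(nr, nc)] = region
    return labels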
In the preferred process 306, δ(x) is updated only for a limited number of the candidate pixels at each step of the iteration. Consequently, as the colour of the limited number of candidate pixels is always compared with the updated mean colour of the neighbouring regions, the quality of the segmentation is comparable with Adams et al.
Furthermore, as the process does not consider all the candidate pixels, especially when the list is relatively long, the speed of the region growing process can be significantly increased without reducing the quality of the final segmentation. The region growing process 306 uses two ways, either alone or in combination, to avoid scanning the whole candidate pixel list.
The first one is to use a variable step when scanning the candidate pixel list, the value of the step depending on the size of the candidate list: the longer the list, the bigger the step. Another advantage of this method is a better control on the processing time (if there is a linear relation between the size of the candidate list and the step).
The second process consists in skipping a whole part of the list by choosing the first candidate pixel x such that δ(x) is smaller than δ(z), z being the pixel selected at the previous step. If such a pixel is found, then the scanning of the list is interrupted prematurely; otherwise, the whole list is scanned to find the candidate pixel with the minimum δ(x) value, and the threshold is tuned to that value.
As successive best candidates often belong to the same region, inserting the new candidates (neighbours of the selected pixel) at the beginning of the list can reduce the computation time. However, they are not considered at the first step after their insertion in order not to introduce a bias in favour of a particular region.
The following pseudocode is illustrative of the preferred method of seeded region growing for use in Fig 3.
Pseudo-code REGION GROWING
SEED_LIST        List of seeds (pixels)
CANDIDATE_LIST   List of pixels which are neighbouring at least one region
REGION[]         Array used to store the growing regions, i.e. the lists of classified pixels
MEAN[]           Array containing the mean gray value or luminance/colour of the regions
DELTA            Function measuring the difference between a pixel and a neighbouring region
MIN              Variable used for storing the minimum DELTA
CHOSEN_PIX       Chosen pixel
CHOSEN_REG       Index of the chosen region
DYN_THRESHOLD    Dynamic threshold to allow early selection of a candidate pixel
DYN_STEP         Dynamic step for the scan loop of CANDIDATE_LIST
DYN_START        Dynamic starting position for the scan loop

Initialise each REGION with the corresponding seed of SEED_LIST and initialise CANDIDATE_LIST with the neighbours of each seed;
DYN_THRESHOLD = 0;
DYN_START = 0;
while CANDIDATE_LIST is not empty
    Set MIN = 256;
    Set DYN_STEP depending on the size of CANDIDATE_LIST, e.g. DYN_STEP = size of CANDIDATE_LIST / 300;
    for i = DYN_START to size of CANDIDATE_LIST, i = i + DYN_STEP
        CURRENT_PIX = pixel i in CANDIDATE_LIST
        if (DELTA(CURRENT_PIX) < MIN)
            MIN = DELTA(CURRENT_PIX)
            CHOSEN_PIX = CURRENT_PIX
            CHOSEN_REG = index of the chosen region
            stop if MIN < DYN_THRESHOLD
    endfor
    Put each unclassified neighbour of CHOSEN_PIX in CANDIDATE_LIST and set DYN_START as the number of new pixels in CANDIDATE_LIST;
    Put CHOSEN_PIX in REGION[CHOSEN_REG];
    Update MEAN[CHOSEN_REG];
    Remove CHOSEN_PIX from CANDIDATE_LIST;
    DYN_THRESHOLD = max(DYN_THRESHOLD, MIN)
endwhile

Turning now to Fig. 5A, there is shown a flow chart of the last mentioned pseudocode named REGION GROWING for growing the segmented regions. This pseudocode can be modified for use also in step 218. Specifically, steps 504 and 526 are modified so that only those neighbouring pixels that have some motion will be added to the candidate list. This can be determined by thresholding the motion information obtained during steps 208 or 212. For instance, only those pixels that border the growing regions which have a motion greater than a predetermined motion will be added to the list. In this way the region growing process will stop when there are no more moving pixels bordering the regions. This can also be achieved by modifying the pseudocode to stop once the following condition is satisfied: namely, when the luminance of each of the bordering pixels is greater by a predetermined threshold than the luminance of their respective regions.
The processing commences at step 502, where the seed list is stored in an array called REGION[]. This array REGION[] is used to store the growing regions, ie. the lists of classified pixels. Initially, the seeds of the image are denoted as the initial regions for growing.
In the next step 504, the neighbouring pixels of each seed are determined and stored in a list called CANDIDATE_LIST. In the next step 506, a variable DYN_THRESHOLD is set to zero. This variable stores a dynamic threshold to allow early selection of a candidate pixel in a manner which will be explained below. After step 506, the processing continues at decision box 508, in which a check is made whether the CANDIDATE_LIST is empty. The CANDIDATE_LIST will be empty once there are no more pixels neighbouring the growing regions. If the decision box 508 returns true (i.e. yes), the region growing process 306 is completed and the process continues to step 208.
If the decision block 508 returns false then the processing continues at step 512.
In step 512, the variable loop counter i is set to zero, the variable MIN is set to 256, and the variable DYN_STEP is set to the current size of the CANDIDATE_LIST divided by 300. The variable MIN is used for storing the minimum delta value of the previous iteration of the loop 508 to 536. The variable DYN_STEP is used for storing a variable step value used for scanning the CANDIDATE_LIST. This variable step value is used for determining the delta values for a limited number of candidates in the CANDIDATE_LIST. Specifically, only those candidates spaced apart by a value equal to the step value will be considered for allocation to the region. After step 512, the processing continues at decision box 514, where a check is made whether the loop counter is less than the size of the CANDIDATE_LIST.
If the decision block 514 returns false, the processing continues at step 526, which is described below. If, however, the decision box 514 returns true, then the region growing process has not considered all of the limited number of neighbouring pixels. In this situation the processing continues at step 516, where the variable CURRENT_PIX is set to pixel i in the CANDIDATE_LIST. This step 516 sets the next candidate pixel to be considered. It should be noted that this pixel is spaced apart from the previous pixel considered by a distance equal to the value stored in the variable DYN_STEP. After step 516, the processing continues at the decision box 518.
In decision box 518, a comparison is made whether the difference between the luminance value of the current candidate pixel and the mean of the neighbouring region is less than MIN. If the decision box 518 returns false, then the processing continues at step 520. In step 520, the loop counter i is incremented by the step value stored in DYN_STEP. If the decision box 518 returns true, then the processing continues at step 522. In step 522, the MIN variable is now set to the minimum delta value determined for the current pixel. In addition, the variable CHOSEN_PIX is set to the selected current pixel and the variable CHOSEN_REG is set to the index of the current region. After step 522, the processing continues at step 524.
In decision block 524, a comparison is made whether the current minimum delta value stored in MIN is less than the current value stored in DYN_THRESHOLD. If the decision block 524 returns false, then the processing continues at step 520, where the loop counter i is incremented by the step value stored in DYN_STEP. Otherwise, if the decision block 524 returns true, then the processing continues at step 526. In step 526, each pixel neighbouring the current pixel stored in CHOSEN_PIX, and not previously stored in the CANDIDATE_LIST, is now added to the CANDIDATE_LIST. During step 528, the current pixel stored in CHOSEN_PIX is added to the region stored in REGION[CHOSEN_REG]. After step 528, the processing continues at step 530, where the mean stored in MEAN[CHOSEN_REG] is updated for the newly appended pixel. At the next step 534, the current pixel stored in CHOSEN_PIX is removed from the candidates in CANDIDATE_LIST. The processing then continues at step 536, where the variable DYN_THRESHOLD is reset to the maximum of the current values stored in MIN and DYN_THRESHOLD. After which, the processing returns to decision block 508. The process terminates when the CANDIDATE_LIST is empty.
The preferred growing process of Fig. 5A continues until all candidate pixels have been allocated to an associated region, resulting in a segmented image.
Fig. 5B illustrates a simplified example of the preferred region growing process.
For simplicity's sake, this example shows only one region of the region growing process, whereas the preferred method allows the simultaneous growing of multiple regions. An initial region 550 consisting of a plurality of pixels (not shown) is surrounded by a number of candidate pixels 552 to be added to the region 550. Firstly, the process calculates the mean of the luminance values of the pixels of the region 550. Then the process determines the difference between this mean and the luminance value of a limited number of candidate pixels 552 in turn. The process then determines the minimum difference of these differences and allocates the candidate pixel associated with this minimum difference to the region 550. If, however, the luminance difference value of any candidate pixel is less than the minimum difference value in the previous iteration, then the process instead allocates this candidate pixel to the region 550 and then proceeds to the next iteration. In the next iteration, the mean value of the luminance values of the pixels of the grown region 550 is then recalculated and the process continues.
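The candidate-list speed-ups of the REGION GROWING pseudocode (dynamic step and dynamic threshold) might be sketched as follows; the divisor of 300 follows the pseudocode above, while the function and parameter names are illustrative assumptions:

def scan_candidates(candidate_list, delta, dyn_threshold, dyn_start=0):
    # Step through only part of CANDIDATE_LIST (the longer the list, the larger
    # the step) and stop early as soon as a candidate beats the dynamic threshold.
    # delta(pixel) returns the difference between the pixel and the region it borders.
    dyn_step = max(1, len(candidate_list) // 300)
    best_delta, chosen = 256, None
    for i in range(dyn_start, len(candidate_list), dyn_step):
        pixel = candidate_list[i]
        d = delta(pixel)
        if d < best_delta:
            best_delta, chosen = d, pixel
            if best_delta < dyn_threshold:        # early selection of a candidate
                break
    # The caller appends the chosen pixel to its region, updates the region mean and
    # raises the threshold: dyn_threshold = max(dyn_threshold, best_delta).
    return chosen, best_delta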
The aforementioned preferred method(s) comprise a particular control flow. There are many other variants of the preferred method(s) which use different control flows without departing from the spirit or scope of the invention. For example, instead of using a do { } while (END) loop as currently shown in Fig. 2, other loop arrangements may be used. Furthermore, one or more of the steps of the preferred method(s) may be performed in parallel rather than sequentially.
The method of segmenting and tracking objects is preferably practiced using a conventional general-purpose computer system 100, such as that shown in Fig. 1, wherein the processes of Figs. 2, 3, 4A and 5A may be implemented as software, such as an application program executing within the computer system 100. In particular, the steps of the segmenting and tracking method are effected by coding instructions in the software that are carried out by the computer. The software may be divided into two separate parts; one part for carrying out the segmenting and tracking methods; and another part to manage the user interface between the latter and the user. The software may be stored in a computer readable medium, including the storage devices described below, for example.
The software is loaded into the computer from the computer readable medium, and then executed by the computer. The use of the computer readable medium in the computer preferably effects an advantageous apparatus for segmenting and tracking in accordance with the embodiments of the invention.
The computer system 100 comprises a computer module 101, input devices such as a keyboard 102 and mouse 103, and output devices including a printer 115 and a display device 114. A Modulator-Demodulator (Modem) transceiver device 116 is used by the computer module 101 for communicating to and from a communications network 120, for example connectable via a telephone line 121 or other functional medium. The modem 116 can be used to obtain access to the Internet, and other network systems, such as a Local Area Network (LAN) or a Wide Area Network (WAN).
The computer module 101 typically includes at least one processor unit 105, a memory unit 106, for example formed from semiconductor random access memory (RAM) and read only memory (ROM), input/output (I/O) interfaces including a video interface 107, and an I/O interface 113 for the keyboard 102 and mouse 103 and optionally a joystick (not illustrated), and an interface 108 for the modem 116. A storage device 109 is provided and typically includes a hard disk drive 110 and a floppy disk drive 111. A magnetic tape drive (not illustrated) may also be used. A CD-ROM drive 112 is typically provided as a non-volatile source of data. The components 105 to 113 of the computer module 101 typically communicate via an interconnected bus 104 and in a manner which results in a conventional mode of operation of the computer system 100 known to those in the relevant art. Examples of computers on which the embodiments can be practised include IBM-PCs and compatibles, Sun Sparcstations or alike computer systems evolved therefrom.
Typically, the application program of the preferred embodiment is resident on the hard disk drive 110 and read and controlled in its execution by the processor 105.
Intermediate storage of the program and any data fetched from the network 120 may be accomplished using the semiconductor memory 106, possibly in concert with the hard disk drive 110. In some instances, the application program may be supplied to the user encoded on a CD-ROM or floppy disk and read via the corresponding drive 112 or 111, or alternatively may be read by the user from the network 120 via the modem device 116.
Still further, the software can also be loaded into the computer system 100 from other computer readable media including magnetic tape, a ROM or integrated circuit, a magneto-optical disk, a radio or infra-red transmission channel between the computer module 101 and another device, a computer readable card such as a PCMCIA card, and the Internet and Intranets including email transmissions and information recorded on websites and the like. The foregoing is merely exemplary of relevant computer readable media. Other computer readable media may be used without departing from the scope and spirit of the invention.
The method of segmenting and tracking may alternatively be implemented in dedicated hardware such as one or more integrated circuits performing the functions or sub functions of the method. Such dedicated hardware may include graphic processors, digital signal processors, or one or more microprocessors and associated memories.
Industrial Applicability

It is apparent from the above that the embodiment(s) of the invention are applicable to the video and related industries.
The foregoing describes some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiment(s) being illustrative and not restrictive.
In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including" and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises", have corresponding meanings.
Claims (6)
The claims defining the invention are as follows:

1. A method of tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the method performing the following steps for each two adjacent frames of the video sequence: segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; estimating the motion of said segmented regions of the current frame; projecting the segmented regions of the current frame into a next frame using said estimated motion; segmenting said next frame of the video sequence into a number of homogeneous regions; and establishing a correspondence between the segmented regions of the current and a next frame using the projected regions, wherein said segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said segmented regions their motion; wherein either one or both of said steps of segmenting said current and next frames comprises the following sub-steps: distributing seeds in said frame as a function of a first property of said pixels within the said frame, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; and growing a number of regions from said seeds, wherein a number of pixels that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and said growing step is repeated until no pixels bordering the growing regions are available.

2. A method as claimed in claim 1, wherein said motion of said objects is partly or wholly due to motion of a moving camera.

3. A method as claimed in claim 1 or 2, wherein said step of seeding said regions comprises: dividing the frame into a plurality of areas; allocating, for each divided area, one or more seeds as a function of said first property within the divided area and the size of the divided area as compared to the size of the frame.

4. A method as claimed in claim 1, 2 or 3, wherein said growing step comprises the sub-steps of: generating a list of pixels that border the growing regions; scanning a said number of said list of pixels in a predetermined manner; determining a value, for each said scanned pixel, indicative of the similarity of a said second property of said scanned pixel and the corresponding said second property of a growing region that said scanned pixel borders; selecting a pixel that has a minimum said value; appending said selected pixel to said region it borders; updating the corresponding said second property of the appended region; and repeating the sub-steps of the growing step until no pixels bordering the growing regions are available.

5. A method as claimed in any one of the preceding claims, wherein said step of seeding distributes seeds in said projected regions of said next frame.

6. A method as claimed in claim 5, wherein said step of seeding comprises: discarding those seeds in said projected regions falling within a predetermined distance of boundaries of the projected regions.

7. A method as claimed in claim 5, wherein said method further comprises: merging those said grown regions of the frame, which have been grown from seeds which belong to the same said projected regions.
8. A method of tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the method performing the following steps for each two adjacent frames of the video sequence: segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; estimating the motion of said segmented regions of the current frame; selecting any one or more of those said segmented regions of the current frame having motion; projecting the selected segmented regions of the current frame into a next frame using said estimated motion; distributing seeds in said projected regions of said next frame as a function of a first property of said pixels within those projected regions, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; growing a number of regions from said seeds, wherein a number of pixels having motion that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and said growing step is repeated until there are no pixels bordering the growing regions having motion; and establishing a correspondence between the selected segmented regions of the current frame and the segmented regions of the next frame using the projected regions, wherein said selected segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said selected segmented regions their motion.

9. A method as claimed in claim 8, wherein said motion of said objects is partly or wholly due to motion of a moving camera.

10. A method as claimed in claim 8 or 9, wherein said step of seeding said projected regions comprises: dividing the next frame into a plurality of areas; allocating, for each divided area, one or more seeds as a function of said first property within the divided area and the size of the divided area as compared to the size of the current frame; and retaining only those allocated seeds that belong to the projected regions.

11. A method as claimed in claim 8, 9, or 10, wherein said step of seeding said projected regions comprises: discarding those seeds in said projected regions falling within a predetermined distance of boundaries of the projected regions.
12. A method as claimed in any one of the preceding claims 8 to 11, wherein said method further comprises merging those said grown regions of the next frame, which have been grown from seeds which belong to the same said projected regions.
13. A method as claimed in any one of the preceding claims 8 to 12, wherein said selecting step comprises selecting any one or more of those said segmented regions of the current frame having a motion greater than a predetermined motion.

14. A method as claimed in any one of the preceding claims 8 to 12, wherein said selecting step comprises the sub-step: selecting any one or more of said segmented regions of the current frame having a motion greater than a predetermined motion and which is of interest to a user.

15. A method as claimed in any one of the preceding claims 8 to 12, wherein said selecting step comprises the sub-step: selecting any one or more of said segmented regions of the current frame within a chosen bounding box having a motion greater than a predetermined motion.
16. A method as claimed in any one or more of the preceding claims 8 to 15, wherein said growing step is repeated until there are no pixels bordering the growing regions having a motion greater than a predetermined motion.
17. A method as claimed in any one or more of the preceding claims 8 to 15, wherein said growing step is repeated until there are no pixels bordering the growing regions having a said second property which differs from the second property of the region they border by less than a predetermined threshold.
18. A method as claimed in claim 8, wherein said growing step comprises the sub-steps of: generating a list of pixels that border the growing regions; scanning a said number of said list of pixels in a predetermined manner; determining a value, for each said scanned pixel, indicative of the similarity of a said second property of said scanned pixel and the corresponding said second property of a growing region that said scanned pixel borders; selecting a pixel that has a minimum said value; appending said selected pixel to said region it borders; updating the corresponding said second property of the appended region; and repeating the sub-steps of the growing step until no pixels bordering the growing regions having a motion greater than a predetermined threshold.

19. Apparatus for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the apparatus comprising: means for segmenting each frame of the video sequence into a number of homogeneous regions of pixels, wherein said segmenting means comprises: means for distributing, for each said frame, seeds as a function of a first property of said pixels within the frame, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; and means for growing, for each said frame, a number of regions from said seeds, wherein a number of pixels that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until no pixels bordering the growing regions are available; means for estimating the motion of said segmented regions of a current frame; means for projecting the segmented regions of the current frame into a next adjacent frame using said estimated motion; and means for establishing a correspondence between the segmented regions of the current and the next adjacent frames using the projected regions, wherein said segmented regions of the current frame and corresponding said segmented regions of the next adjacent frame constitute said objects and the estimated motion of said segmented regions their motion.

20. Apparatus for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the apparatus comprising: means for segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; means for estimating the motion of said segmented regions of the current frame; means for selecting any one or more of those said segmented regions of the current frame having motion; means for projecting the selected segmented regions of the current frame into said next frame using said estimated motion; means for distributing seeds in said projected regions of a next frame as a function of a first property of said pixels within those projected regions, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; means for growing a number of regions from said seeds, wherein a number of pixels having motion that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until there are no pixels bordering the growing regions having motion; and means for establishing a correspondence between the selected segmented regions of the current frame and the segmented regions of the next frame using the projected regions, wherein said selected segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said selected segmented regions their motion.

21. A computer readable medium comprising a computer program for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the computer program comprising: means for segmenting each frame of the video sequence into a number of homogeneous regions of pixels, wherein said segmenting means comprises: means for distributing, for each said frame, seeds as a function of a first property of said pixels within the frame, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; and means for growing, for each said frame, a number of regions from said seeds, wherein a number of pixels that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until no pixels bordering the growing regions are available; means for estimating the motion of said segmented regions of a current frame; means for projecting the segmented regions of the current frame into a next adjacent frame using said estimated motion; and means for establishing a correspondence between the segmented regions of the current and the next adjacent frames using the projected regions, wherein said segmented regions of the current frame and corresponding said segmented regions of the next adjacent frame constitute said objects and the estimated motion of said segmented regions their motion.
22. A computer readable medium comprising a computer program for tracking moving objects from a video sequence, said video sequence comprising a plurality of frames each comprising a plurality of pixels, the computer program comprising: means for segmenting a current frame of the video sequence into a number of homogeneous regions of pixels; means for estimating the motion of said segmented regions of the current frame; means for selecting any one or more of those said segmented regions of the current frame having motion; means for projecting the selected segmented regions of the current frame into said next frame using said estimated motion; means for distributing seeds in said projected regions of a next frame as a function of a first property of said pixels within those projected regions, wherein fewer seeds are allocated to those regions having pixels homogeneous in said first property; means for growing a number of regions from said seeds, wherein a number of pixels having motion that border said growing regions are considered and that pixel of said number that is most similar in a second property to a region it borders is appended to that region and the said second property of the appended region is updated and the growing operation is repeated until there are no pixels bordering the growing regions having motion; and means for establishing a correspondence between the selected segmented regions of the current frame and the segmented regions of the next frame using the projected regions, wherein said selected segmented regions of the current frame and corresponding said segmented regions of the next frame constitute said objects and the estimated motion of said selected segmented regions their motion.

23. A method of tracking moving objects from a video sequence, the method substantially as described herein with reference to Figs. 2, 3, 4A, and 5A of the accompanying drawings.

24. Apparatus for tracking moving objects from a video sequence, the apparatus substantially as described herein with reference to Figs. 1, 2, 3, 4A, and 5A of the accompanying drawings.

25. A computer readable medium comprising a computer program for tracking moving objects from a video sequence, the apparatus substantially as described herein with reference to Figs. 1, 2, 3, 4A, and 5A of the accompanying drawings.

DATED this eleventh Day of June, 2003
Canon Kabushiki Kaisha
Patent Attorneys for the Applicant
SPRUSON FERGUSON
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU28028/01A AU763919B2 (en) | 2000-03-16 | 2001-03-15 | Tracking objects from video sequences |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AUPQ6281 | 2000-03-16 | ||
AUPQ6281A AUPQ628100A0 (en) | 2000-03-16 | 2000-03-16 | Tracking objects from video sequences |
AU28028/01A AU763919B2 (en) | 2000-03-16 | 2001-03-15 | Tracking objects from video sequences |
Publications (2)
Publication Number | Publication Date |
---|---|
AU2802801A AU2802801A (en) | 2001-09-20 |
AU763919B2 true AU763919B2 (en) | 2003-08-07 |
Family
ID=25620492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
AU28028/01A Ceased AU763919B2 (en) | 2000-03-16 | 2001-03-15 | Tracking objects from video sequences |
Country Status (1)
Country | Link |
---|---|
AU (1) | AU763919B2 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5748761A (en) * | 1995-04-08 | 1998-05-05 | Daewoo Electronics Co., Ltd. | Method for segmenting and estimating a moving object motion |
WO2000016563A1 (en) * | 1998-09-10 | 2000-03-23 | Microsoft Corporation | Tracking semantic objects in vector image sequences |
AU5261899A (en) * | 1998-10-02 | 2000-04-13 | Canon Kabushiki Kaisha | Segmenting moving objects and determining their motion |
-
2001
- 2001-03-15 AU AU28028/01A patent/AU763919B2/en not_active Ceased
Also Published As
Publication number | Publication date |
---|---|
AU2802801A (en) | 2001-09-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6766037B1 (en) | Segmenting moving objects and determining their motion | |
Berjón et al. | Real-time nonparametric background subtraction with tracking-based foreground update | |
KR100459893B1 (en) | Method and apparatus for color-based object tracking in video sequences | |
US7031517B1 (en) | Method and apparatus for segmenting images | |
US6819796B2 (en) | Method of and apparatus for segmenting a pixellated image | |
US7164718B2 (en) | Method for segmenting a video image into elementary objects | |
US8611728B2 (en) | Video matting based on foreground-background constraint propagation | |
JP2003058894A (en) | Method and device for segmenting pixeled image | |
KR101650702B1 (en) | Creation of depth maps from images | |
JP5036084B2 (en) | Video processing apparatus, video processing method, and program | |
US7974470B2 (en) | Method and apparatus for processing an image | |
CN100490533C (en) | System and method for performing segmentation-based enhancements of a video image | |
KR20080040639A (en) | Video object cut and paste | |
JP2005513656A (en) | Method for identifying moving objects in a video using volume growth and change detection masks | |
CN112102929A (en) | Medical image labeling method and device, storage medium and electronic equipment | |
Su et al. | A three-stage approach to shadow field estimation from partial boundary information | |
Toklu et al. | Simultaneous alpha map generation and 2-D mesh tracking for multimedia applications | |
Salgado et al. | Efficient image segmentation for region-based motion estimation and compensation | |
AU763919B2 (en) | Tracking objects from video sequences | |
AU735807B2 (en) | Segmenting moving objects and determining their motion | |
JP2006525582A (en) | Fine adjustment of area division | |
AU746669B2 (en) | Method and apparatus for segmenting images | |
Zhi et al. | Interactive video object segmentation: fast seeded region merging approach | |
US20240173622A1 (en) | In-stream object insertion | |
JPH10255036A (en) | Dynamic binarizing method for gray level image |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
DA3 | Amendments made section 104 |
Free format text: THE NATURE OF THE AMENDMENT IS: SUBSTITUTE PATENT REQUEST REGARDING ASSOCIATED DETAILS |
|
FGA | Letters patent sealed or granted (standard patent) |