WO2015181179A1 - Method and apparatus for object tracking and segmentation via background tracking - Google Patents

Info

Publication number
WO2015181179A1
Authority
WO
WIPO (PCT)
Prior art keywords
tracking
background
background region
propagation
current frame
Prior art date
Application number
PCT/EP2015/061604
Other languages
French (fr)
Inventor
Tomas Enrique CRIVELLI
Patrick Perez
Juan Manuel PEREZ RUA
Original Assignee
Thomson Licensing
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Thomson Licensing
Priority to EP15724674.5A (patent EP3149707A1)
Priority to JP2016569791A (patent JP2017522647A)
Priority to US15/314,497 (patent US10249046B2)
Priority to CN201580028168.6A (patent CN106462975A)
Priority to KR1020167033141A (patent KR20170015299A)
Publication of WO2015181179A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/194: Segmentation; Edge detection involving foreground-background segmentation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/215: Motion-based segmentation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/20: Analysis of motion
    • G06T7/269: Analysis of motion using gradient-based methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/90: Determination of colour characteristics

Definitions

  • TECHNICAL FIELD This invention relates to tracking an object and segmenting it from the background.
  • Tracking of an object entails locating the object's position at successive instances with the object manually defined in the first frame or as the output of an object detector.
  • object tracking depends on extracting one or more characteristic features of the object (motion, color, shape, appearance) and using such characteristic feature(s) to estimate the position of the object in a next image frame, based on the object's position in the current image frame.
  • a number of techniques exist for object tracking, including optimal filtering, point-tracking, tracking-by-detection, optical-flow, and background subtraction, for example.
  • Proposals to refine object tracking have suggested gaining an advantage by modeling the foreground (the tracked object) and the background, and using this information to reject the estimated object positions most likely belonging to the background.
  • the basic approach to such modeling entails extracting a model of the background appearance using color distributions learned in the first frame, for example, and updating such distributions along the sequence of images.
  • Such modeling requires prior knowledge of the background and the object, in order to learn the appearance of the object correctly. For this reason, foreground/background segmentation has become a key component in recent top-performing tracking devices.
  • present-day models often do not sufficiently discriminate between the object and the background for rigorously tracking the object.
  • a method for tracking an object commences by first establishing the object (12) in a current frame. Thereafter, a background region (202) is established surrounding the object in the current frame. The location for the object (12) is then estimated in a next frame. Next, the propagation of the background region (202) is determined. Finally, the object is segmented from its background based on propagation of the background region, thereby allowing tracking of the object from frame to frame.
  • FIGURE 1 depicts a block schematic diagram of a system for tracking an object in accordance with the present principles
  • FIGURE 2 depicts a screen view of the object tracked by the system of FIG. 1;
  • FIGURE 3 depicts in flow chart form the steps of a method executed by the system of FIG. 1 for tracking the object shown in FIG. 2;
  • FIGURES 4A-4D depict successive image frames showing movement of an object from frame to frame.
  • FIGS. 5A-5D depict tracking windows corresponding to the images in FIGS. 4A-4D, respectively, showing the tracked image segmented from the image background.
  • FIGURE 1 depicts a block schematic diagram of a system 10 in accordance with the present principles for tracking an object 12, exemplified by an automobile, although the object could comprise virtually any article capable of undergoing imaging.
  • the system 10 of FIG. 1 includes a graphical processing unit 14, which can comprise a general or special purpose computer programmed with software for tracking an image using the object tracking and object flow algorithms described hereinafter.
  • software resides in a storage device 15, for example a hard disk drive or the like, which can also store data produced by the graphical processing unit 14 during operation.
  • the graphical processing unit 14 alternatively could comprise discrete circuitry capable of tracking an object by executing the object tracking and object flow algorithms described hereinafter.
  • the system 10 includes an image acquisition device 16 in the form of a television camera which supplies the graphical processing unit 14 with video signals representing the image of the object 12 captured by the television camera.
  • An operator interacts with the graphical processing unit 14 through a keyboard 18 and/or mouse 20.
  • the keyboard 18 and mouse 20 constitute examples of well-known operator data-entry devices, and the system 10 could make use of other such data-entry devices in place of, or in addition to, the keyboard and mouse.
  • the system 10 also typically includes a network interface unit 22, as is well known in the art, for connecting the graphical processing unit 14 to a network, such as a Local Area Network (LAN) or Wide Area Network, as exemplified by the Internet.
  • the system 10 could include one or more peripheral devices as well, such as a printer and/or a plotter.
  • improved tracking of an object from image frame to image frame occurs by tracking not only the object, but also a region (e.g., a group of pixels) of the background surrounding the object, which enables improved segmentation of the object from the background as well as a better estimation of the object location.
  • object tracking accomplished by present principles involves object flow, comprising the combination of object tracking and background tracking.
  • tracking an object involves estimating the position of the object in a next image frame, given an initial position of the object in a current image frame.
  • the optical flow between a pair of frames necessitates finding a displacement vector for each pixel of the first image.
  • the graphical processing unit 14 of the system 10 of FIG. 1 employs a superpixel matching technique hereinafter referred to as "superpixel flow."
  • the superpixel flow technique has as its objective finding the best match for every superpixel p in the first frame with a superpixel p' in the next frame, while maintaining a global flow-like behavior.
  • Such superpixelization should maintain a certain size homogeneity within a single frame.
  • Some existing superpixelization techniques can cope with this requirement.
  • the SLIC superpixel method, well known in the art, gives good results in terms of size homogeneity and compactness of the superpixelization.
  • the superpixel flow of the present principles is modelled with a pairwise Markov Random Field.
  • the matching is performed via maximum-a-posteriori (MAP) inference on the labeling l, which is equivalent to the minimization of an energy function of the form:
  • N_r defines a neighborhood of radius r of the superpixel p.
  • D and S in equation (1) represent the data term and spatial smoothness term, respectively.
  • the first term determines how accurate the labeling is in terms of consistency with the measured data (color, shape, etc.).
  • the data term corresponds to the pixel brightness conservation.
  • superpixels constitute a set of similar (e.g., homogeneous) pixels
  • an adequate appearance-based feature is a low-dimensional color histogram (with N bins).
  • the data term D, the Hellinger distance between the histograms, is given by:
  • the spatial term constitutes a penalty function for the spatial difference of the displacement vectors between neighboring superpixels, where a displacement vector has its origin in the centroid of the superpixel of the first frame and its end in the centroid of the superpixel of the second frame.
  • λ(p) = (1 + ρ(h(p), h(q)))^a.
  • the operator ρ constitutes the Hellinger distance as used in the data term (2).
  • the histogram distance is nonetheless computed between adjacent superpixels p and q, which belong to the first image.
  • the superpixel centroids are denoted q_c and p_c, and u* and v* are the horizontal and vertical changes between centroids.
  • This term has a smoothing effect in superpixels that belong to the same object. In practice, when two close superpixels are different, thus, probably belonging to different objects within the image, the term ⁇ allows them to have matches that do not hold the smoothness prior with the same strength.
  • Quadratic Pseudo-Boolean Optimization (QPBO) is used to minimize the proposed energy function, by merging a set of candidate matches for every superpixel in the first frame.
  • the candidate matches are generated by assuming a proximity prior. This means, every possible match should lie inside a search radius in the second frame.
  • the object flow commences by computing the motion field for an object of interest through an image sequence.
  • the most usual approach is to implement some of the available optical flow techniques through the complete sequence and perform the flow integration. However, doing so will result in high levels of motion drift and usually the motion of the interest object is affected by a global regularization. In some extreme cases, the interest object motion may be totally blurred and other techniques have to be incorporated.
  • the diversity of natural video sequences makes it difficult to choose one technique over another, even when specialized databases are at hand because currently no single method can achieve a strong performance in all of the available datasets. Most of these methods minimize the energy function with two terms. The data term is mostly shared between different approaches, but the prior or spatial term is different, and states under what conditions the optical flow smoothness should be maintained or not.
  • the graphical processing unit 14 of FIG. 1 can refine the optical flow computation by taking into account the segmentation mask within the tracked windows. The graphical processing unit 14 of FIG. 1 undertakes this refinement by considering the segmentation limits as reliable smoothness boundaries, assuming the motion is indeed smooth within the object region. This assumption remains valid for most scenes for a given object of interest. Naturally, the object flow should be more robust to rapid motions than the optical flow.
  • the full motion will split in two, the long-range motion, given by a tracker window, as described hereinafter, and the precision part, given by the targeted optical flow.
  • the Simple Flow technique used by the graphical processing unit 14 of FIG. 1 serves as the core base because of its scalability to higher resolutions and its specialization to the concept of object flow.
  • the graphical processing unit 14 of FIG. 1 can easily specify smoothness localization through computation masks. More specifically, the graphical processing unit 14 of FIG. 1 will derive the initial computation mask from the segmentation performed as a prior step. The graphical processing unit 14 of FIG. 1 filters the resulting flow only inside the mask limits to enhance precision and expedite the implementation.
  • the graphical processing unit 14 of FIG. 1 can precisely target regularity constraints by disconnecting foreground pixels from background pixels.
  • the graphical processing unit 14 can initially establish the position of the object 12 (e.g., the car) in FIG. 2 by circumscribing the object with a first bounding box 200, hereinafter referred to as a tracking window.
  • the graphical processing unit 14 can estimate the location of the object 12 in FIG. 2 in the next frame, thus giving a new position of the tracking window 200.
  • the graphical processing unit 14 of FIG. 1 circumscribes the tracking window 200 with a second bounding box 202, hereinafter referred to as the background region.
  • the background region 202 is larger than the tracking window 200 and defines a region (e.g., a group of pixels) that includes a portion of the image background surrounding the tracking window but not the tracking window itself.
  • a set of pixels lie inside the background region 202 and thus constitute part of the background.
  • these pixels fall into the tracking window 200, as indicated by displacement arrows 204, due to propagation of the background region 202.
  • the graphical processing unit 14 of FIG. 1 can determine which pixels fall into the tracking window in a subsequent frame.
  • the graphical processing unit 14 of FIG. 1 can safely assume that the pixels that previously resided in the background region 202 but now fall into the tracking window in a next frame really belong to the background (as they initially resided outside the tracking window). Thus, the graphical processing unit 14 will appropriately designate such pixels accordingly. Thus, new pixels entering the tracking window now become part of the background.
  • the graphical processing unit 14 of FIG. 1 will track the pixels previously deemed as part of the background along with the newly entering pixels. If a background pixel does not fall in the tracking window but stays within the surrounding background region, the graphical processing unit 14 of FIG. 1 keeps tracking that pixel until it enters the tracking window or escapes from the background region.
  • FIGURE 3 depicts in flow-chart form the steps of a process 300 executed by the graphical processing unit 14 of FIG. 1 to accomplish object tracking as discussed above.
  • the process 300 commences by first establishing the location of the object 12 of FIG. 1 in a current frame during step 300.
  • the graphical processing unit 14 of FIG. 1 establishes the location of the object 12 by initially circumscribing the object with the tracking window 200 as seen in FIG. 2.
  • the graphical processing unit 14 of FIG. 1 establishes a background region, as represented by the background region 202 in FIG. 2.
  • the graphical processing unit 14 of FIG. 1 estimates the location of the object in a next frame.
  • During step 306, the graphical processing unit 14 of FIG. 1 determines the propagation of the background region in the next frame. Based on the propagation of the background region during step 306, the graphical processing unit 14 of FIG. 1 can segment the object from the background during step 308, to facilitate tracking of the object.
  • FIGS. 4A-4D and 5A-5D depict, from right to left, a sequence of images of an object (e.g., an automobile) in motion.
  • FIGS. 5A-5D each depict a segmentation of the object from the background appearing in a corresponding one of the images depicted in FIGS. 4A-4D, respectively.
  • the tracking method of the present principles discussed above can include one or more of the following features. For example, tracking all the background pixels can occur using one of a variety of well-known dense point tracking methods. Further, the graphical processing unit 14 of FIG. 1 could track a sparse set of pixels by any one of a variety of sparse point tracking methods and then extrapolate the segmentation to the other points. As another modification to the tracking method discussed above, the graphical processing unit 14 of FIG. 1 could first apply superpixelization to the background and then track the superpixels, which is more efficient and robust and has the advantage that such superpixels are usually attached to the object borders.
  • the graphical processing unit 14 of FIG. 1 could use the segmentation obtained as described to better extract and initialize an appearance model of the foreground and background, and refine the segmentation via a standard color-based segmentation technique. Further, the graphical processing unit 14 of FIG. 1 could use the complement of the background mask, i.e., the foreground mask, to adjust the position of the target by, for example, center-of-mass computation or blob analysis. The graphical processing unit 14 of FIG. 1 could also compute the target location by taking the point most distant from the background or by using the distance transform to extract a more plausible position. Further, the graphical processing unit 14 of FIG. 1 can use the tracked background to reject the object location, thereby improving the output of standard tracker methods, notably the tracking-by-detection method.
  • the graphical processing unit 14 of FIG. 1 could combine several object tracking methods on near-to-the-target objects in the background in order to analyze the non-interest objects entering the tracking window, to better handle occlusions and rejection of target object locations.
  • the graphical processing unit 14 of FIG. 1 could implement tracking in a backward direction, which is equivalent to detecting regions of the background that leave the tracking window, to accomplish offline or delayed processing.
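The center-of-mass adjustment mentioned above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names `center_of_mass` and `recenter_window`, the `(x0, y0, x1, y1)` window convention (upper bounds exclusive), and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np

def center_of_mass(fg_mask):
    """Return the (x, y) mean position of True pixels, or None if the mask is empty."""
    ys, xs = np.nonzero(fg_mask)
    if xs.size == 0:
        return None
    return float(xs.mean()), float(ys.mean())

def recenter_window(window, fg_mask):
    """Shift a window (x0, y0, x1, y1), keeping its size, so that it is
    centered on the foreground center of mass (hypothetical helper)."""
    com = center_of_mass(fg_mask)
    if com is None:
        return window  # nothing to adjust to
    x0, y0, x1, y1 = window
    w, h = x1 - x0, y1 - y0
    cx, cy = com
    # Pixel i covers [i, i + 1), so the continuous center is the mean index + 0.5.
    nx0 = int(np.floor(cx + 0.5 - w / 2 + 0.5))
    ny0 = int(np.floor(cy + 0.5 - h / 2 + 0.5))
    return (nx0, ny0, nx0 + w, ny0 + h)
```

Here `fg_mask` stands for the complement of the tracked background mask inside the tracking window; a blob analysis or distance transform could replace the mean where the foreground is fragmented.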

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

A method for tracking an object commences by first establishing the object (12) in a current frame. Thereafter, a background region (202) is established encompassing the object in the current frame. The location for the object (12) is then estimated in a next frame. Next, the propagation of the background region (202) is determined. Finally, the object is segmented from its background based on propagation of the background region, thereby allowing tracking of the object from frame to frame.

Description

METHOD AND APPARATUS FOR OBJECT TRACKING AND SEGMENTATION
VIA BACKGROUND TRACKING
CROSS REFERENCES
This application claims priority to European Application Serial No. 14305799.0, filed on May 28, 2014, which is herein incorporated by reference in its entirety.
TECHNICAL FIELD
This invention relates to tracking an object and segmenting it from the background.
BACKGROUND ART
Tracking of an object entails locating the object's position at successive instances with the object manually defined in the first frame or as the output of an object detector. In general, object tracking depends on extracting one or more characteristic features of the object (motion, color, shape, appearance) and using such characteristic feature(s) to estimate the position of the object in a next image frame, based on the object's position in the current image frame. A number of techniques exist for object tracking, including optimal filtering, point-tracking, tracking-by-detection, optical-flow, and background subtraction, for example.
Proposals to refine object tracking have suggested gaining an advantage by modeling the foreground (the tracked object) and the background, and using this information to reject the estimated object positions most likely belonging to the background. The basic approach to such modeling entails extracting a model of the background appearance using color distributions learned in the first frame, for example, and updating such distributions along the sequence of images. However, such modeling requires prior knowledge of the background and the object, in order to learn the appearance of the object correctly. For this reason, foreground/background segmentation has become a key component in recent top-performing tracking devices. Moreover, even with correct initialization of the segmentation between the object and the background, present-day models often do not sufficiently discriminate between the object and the background for rigorously tracking the object. Finally, complete or partial occlusions of the object and changes in the appearance of the object resulting from rotation, illumination, shadows, and/or self-occlusions, for example, increase the difficulty of establishing a successful model adaptation strategy. Thus, a need exists for a technique for object tracking that overcomes the foregoing disadvantages of the prior art.
BRIEF SUMMARY
Briefly, in accordance with the present principles, a method for tracking an object commences by first establishing the object (12) in a current frame. Thereafter, a background region (202) is established surrounding the object in the current frame. The location for the object (12) is then estimated in a next frame. Next, the propagation of the background region (202) is determined. Finally, the object is segmented from its background based on propagation of the background region, thereby allowing tracking of the object from frame to frame.
BRIEF SUMMARY OF THE DRAWINGS
FIGURE 1 depicts a block schematic diagram of a system for tracking an object in accordance with the present principles;
FIGURE 2 depicts a screen view of the object tracked by the system of FIG. 1;
FIGURE 3 depicts in flow chart form the steps of a method executed by the system of FIG. 1 for tracking the object shown in FIG. 2;
FIGURES 4A-4D depict successive image frames showing movement of an object from frame to frame; and
FIGS. 5A-5D depict tracking windows corresponding to the images in FIGS. 4A-4D, respectively, showing the tracked image segmented from the image background.
DETAILED DESCRIPTION
FIGURE 1 depicts a block schematic diagram of a system 10 in accordance with the present principles for tracking an object 12, exemplified by an automobile, although the object could comprise virtually any article capable of undergoing imaging. The system 10 of FIG. 1 includes a graphical processing unit 14, which can comprise a general or special purpose computer programmed with software for tracking an image using the object tracking and object flow algorithms described hereinafter. In practice, such software resides in a storage device 15, for example a hard disk drive or the like, which can also store data produced by the graphical processing unit 14 during operation. Although described as a programmed computer, the graphical processing unit 14 alternatively could comprise discrete circuitry capable of tracking an object by executing the object tracking and object flow algorithms described hereinafter. To track the object 12, the system 10 includes an image acquisition device 16 in the form of a television camera which supplies the graphical processing unit 14 with video signals representing the image of the object 12 captured by the television camera.
An operator (not shown) interacts with the graphical processing unit 14 through a keyboard 18 and/or mouse 20. The keyboard 18 and mouse 20 constitute examples of well-known operator data-entry devices, and the system 10 could make use of other such data-entry devices in place of, or in addition to, the keyboard and mouse. A display device 20, typically a monitor of a type well known in the art, displays information generated by the graphical processing unit 14 intended for observation by the operator.
The system 10 also typically includes a network interface unit 22, as is well known in the art, for connecting the graphical processing unit 14 to a network, such as a Local Area Network (LAN) or Wide Area Network, as exemplified by the Internet. Although not shown, the system 10 could include one or more peripheral devices as well, such as a printer and/or a plotter.
In accordance with the present principles, improved tracking of an object from image frame to image frame occurs by tracking not only the object, but also a region (e.g., a group of pixels) of the background surrounding the object, which enables improved segmentation of the object from the background as well as a better estimation of the object location. Thus, the object tracking accomplished by present principles involves object flow, comprising the combination of object tracking and background tracking. As discussed, tracking an object involves estimating the position of the object in a next image frame, given an initial position of the object in a current image frame. On the other hand, the optical flow between a pair of frames necessitates finding a displacement vector for each pixel of the first image.
In order to segment the object from the image, the graphical processing unit 14 of the system 10 of FIG. 1 employs a superpixel matching technique hereinafter referred to as "superpixel flow." The superpixel flow technique has as its objective finding the best match for every superpixel p in the first frame with a superpixel p' in the next frame, while maintaining a global flow-like behavior. Thus, such superpixelization should maintain a certain size homogeneity within a single frame. Some existing superpixelization techniques can cope with this requirement. Preferably, the SLIC superpixel method, well known in the art, is used, as it gives good results in terms of size homogeneity and compactness of the superpixelization. Inspired by a large number of optical flow and stereo techniques, the superpixel flow of the present principles is modelled with a pairwise Markov Random Field. The matching is performed via maximum-a-posteriori (MAP) inference on the labeling l, which is equivalent to the minimization of an energy function of the form:
[Equation (1): rendered as an image in the original publication; not reproduced in this text]
with l representing the set of labels of the superpixels in I0 that match with those in the second frame. N_r defines a neighborhood of radius r of the superpixel p. The terms D and S in equation (1) represent the data term and spatial smoothness term, respectively. The first term determines how accurate the labeling is in terms of consistency with the measured data (color, shape, etc.). In the classical optical flow equivalent of equation (1), the data term corresponds to the pixel brightness conservation. However, since superpixels constitute a set of similar (e.g., homogeneous) pixels, an adequate appearance-based feature is a low-dimensional color histogram (with N bins). With regard to equation (1), the data term D, the Hellinger distance between the histograms, is given by:
[Equation (2): rendered as an image in the original publication; not reproduced in this text]
where h(p) and h(p') are the color histograms of the superpixel p and its corresponding superpixel p' in the second frame. Empirically, an RGB color histogram with N=3 bins per color proved satisfactory. Note that such a low-dimensional histogram gives a certain robustness against noise and slowly changing colors between frames. On the other hand, the spatial term constitutes a penalty function for the spatial difference of the displacement vectors between neighboring superpixels, where a displacement vector has its origin in the centroid of the superpixel of the first frame and its end in the centroid of the superpixel of the second frame:
[Equation (3): rendered as an image in the original publication; not reproduced in this text]
where λ(p) = (1 + ρ(h(p), h(q)))^a.
The operator ρ constitutes the Hellinger distance as used in the data term (2). The histogram distance is nonetheless computed between adjacent superpixels p and q, which belong to the first image. The superpixel centroids are denoted q_c and p_c, and u* and v* are the horizontal and vertical changes between centroids. This term has a smoothing effect on superpixels that belong to the same object. In practice, when two close superpixels are different, and thus probably belong to different objects within the image, the term λ allows them to have matches that do not hold the smoothness prior with the same strength. The graphical processing unit 14 of FIG. 1 makes use of Quadratic Pseudo-Boolean Optimization (QPBO) to minimize the proposed energy function, by merging a set of candidate matches for every superpixel in the first frame. The candidate matches are generated by assuming a proximity prior. This means that every possible match should lie inside a search radius in the second frame.
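As an illustration of the data term and the proximity prior described above, the following Python sketch builds a low-dimensional RGB histogram, compares histograms with the standard Hellinger distance, and restricts candidate matches to a search radius. Several points are assumptions rather than details from the text: the histogram is taken as a joint 3x3x3 histogram, the Hellinger distance uses the common form sqrt(1 - sum_i sqrt(a_i * b_i)) for normalized histograms (the normalization in equation (2) may differ), and the final selection is a greedy data-only choice that omits the smoothness term and the QPBO minimization. All function names are hypothetical.

```python
import numpy as np

def color_histogram(pixels, bins=3):
    """Joint RGB histogram with `bins` bins per channel, normalized to sum to 1.
    `pixels` is a (K, 3) array-like of 8-bit RGB values (assumed layout)."""
    pixels = np.asarray(pixels)
    idx = (pixels.astype(int) * bins) // 256          # per-channel bin in [0, bins)
    flat = (idx[:, 0] * bins + idx[:, 1]) * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / hist.sum()

def hellinger(h1, h2):
    """Standard Hellinger distance between two normalized histograms."""
    bc = np.sum(np.sqrt(h1 * h2))                     # Bhattacharyya coefficient
    return float(np.sqrt(max(0.0, 1.0 - bc)))

def candidates(centroid, centroids_next, radius):
    """Indices of next-frame superpixels whose centroid lies within `radius`."""
    dist = np.linalg.norm(centroids_next - centroid, axis=1)
    return np.nonzero(dist <= radius)[0]

def best_match(hist, centroid, hists_next, centroids_next, radius):
    """Greedy data-term-only match; a simplification of the MRF minimization."""
    cand = candidates(centroid, centroids_next, radius)
    if cand.size == 0:
        return None
    costs = [hellinger(hist, hists_next[j]) for j in cand]
    return int(cand[int(np.argmin(costs))])
```

In a fuller version, the candidate sets produced by `candidates` would feed a pairwise optimizer that also weighs the spatial term; the greedy choice here only shows how the data cost ranks candidates.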
The object flow commences by computing the motion field for an object of interest through an image sequence. The most usual approach is to apply one of the available optical flow techniques through the complete sequence and perform the flow integration. However, doing so will result in high levels of motion drift, and usually the motion of the object of interest is affected by a global regularization. In some extreme cases, the motion of the object of interest may be totally blurred and other techniques have to be incorporated. Moreover, the diversity of natural video sequences makes it difficult to choose one technique over another, even when specialized databases are at hand, because currently no single method can achieve a strong performance in all of the available datasets. Most of these methods minimize an energy function with two terms. The data term is mostly shared between different approaches, but the prior or spatial term is different, and states under what conditions the optical flow smoothness should be maintained or not. In a global approach, however, this is difficult to define. Most of these smoothness terms rely on appearance differences or gradients. Thus, some methods may be more reliable for some cases but weaker for others. This behavior may result because most of the techniques have no means to identify where exactly to apply this smoothness prior. In accordance with the present principles, the graphical processing unit 14 of FIG. 1 can refine the optical flow computation by taking into account the segmentation mask within the tracked windows. The graphical processing unit 14 of FIG. 1 undertakes this refinement by considering the segmentation limits as reliable smoothness boundaries, assuming the motion is indeed smooth within the object region. This assumption remains valid for most scenes for a given object of interest. Naturally, the object flow should be more robust to rapid motions than the optical flow.
Thus, the full motion splits in two: the long-range motion, given by a tracker window, as described hereinafter, and the precision part, given by the targeted optical flow. As a first approximation to the object flow, the Simple Flow technique used by the graphical processing unit 14 of FIG. 1 serves as the core base because it scales to higher resolutions and can be specialized to the concept of object flow. By using the Simple Flow pipeline, the graphical processing unit 14 of FIG. 1 can easily specify smoothness localization through computation masks. More specifically, the graphical processing unit 14 of FIG. 1 will derive the initial computation mask from the segmentation performed as a prior step. The graphical processing unit 14 of FIG. 1 filters the resulting flow only inside the mask limits to enhance precision and expedite the implementation. Using graph-based minimization approaches, the graphical processing unit 14 of FIG. 1 can precisely target regularity constraints by disconnecting foreground pixels from background pixels.
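The idea of filtering the flow only inside the mask limits can be sketched as follows. This is a minimal illustration that assumes a plain box average as the filter and nested Python lists as the data layout; it is not the Simple Flow filter itself, and the function name and parameters are hypothetical.

```python
def smooth_flow_in_mask(flow, mask, radius=1):
    """Average each masked flow vector over its masked neighbours only.

    flow: 2D list of (u, v) tuples; mask: 2D list of booleans with the same
    shape. Pixels outside the mask are returned unchanged and never
    contribute to the average of a pixel inside the mask, so the mask
    boundary acts as a smoothness boundary.
    """
    h, w = len(flow), len(flow[0])
    out = [row[:] for row in flow]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue  # leave flow outside the computation mask untouched
            su = sv = 0.0
            n = 0
            for yy in range(max(0, y - radius), min(h, y + radius + 1)):
                for xx in range(max(0, x - radius), min(w, x + radius + 1)):
                    if mask[yy][xx]:
                        su += flow[yy][xx][0]
                        sv += flow[yy][xx][1]
                        n += 1
            out[y][x] = (su / n, sv / n)  # n >= 1: the pixel itself is masked
    return out
```

Because neighbours outside the mask are skipped, a large background motion next to the object does not leak into the object's flow, which is the effect of disconnecting foreground pixels from background pixels.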
With the foregoing explanation, the method by which the graphical processing unit 14 of the system 10 of FIG. 1 tracks the object 12 (e.g., a car) can best be appreciated by reference to FIG. 2. The graphical processing unit 14 can initially establish the position of the object 12 (e.g., the car) in FIG. 2 by circumscribing the object with a first bounding box 200, hereinafter referred to as a tracking window. Using a conventional object-tracking algorithm, the graphical processing unit 14 can estimate the location of the object 12 in FIG. 2 in the next frame, thus giving a new position of the tracking window 200.
In accordance with the present principles, the graphical processing unit 14 of FIG. 1 circumscribes the tracking window 200 with a second bounding box 202, hereinafter referred to as the background region. The background region 202 is larger than the tracking window 200 and defines a background region (e.g., a group of pixels) that includes a portion of the image background surrounding the tracking window but not the tracking window itself. As seen in FIG. 2, a set of pixels lies inside the background region 202 and thus constitutes part of the background. In a next frame, these pixels fall into the tracking window 200, as indicated by displacement arrows 204, due to propagation of the background region 202. By estimating the motion of the background region (e.g., the above-described pixels surrounding the object), the graphical processing unit 14 of FIG. 1 can determine which pixels fall into the tracking window in a subsequent frame. The graphical processing unit 14 of FIG. 1 can safely assume that the pixels that previously resided in the background region 202 but fall into the tracking window in the next frame really belong to the background (as they initially resided outside the tracking window). The graphical processing unit 14 therefore designates such pixels as background, so that new pixels entering the tracking window become part of the background. In a next frame, given a new position of the object 12, the graphical processing unit 14 of FIG. 1 will track the pixels previously deemed part of the background along with the newly entering pixels. If a background pixel does not fall in the tracking window but stays within the surrounding background region, the graphical processing unit 14 of FIG. 1 continues to track that pixel until it enters the tracking window or escapes from the background region.
In this way, the background mask, originally limited to the background region 202, propagates with the tracking window 200, better delineating the segmentation, as observed in FIGS. 5A-5D and described hereinafter.
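The propagation described above can be sketched with sets of pixel coordinates. This is a minimal sketch under several assumptions: boxes are axis-aligned tuples (x0, y0, x1, y1) with exclusive upper bounds, per-pixel motion is supplied by a caller-provided function `motion` (standing in for whatever point tracker is used), and the ring around the new window is re-seeded with fresh background pixels at every frame. All names are hypothetical.

```python
def inside(box, p):
    """True if pixel p = (x, y) lies in box = (x0, y0, x1, y1), upper bounds exclusive."""
    x0, y0, x1, y1 = box
    return x0 <= p[0] < x1 and y0 <= p[1] < y1

def ring_pixels(window, region):
    """Pixels inside the background region but outside the tracking window."""
    x0, y0, x1, y1 = region
    return {(x, y) for y in range(y0, y1) for x in range(x0, x1)
            if not inside(window, (x, y))}

def propagate_background(tracked, motion, new_window, new_region):
    """Move tracked background pixels by their estimated displacement.

    Returns (still_tracked, background_in_window). Pixels that escape the
    new background region are dropped; pixels that land inside the new
    tracking window are labeled background and remain tracked.
    """
    moved = set()
    for (x, y) in tracked:
        dx, dy = motion((x, y))
        q = (x + dx, y + dy)
        if inside(new_region, q):
            moved.add(q)
    # Assumption of this sketch: re-seed the ring around the new window.
    moved |= ring_pixels(new_window, new_region)
    background_in_window = {q for q in moved if inside(new_window, q)}
    return moved, background_in_window
```

For instance, with a window (2, 2, 4, 4), a region (0, 0, 6, 6), and a background that shifts one pixel to the left per frame, the first call labels the right-hand column of the window as background and the second call labels the whole window, which corresponds to a window that no longer contains the object.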
FIGURE 3 depicts in flow-chart form the steps of a process 300 executed by the graphical processing unit 14 of FIG. 1 to accomplish object tracking as discussed above. The process 300 commences by first establishing the location of the object 12 of FIG. 1 in a current frame during step 300. As discussed previously, the graphical processing unit 14 of FIG. 1 establishes the location of the object 12 by initially circumscribing the object with the tracking window 200 as seen in FIG. 2. Next, during step 302, the graphical processing unit 14 of FIG. 1 establishes a background region, as represented by the background region 202 in FIG. 2. During step 304 of FIG. 3, the graphical processing unit 14 of FIG. 1 estimates the location of the object in a next frame. As discussed above, any of a variety of object tracking techniques, such as optimal filtering, point-tracking, tracking-by-detection, optical-flow, and background subtraction, for example, could accomplish this step. During step 306, the graphical processing unit 14 of FIG. 1 determines the propagation of the background region in the next frame. Based on the propagation of the background region during step 306, the graphical processing unit 14 of FIG. 1 can segment the object from the background during step 308, to facilitate tracking of the object.
To appreciate the image segmentation, refer to FIGS. 4A-4D and 5A-5D. FIGS. 4A-4D depict, from right to left, a sequence of images of an object (e.g., an automobile) in motion. FIGS. 5A-5D each depict a segmentation of the object from the background appearing in a corresponding one of the images depicted in FIGS. 4A-4D, respectively.
The tracking method of the present principles discussed above can include one or more of the following features. For example, tracking all the background pixels can occur using one of a variety of well-known dense point tracking methods. Further, the graphical processing unit 14 of FIG. 1 could track a sparse set of pixels by any one of a variety of sparse point tracking methods and then extrapolate the segmentation to the other points. As another modification to the tracking method discussed above, the graphical processing unit 14 of FIG. 1 could first apply superpixelation to the background and then track the superpixels, which is more efficient and robust and has the advantage that such superpixels are usually attached to the object borders.
Moreover, the graphical processing unit 14 of FIG. 1 could use the segmentation obtained as described to better extract and initialize an appearance model of the foreground and background, and refine the segmentation via a standard color-based segmentation technique. Further, the graphical processing unit 14 of FIG. 1 could use the complement of the background mask, i.e., the foreground mask, to adjust the position of the target by, for example, center-of-mass computation or blob analysis. The graphical processing unit 14 of FIG. 1 could also compute the target location by taking the point most distant from the background or by using the distance transform to extract a more plausible position. Further, the graphical processing unit 14 of FIG. 1 can use the tracked background to reject the object location, thereby improving the output of standard tracker methods, notably the tracking-by-detection method.
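The two position refinements mentioned above, center of mass of the foreground mask and the point most distant from the background, can be sketched as follows. The sketch assumes the masks are sets of pixel coordinates and uses a brute-force nearest-background search in place of an efficient distance transform; the function names are hypothetical.

```python
import math

def foreground_mask(window, background):
    """Complement of the background mask inside the tracking window.

    window = (x0, y0, x1, y1) with exclusive upper bounds;
    background is a set of (x, y) pixels labeled as background.
    """
    x0, y0, x1, y1 = window
    return {(x, y) for y in range(y0, y1) for x in range(x0, x1)
            if (x, y) not in background}

def center_of_mass(pixels):
    """Mean (x, y) of a set of pixels, or None if the set is empty."""
    n = len(pixels)
    if n == 0:
        return None
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)

def farthest_from_background(foreground, background):
    """Foreground pixel whose nearest background pixel is farthest away.

    Brute force, O(|foreground| * |background|); a real distance transform
    would be used for full-size images.
    """
    if not foreground or not background:
        return None

    def dist_to_background(p):
        return min(math.hypot(p[0] - b[0], p[1] - b[1]) for b in background)

    # Sorting makes the choice deterministic when several pixels tie.
    return max(sorted(foreground), key=dist_to_background)
```

The center of mass is cheap but can be pulled off the object by a non-convex mask, whereas the farthest-from-background point always lies on the foreground, which may be why the text calls it a more plausible position.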
The graphical processing unit 14 of FIG. 1 could combine several object tracking methods on near-to-the-target objects in the background in order to analyze the non-interest objects entering the tracking window, so as to better handle occlusions and reject target object locations. In addition, the graphical processing unit 14 of FIG. 1 could implement tracking in a backward direction, which is equivalent to detecting regions of the background that leave the tracking window, to accomplish offline or delayed processing.
The foregoing describes a technique for tracking an object with improved segmentation.

Claims

1. A method for tracking an object (12), comprising:
establishing the object (12) in a current frame;
establishing a background region (202) encompassing the object in the current frame;
estimating a location for the object (12) in a next frame;
determining propagation of the background region (202); and
segmenting the object from its background based on propagation of the background region.
2. The method according to claim 1 wherein the step of establishing the object in the current frame includes circumscribing the object with a first bounding box.
3. The method according to claim 2 wherein the step of establishing the background region includes the step of circumscribing the object with a second bounding box larger than the first bounding box to encompass the object and at least a portion of background surrounding the object.
4. The method according to claim 1 wherein the step of determining the propagation of the background region includes the step of detecting pixels that originally reside outside the object but inside the tracking window in the current frame but move inside a tracking window encompassing the object in the next frame.
5. The method according to claim 1 further including the steps of
applying superpixelation to the background to yield superpixels attached to borders of the object; and
tracking such superpixels.
6. The method according to claim 1 wherein the step of segmenting the image includes refining the segmentation via a standard color-based segmentation technique.
7. The method according to claim 1 wherein the step of determining propagation of the background region (202) includes tracking all background pixels using a dense point tracking method.
8. System (10) for tracking an object (12), comprising:
an image acquisition device (16) for capturing images of an object to undergo tracking; and
a processor (14) for processing the image captured by the image acquisition device, for (a) establishing the object (12) in a current frame; (b) establishing a background region (202) encompassing the object in the current frame; (c) estimating a location for the object (12) in a next frame; (d) determining propagation of the background region (202); and (e) segmenting the object from its background based on propagation of the background region.
9. The system according to claim 8 wherein the processor establishes the object in the current frame by circumscribing the object with a first bounding box.
10. The system according to claim 9 wherein the processor establishes the background region by circumscribing the object with a second bounding box larger than the first bounding box to encompass the object and at least a portion of background surrounding the object.
11. The system according to claim 8 wherein the processor determines the propagation of the background region by detecting pixels that originally reside outside the object but inside the tracking window in the current frame but move inside a tracking window encompassing the object in the next frame.
12. The system according to claim 8 wherein the processor applies superpixelation to the background to yield superpixels attached to borders of the object and thereafter tracks such superpixels.
13. The system according to claim 8 wherein the step of segmenting the image includes refining the segmentation via a standard color-based segmentation technique.
14. The system according to claim 8 wherein the processor determines propagation of the background region (202) by tracking all background pixels using a dense point tracking method.
PCT/EP2015/061604 2014-05-28 2015-05-26 Method and apparatus for object tracking and segmentation via background tracking WO2015181179A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP15724674.5A EP3149707A1 (en) 2014-05-28 2015-05-26 Method and apparatus for object tracking and segmentation via background tracking
JP2016569791A JP2017522647A (en) 2014-05-28 2015-05-26 Method and apparatus for object tracking and segmentation via background tracking
US15/314,497 US10249046B2 (en) 2014-05-28 2015-05-26 Method and apparatus for object tracking and segmentation via background tracking
CN201580028168.6A CN106462975A (en) 2014-05-28 2015-05-26 Method and apparatus for object tracking and segmentation via background tracking
KR1020167033141A KR20170015299A (en) 2014-05-28 2015-05-26 Method and apparatus for object tracking and segmentation via background tracking

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP14305799 2014-05-28
EP14305799.0 2014-05-28

Publications (1)

Publication Number Publication Date
WO2015181179A1 true WO2015181179A1 (en) 2015-12-03

Family

ID=50943255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2015/061604 WO2015181179A1 (en) 2014-05-28 2015-05-26 Method and apparatus for object tracking and segmentation via background tracking

Country Status (6)

Country Link
US (1) US10249046B2 (en)
EP (1) EP3149707A1 (en)
JP (1) JP2017522647A (en)
KR (1) KR20170015299A (en)
CN (1) CN106462975A (en)
WO (1) WO2015181179A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018032702A1 (en) * 2016-08-18 2018-02-22 广州视源电子科技股份有限公司 Image processing method and apparatus

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170330371A1 (en) * 2014-12-23 2017-11-16 Intel Corporation Facilitating culling of composite objects in graphics processing units when such objects produce no visible change in graphics images
US10896495B2 (en) * 2017-05-05 2021-01-19 Boe Technology Group Co., Ltd. Method for detecting and tracking target object, target object tracking apparatus, and computer-program product
CN108062761A (en) * 2017-12-25 2018-05-22 北京奇虎科技有限公司 Image partition method, device and computing device based on adaptive tracing frame
CN108010032A (en) * 2017-12-25 2018-05-08 北京奇虎科技有限公司 Video landscape processing method and processing device based on the segmentation of adaptive tracing frame
CN108171716B (en) * 2017-12-25 2021-11-26 北京奇虎科技有限公司 Video character decorating method and device based on self-adaptive tracking frame segmentation
CN108154119B (en) * 2017-12-25 2021-09-28 成都全景智能科技有限公司 Automatic driving processing method and device based on self-adaptive tracking frame segmentation
KR102582578B1 (en) 2018-04-20 2023-09-26 엘지전자 주식회사 Cooling system for a low temperature storage

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7408572B2 (en) * 2002-07-06 2008-08-05 Nova Research, Inc. Method and apparatus for an on-chip variable acuity imager array incorporating roll, pitch and yaw angle rates measurement
FR2880455A1 (en) * 2005-01-06 2006-07-07 Thomson Licensing Sa METHOD AND DEVICE FOR SEGMENTING AN IMAGE
EP2680226B1 (en) 2012-06-26 2016-01-20 Thomson Licensing Temporally consistent superpixels
CN102930539B (en) * 2012-10-25 2015-08-26 江苏物联网研究发展中心 Based on the method for tracking target of Dynamic Graph coupling
CN103164858B (en) * 2013-03-20 2015-09-09 浙江大学 Adhesion crowd based on super-pixel and graph model is split and tracking
CN103366382A (en) * 2013-07-04 2013-10-23 电子科技大学 Active contour tracing method based on superpixel
CN103413120B (en) * 2013-07-25 2016-07-20 华南农业大学 Tracking based on object globality and locality identification
WO2015013908A1 (en) * 2013-07-31 2015-02-05 Microsoft Corporation Geodesic saliency using background priors
US9947077B2 (en) * 2013-12-31 2018-04-17 Thomson Licensing Video object tracking in traffic monitoring

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
GALLEGO J., PARDAS M., HARO G.: "Enhanced foreground segmentation and tracking combining Bayesian background, shadow and foreground modeling", PATTERN RECOGNITION LETTERS, vol. 33, no. 12, 1 September 2012 (2012-09-01), pages 1558 - 1568, XP028503881, ISSN: 0167-8655, [retrieved on 20120523], DOI: 10.1016/J.PATREC.2012.05.004 *
NARAYANAN SUNDARAM ET AL: "Dense Point Trajectories by GPU-Accelerated Large Displacement Optical Flow", 5 September 2010, COMPUTER VISION - ECCV 2010, SPRINGER BERLIN HEIDELBERG, BERLIN, HEIDELBERG, PAGE(S) 438 - 451, ISBN: 978-3-642-15548-2, XP019150527 *
YANG FAN, LU HUCHUAN, ZANG MING-HSUAN: "Robust Superpixel Tracking", IEEE TRANSACTIONS ON IMAGE PROCESSING, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 23, no. 4, 1 April 2014 (2014-04-01), pages 1639 - 1651, XP011541796, ISSN: 1057-7149, [retrieved on 20140228], DOI: 10.1109/TIP.2014.2300823 *
ZHANG YAO,YHANG QUIHENG,XU ZHIYONG, XU JUNPING: "Dim point target detection against bright background", PROC. OF SPIE, vol. 7724, 5 May 2010 (2010-05-05), pages 1 - 9, XP040537091 *

Also Published As

Publication number Publication date
US10249046B2 (en) 2019-04-02
KR20170015299A (en) 2017-02-08
US20180247418A1 (en) 2018-08-30
JP2017522647A (en) 2017-08-10
EP3149707A1 (en) 2017-04-05
CN106462975A (en) 2017-02-22

Similar Documents

Publication Publication Date Title
US10249046B2 (en) Method and apparatus for object tracking and segmentation via background tracking
Zhou et al. Efficient road detection and tracking for unmanned aerial vehicle
CN110796010B (en) Video image stabilizing method combining optical flow method and Kalman filtering
CN104573614B (en) Apparatus and method for tracking human face
US9367897B1 (en) System for video super resolution using semantic components
WO2016034059A1 (en) Target object tracking method based on color-structure features
CN109086724B (en) Accelerated human face detection method and storage medium
CN108229475B (en) Vehicle tracking method, system, computer device and readable storage medium
WO2019071976A1 (en) Panoramic image saliency detection method based on regional growth and eye movement model
CN112184759A (en) Moving target detection and tracking method and system based on video
CN110647836B (en) Robust single-target tracking method based on deep learning
CN111340749B (en) Image quality detection method, device, equipment and storage medium
US20230334235A1 (en) Detecting occlusion of digital ink
CN109242959B (en) Three-dimensional scene reconstruction method and system
CN110738667A (en) RGB-D SLAM method and system based on dynamic scene
Zhang et al. Robust stereo matching with surface normal prediction
CN109658441B (en) Foreground detection method and device based on depth information
Caseiro et al. Foreground segmentation via background modeling on Riemannian manifolds
Liu et al. Automatic body segmentation with graph cut and self-adaptive initialization level set (SAILS)
CN110689553B (en) Automatic segmentation method of RGB-D image
JP2013080389A (en) Vanishing point estimation method, vanishing point estimation device, and computer program
Gallego et al. Joint multi-view foreground segmentation and 3d reconstruction with tolerance loop
Yu et al. Accurate motion detection in dynamic scenes based on ego-motion estimation and optical flow segmentation combined method
Huang et al. Tracking camouflaged objects with weighted region consolidation
Engels et al. Automatic occlusion removal from façades for 3D urban reconstruction

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15724674

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 20167033141

Country of ref document: KR

Kind code of ref document: A

Ref document number: 2016569791

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2015724674

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2015724674

Country of ref document: EP

Ref document number: 15314497

Country of ref document: US

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112016027739

Country of ref document: BR

ENP Entry into the national phase

Ref document number: 112016027739

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20161125