US20060182339A1 - Combining multiple cues in a visual object detection system - Google Patents

Combining multiple cues in a visual object detection system

Info

Publication number
US20060182339A1
Authority
US
United States
Prior art keywords
pixel
recited
cues
image
probabilities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/059,862
Inventor
Jonathan Connell
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US11/059,862
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION reassignment INTERNATIONAL BUSINESS MACHINES CORPORATION ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CONNELL, JONATHAN H.
Priority to CNB2006100059335A (CN100405828C)
Publication of US20060182339A1
Priority to US12/196,732 (US8600105B2)
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/143 Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/215 Motion-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451 Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Definitions

  • the present invention relates to video processing and more particularly to a system and method for detecting objects in video.
  • Background subtraction is a method for finding moving objects in a known background.
  • An incoming image is compared pixel-by-pixel with a stored reference image and a difference mask is computed.
  • Most such algorithms work on a very local basis. Neighboring pixels (in space and time) are considered only in a post-processing morphology stage. This can be a problem especially with shadow removal algorithms and may cause certain textured objects to be ignored. That is, it is possible for adjacent pixels to be corrected in opposite directions (e.g. one brightened, one dimmed) such that neither is perceived as different from the background pattern.
  • Another possible post-processing rule would be to search the interior of a region for texture and compare this to the texture found in the original background image. Once again, however, the criteria governing when to retain or dismiss a region are problematic.
  • Systems and methods for detecting visual objects by employing multiple cues include statistically combining information from multiple sources into a saliency map, wherein the information may include color, edge differences and/or motion in an image where an object is to be detected.
  • the statistical combination method makes use of pixel noise estimates to weight the contribution of pixel-by-pixel and local neighborhood cues.
  • the statistically combined information is thresholded to make decisions with respect to foreground/background pixels.
  • a system for detecting visual objects by employing multiple cues includes a video source, which provides images to be processed.
  • a probability determination module determines a probability for a plurality of cues based upon available information to determine if a pixel belongs to an object or a background, wherein the cues include a combination of pixel-by-pixel cues and local neighborhood cues.
  • a statistical combiner combines the probabilities from each of the plurality of cues into a saliency map such that statistically combined information is employed to make decisions with respect to foreground or background pixels.
  • Another system for detecting visual objects by employing multiple cues includes a video source, which provides images to be processed.
  • a probability determination module determines a probability for a plurality of cues based upon available information to determine if a pixel belongs to an object or a background.
  • a noise estimator estimates noise for each cue, wherein the noise estimate is employed in deriving probabilities for the cues.
  • a statistical combiner combines the probabilities from each of the plurality of cues into a saliency map such that statistically combined information is employed to make decisions with respect to foreground or background pixels.
  • FIG. 1 is a block/flow diagram showing an illustrative system/method for detecting objects in video based upon multiple cues
  • FIG. 2 is a block diagram showing an illustrative system for detecting objects in video based upon multiple cues.
  • Illustrative embodiments described herein statistically combine information from multiple sources (cues), e.g., differences, edges, and optionally motion, into a saliency map and then threshold this combined evidence to yield foreground/background pixel decisions.
  • differences may include changes in color, texture, edge energy or outlines, movement, etc. between frames.
  • because the system and methods may directly employ edge information, a change of texture contributes to a detection.
  • the computation and decision criterion are simple (addition and thresholding, respectively).
  • a suitable weighting can be achieved by measuring then propagating an estimate of pixel channel noise through the relevant cue detection computations to arrive at a corresponding estimate of noise for each type of cue.
  • This same methodology can be extended to cover the combination of additional pixel-by-pixel or local neighborhood cues.
  • the system and methods preferably estimate probabilities for each information source, and these probabilities are preferably based on overall noise estimates.
  • the global noise estimate(s) is one aspect that differentiates the present invention from other approaches.
  • the image in question may have more or less than 3 color channels (e.g., black and white security cameras, or military multi-spectral imagery).
  • a noise estimate can be generated for each available channel (and/or additional information source(s) or cue(s)) and propagated appropriately.
  • FIGS. may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements are implemented in software on one or more appropriately programmed general-purpose digital computers or equivalents having a processor and memory and input/output interfaces.
  • FIG. 1 a block/flow diagram shows an illustrative system/method for object detection in video.
  • the steps represented in the determination of probabilities may be taken in any order.
  • the probabilities calculated for edge differences, color and motion may be replaced or combined with other characteristics or information in an image.
  • Cues may include color/texture, edge differences, motion, etc. Other cues may also be employed in addition to or instead of these cues.
  • the system illustratively starts by estimating the noise energy in each of the red, green and blue color channels for an image as a whole as compared to a stored reference image in block 8 .
  • This correction is constrained to a small range of gains corresponding to the depth of shadows (or inter-reflection highlights) expected.
  • each such difference measure can be interpreted as a probability that the pixel belongs to the background (Bc(x,y)).
  • the overall probability (Bp(x, y)) that a pixel belongs to the background model can be modeled as the product of the channel probabilities in block 20 . This is easier to work with if the logarithmic probability (Lp(x, y)) is used instead.
  • Ep of a color difference may be defined as follows. Note that a difference of one standard deviation in any channel yields an Ep value of one.
  • For texture, a variety of edge energy measures are computed: e.g., a Sobel horizontal mask convolution, H, a Sobel vertical mask convolution, V, and a 3×3 (or other neighborhood size) center surround difference, C, in block 26.
  • the same measures may also be extracted at each pixel over a similarly normalized monochrome version of the background image (from block 8 ).
  • Differences in each of these texture “channels” are then noted in block 28.
  • Separating the two Sobel masks means changes in the orientation of a texture can be detected even if the magnitude of the texture remains comparable. Including the center surround operator more clearly marks the presence of features such as corners and speckles, which generate only weak Sobel responses.
  • the multiple edge measures are statistically combined, preferably using noise estimates similar to those described in block 16 .
  • the expected variation will be k*sqrt(3/4) assuming uncorrelated neighbors, whereas the variation observed for the corner detector C is dominated by the noise of the central pixel and is k*sqrt(9/8).
  • These values can be used as before to generate normalized probabilities, surprise estimates (block 22 ), and ultimately the contribution Ej of edge differences to the overall saliency.
  • a variation of one standard deviation in any of the texture measures yields an Ej value of one.
  • $$E_j = \frac{4}{3 k^2} \left[ V_i - V_s \right]^2 + \frac{4}{3 k^2} \left[ H_i - H_s \right]^2 + \frac{8}{9 k^2} \left[ C_i - C_s \right]^2$$
  • the same statistical noise-based weighting scheme can also be used to incorporate motion into the saliency image.
  • double differences are taken to confine the motion energy to the interior of an object (as opposed to including a trailing “hole” where it has just been).
  • the likelihood Bm(x, y) is assessed such that the observed temporal difference was generated by a pixel that really did belong to the background.
  • the saliency value is then thresholded at some particular level of “surprise” to determine if a pixel is part of the foreground (e.g., not plausibly part of the background).
  • This threshold may be user-defined or predetermined. This information may be used to make a plurality of different decisions regarding the status of pixels in an image.
  • System 100 may include an acquisition device 102 for recording and storing video (e.g., a memory or repository) or visual information (e.g., a camera for real-time images).
  • Device 102 may include a video recording device, such as a video camera, digital camera, telephone with a camera built in or any other device capable of receiving, recording and optionally storing image data. These images may be processed in real-time or stored and processed later.
  • a probability determination module 106 or a plurality thereof determines a probability for a plurality of cues based upon available information solely at a pixel location to determine if that pixel belongs to an object or a background.
  • the cues may include brightness, color, motion, or any other relevant characteristic, which can be used to provide a difference from which a determination can be made about a pixel's status as an object. This is a pixel-by-pixel approach probability determination.
  • a probability determination module 108 or a plurality thereof determines a probability for a plurality of cues based upon available information in the spatial (or temporal) neighborhood of a pixel (multi-pixel approach) to determine if that pixel belongs to an object or a background.
  • the cues may include texture, edge energy, optical flow, or any other relevant characteristic, which can be used to provide a difference from which a determination can be made about a pixel's status as an object. Probabilities are preferably based on noise estimates on each channel (from each cue or source).
  • Modules 106 and 108 are advantageously combined in system 100 to provide multiple cues to determine the pixel or group of pixels' status as foreground or background.
  • the pixel-by-pixel approach ( 106 ) and the local neighborhood or multi-pixel approach ( 108 ) may each employ a plurality of different characteristics (hence a plurality of modules 106 and/or 108 ), such as color, texture, motion, etc. or combinations thereof in both approach and characteristic.
  • the pixel-by-pixel approach considers characteristics of each pixel while the multi-pixel approach considers a neighborhood or group of pixels' characteristics.
  • a statistical combiner 110 combines the probabilities from each of the plurality of cues into a saliency map 112 such that statistically combined information is employed to make decisions with respect to foreground/background pixels.
  • a noise estimator module 104 provides a measure of pixel noise in one or several pixel channels to aid in determining the true probabilities in cue detection modules 106 and 108 , thereby adjusting the weighting of the cues input to module 110 .
  • Combiner 110 , cue modules 106 and 108 , and noise estimator 104 may be part of a same computer or system 120 , and may be implemented in software.
  • the cues may be related to color, edge differences and motion in an image where an object is to be detected.
  • the noise estimator 104 provides the estimates and functionality as described in FIG. 1 (see e.g., blocks 10 , and 16 ). The estimation of noise can be performed to provide the statistical probabilities for color, texture, motion or any other cue.
  • the system and methods can detect objects including objects with smooth intensity gradients that could be missed by the pixel difference or texture methods of the prior art.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

Systems and methods for detecting visual objects by employing multiple cues include statistically combining information from multiple sources into a saliency map, wherein the information may include color, texture and/or motion in an image where an object is to be detected or background determined. The statistically combined information is thresholded to make decisions with respect to foreground/background pixels.

Description

    BACKGROUND
  • 1. Technical Field
  • The present invention relates to video processing and more particularly to a system and method for detecting objects in video.
  • 2. Description of the Related Art
  • Background subtraction is a method for finding moving objects in a known background. An incoming image is compared pixel-by-pixel with a stored reference image and a difference mask is computed. Most such algorithms work on a very local basis. Neighboring pixels (in space and time) are considered only in a post-processing morphology stage. This can be a problem especially with shadow removal algorithms and may cause certain textured objects to be ignored. That is, it is possible for adjacent pixels to be corrected in opposite directions (e.g. one brightened, one dimmed) such that neither is perceived as different from the background pattern.
  • One solution to this dilemma is to not employ a pixel-based shadow removal algorithm, but instead detect all regions and use a post-processing method to determine if certain areas are shadows. This can be done by carefully looking for bounding edges around the object. Yet, this can be a time-consuming operation and the rules for when to break a shadow portion off a larger blob are difficult to formulate. Furthermore, in a highly textured environment (e.g. some outdoors scenes, or in a cluttered office) there is likely to be edge information near the boundary of a region no matter how the region was formed.
  • Another possible post-processing rule would be to search the interior of a region for texture and compare this to the texture found in the original background image. Once again, however, the criteria governing when to retain or dismiss a region are problematic.
  • SUMMARY
  • Systems and methods for detecting visual objects by employing multiple cues include statistically combining information from multiple sources into a saliency map, wherein the information may include color, edge differences and/or motion in an image where an object is to be detected. The statistical combination method makes use of pixel noise estimates to weight the contribution of pixel-by-pixel and local neighborhood cues. The statistically combined information is thresholded to make decisions with respect to foreground/background pixels.
  • A system for detecting visual objects by employing multiple cues includes a video source, which provides images to be processed. A probability determination module determines a probability for a plurality of cues based upon available information to determine if a pixel belongs to an object or a background, wherein the cues include a combination of pixel-by-pixel cues and local neighborhood cues. A statistical combiner combines the probabilities from each of the plurality of cues into a saliency map such that statistically combined information is employed to make decisions with respect to foreground or background pixels.
  • Another system for detecting visual objects by employing multiple cues includes a video source, which provides images to be processed. A probability determination module determines a probability for a plurality of cues based upon available information to determine if a pixel belongs to an object or a background. A noise estimator estimates noise for each cue, wherein the noise estimate is employed in deriving probabilities for the cues. A statistical combiner combines the probabilities from each of the plurality of cues into a saliency map such that statistically combined information is employed to make decisions with respect to foreground or background pixels.
  • These and other objects, features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
  • BRIEF DESCRIPTION OF DRAWINGS
  • The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:
  • FIG. 1 is a block/flow diagram showing an illustrative system/method for detecting objects in video based upon multiple cues; and
  • FIG. 2 is a block diagram showing an illustrative system for detecting objects in video based upon multiple cues.
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • Illustrative embodiments described herein statistically combine information from multiple sources (cues), e.g., differences, edges, and optionally motion, into a saliency map and then threshold this combined evidence to yield foreground/background pixel decisions. For example, differences may include changes in color, texture, edge energy or outlines, movement, etc. between frames. Because the system and methods may directly employ edge information, a change of texture contributes to a detection. Moreover, the computation and decision criterion are simple (addition and thresholding, respectively).
  • In a system using both texture differences computed over multi-pixel neighborhoods and more traditional single-pixel differences, a sound statistical basis may be found for combining the two types of differences. A suitable weighting can be achieved by measuring then propagating an estimate of pixel channel noise through the relevant cue detection computations to arrive at a corresponding estimate of noise for each type of cue.
  • This same methodology can be extended to cover the combination of additional pixel-by-pixel or local neighborhood cues. In particular, it is advantageous to merge motion into the saliency computation. This allows the system and methods to detect objects with smooth intensity gradients that might be missed by the pixel difference or texture methods alone.
  • The system and methods preferably estimate probabilities for each information source, and these probabilities are preferably based on overall noise estimates. The global noise estimate(s) is one aspect that differentiates the present invention from other approaches.
  • In general, the image in question may have more or fewer than three color channels (e.g., black and white security cameras, or military multi-spectral imagery). A noise estimate can be generated for each available channel (and/or additional information source(s) or cue(s)) and propagated appropriately.
  • It should be understood that the elements shown in the FIGS. may be implemented in various forms of hardware, software or combinations thereof. Preferably, these elements are implemented in software on one or more appropriately programmed general-purpose digital computers or equivalents having a processor and memory and input/output interfaces.
  • Referring now to the drawings in which like numerals represent the same or similar elements and initially to FIG. 1, a block/flow diagram shows an illustrative system/method for object detection in video. The steps represented in the determination of probabilities may be taken in any order. In addition, it is noted that the probabilities calculated for edge differences, color and motion may be replaced or combined with other characteristics or information in an image.
  • In block 6, an input image or images are taken and provided for processing. Cues may include color/texture, edge differences, motion, etc. Other cues may also be employed in addition to or instead of these cues.
  • In block 10, the system illustratively starts by estimating the noise energy in each of the red, green and blue color channels for an image as a whole as compared to a stored reference image in block 8.
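  • The text does not state how the per-channel noise energy is measured. As a rough sketch only, one plausible choice is a robust spread estimate of the input-minus-reference difference over the whole image. The function name `estimate_channel_noise` and the use of the median absolute deviation (MAD) are assumptions for illustration, not part of the disclosure.

```python
import numpy as np

def estimate_channel_noise(image, reference):
    """Return one noise standard deviation per color channel.

    Assumption (not from the source): noise is estimated robustly from the
    whole-image difference using the median absolute deviation (MAD), so
    that genuine foreground pixels act as outliers and are largely ignored.
    """
    diff = image.astype(np.float64) - reference.astype(np.float64)
    # Flatten the spatial axes so each column holds one channel's differences.
    flat = diff.reshape(-1, diff.shape[-1])
    med = np.median(flat, axis=0)
    mad = np.median(np.abs(flat - med), axis=0)
    # 1.4826 rescales the MAD to a standard deviation for Gaussian noise.
    return 1.4826 * mad

# Synthetic check: a noise-free reference plus known per-channel noise.
rng = np.random.default_rng(0)
reference = rng.uniform(0, 255, size=(64, 64, 3))
true_sigma = np.array([2.0, 4.0, 8.0])
image = reference + rng.normal(0.0, true_sigma, size=reference.shape)
noise = estimate_channel_noise(image, reference)
```

  • On this synthetic input the returned values should land close to the injected standard deviations; any other global estimator could be substituted without changing the later steps.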
  • In block 12, the system starts the color processing chain by making a best guess gain correction to each pixel in the input image (Ic(x,y) → I′c(x,y), where c is a color channel: r=red, g=green, b=blue) to correct for shadows. This correction is constrained to a small range of gains corresponding to the depth of shadows (or inter-reflection highlights) expected.
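  • The "best guess" gain is not specified further. The following sketch assumes a single least-squares gain per pixel that best maps the input color onto the background color, clipped to a narrow range. The function name `gain_correct` and the limits 0.7 and 1.3 are hypothetical values chosen for illustration.

```python
import numpy as np

def gain_correct(image, background, min_gain=0.7, max_gain=1.3):
    """Apply a clipped per-pixel gain so shadowed pixels match the background.

    Assumption (not from the source): the gain is the least-squares scalar g
    minimizing |g * I - S|^2 over the color channels of each pixel.
    """
    image = image.astype(np.float64)
    background = background.astype(np.float64)
    num = np.sum(image * background, axis=-1)
    den = np.sum(image * image, axis=-1)
    # Fall back to a gain of one where the input pixel is completely black.
    gain = np.divide(num, den, out=np.ones_like(num), where=den > 0)
    # Only gains plausible for shadows or highlights are allowed.
    gain = np.clip(gain, min_gain, max_gain)
    return image * gain[..., None], gain

background = np.array([[[100.0, 120.0, 140.0], [100.0, 120.0, 140.0]]])
image = np.array([[[80.0, 96.0, 112.0],    # background darkened by a shadow
                   [20.0, 20.0, 20.0]]])   # a genuinely dark object
corrected, gain = gain_correct(image, background)
```

  • In this example the shadowed pixel is restored to the background color (gain 1.25), while the dark object would need a gain of 6, is clipped to 1.3, and therefore still differs strongly from the background.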
  • In block 14, multiple (e.g., three) differences are formed relative to the stable background image (Sc(x,y) from block 8). These differences are then evaluated relative to the noise estimates for each channel (Nc) to determine how many standard deviations they are from the mean (assumed at zero) in block 16. In block 18, given a Normal or Gaussian distribution of noise differences and interpreting the noise estimates as the standard deviations of these distributions, each such difference measure can be interpreted as a probability that the pixel belongs to the background (Bc(x,y)).
    $$D_c(x, y) = I'_c(x, y) - S_c(x, y)$$
    $$B_c(x, y) = J \exp\left[ -\frac{D_c(x, y)^2}{2 N_c^2} \right]$$
  • where $J = 1/(2\pi)^{1/2}$
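  • The per-channel difference Dc and background probability Bc defined above can be sketched with NumPy as follows. The function name `background_probability` and the sample values are illustrative assumptions; the formula itself follows the text, including its constant J.

```python
import numpy as np

J = 1.0 / np.sqrt(2.0 * np.pi)

def background_probability(corrected, background, noise):
    """Return (Dc, Bc) for every pixel and channel.

    corrected, background: arrays of shape (H, W, C); noise: shape (C,).
    """
    d = corrected.astype(np.float64) - background.astype(np.float64)  # Dc
    b = J * np.exp(-d**2 / (2.0 * noise**2))                          # Bc
    return d, b

corrected = np.array([[[100.0, 120.0, 140.0], [104.0, 120.0, 140.0]]])
background = np.array([[[100.0, 120.0, 140.0], [100.0, 120.0, 140.0]]])
noise = np.array([2.0, 2.0, 2.0])
D, B = background_probability(corrected, background, noise)
```

  • The first pixel matches the background, so every channel takes the maximum value J. The red channel of the second pixel is two standard deviations away, giving J*exp(-2).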
  • If the pixel color channels are assumed independent, the overall probability (Bp(x, y)) that a pixel belongs to the background model can be modeled as the product of the channel probabilities in block 20. This is easier to work with if the logarithmic probability (Lp(x, y)) is used instead. This formulation is sometimes interpreted in an information theoretic context as the “surprise” (block 22) about the observed value.
    $$B_p(x, y) = B_r(x, y) \, B_g(x, y) \, B_b(x, y)$$
    $$L_p(x, y) = -\log B_p(x, y) = -\log B_r(x, y) - \log B_g(x, y) - \log B_b(x, y) = \left[ (D_r/N_r)^2 + (D_g/N_g)^2 + (D_b/N_b)^2 \right] / 2 - 3 \log J$$
    Here the red channel difference Dr(x, y) is abbreviated as Dr, and similarly for Dg and Db. The corresponding noise in the red channel Nr(x, y) is likewise abbreviated as Nr, as are the two other channel noise estimates Ng and Nb.
  • Noting that the saliency of a pixel based on color differences should be lowest when Dr=Dg=Db=0, the saliency contribution Ep of a color difference may be defined as follows. Note that a difference of one standard deviation in any channel yields an Ep value of one.
    $$E_p(x, y) = 2 L_p(x, y) + 6 \log J = \frac{D_r^2(x, y)}{N_r^2} + \frac{D_g^2(x, y)}{N_g^2} + \frac{D_b^2(x, y)}{N_b^2}$$
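  • A small numerical sketch can confirm that the offset-and-scaled surprise reduces to a sum of squared, noise-normalized differences. The function name `color_saliency` and the sample values are assumptions for illustration.

```python
import numpy as np

J = 1.0 / np.sqrt(2.0 * np.pi)

def color_saliency(corrected, background, noise):
    """Ep: sum over channels of (Dc / Nc)^2 at each pixel."""
    d = corrected.astype(np.float64) - background.astype(np.float64)
    return np.sum((d / noise) ** 2, axis=-1)

background = np.array([[[100.0, 120.0, 140.0], [100.0, 120.0, 140.0]]])
noise = np.array([2.0, 4.0, 8.0])
corrected = background + np.array([[[2.0, 0.0, 0.0],    # one sigma in red only
                                    [2.0, 4.0, 8.0]]])  # one sigma everywhere
Ep = color_saliency(corrected, background, noise)

# Cross-check against the logarithmic route: Ep = 2 * Lp + 6 * log J.
d = corrected - background
Lp = -np.sum(np.log(J * np.exp(-d**2 / (2.0 * noise**2))), axis=-1)
Ep_from_log = 2.0 * Lp + 6.0 * np.log(J)
```

  • The first pixel yields an Ep of one and the second an Ep of three, and both routes agree.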
  • The saliency due to color differences is next combined with saliencies due to texture and motion. To make these new contributions comparable to Ep, the differences are computed on a monochrome version (G) of the input image produced in block 24. This image is specially constructed so that one standard deviation difference in any of the color channels will yield the same change in the combined intensity. Here k is a simple scaling factor determining how many intensity levels in the output correspond to one standard deviation in a color channel.
    G(x, y)=k*(I′r(x,y)/Nr+I′g(x,y)/Ng+I′b(x,y)/Nb)
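  • The normalized monochrome image G can be sketched as below. The function name `normalized_monochrome` and the numeric values are illustrative assumptions; the point shown is that a change of one standard deviation in any single channel shifts G by exactly k.

```python
import numpy as np

def normalized_monochrome(corrected, noise, k=1.0):
    """G = k * sum over channels of I'c / Nc, for an (H, W, C) image."""
    return k * np.sum(corrected.astype(np.float64) / noise, axis=-1)

noise = np.array([2.0, 4.0, 8.0])
k = 10.0
base = np.array([[[100.0, 120.0, 140.0]]])
red_bumped = base + np.array([2.0, 0.0, 0.0])   # +1 sigma in red
blue_bumped = base + np.array([0.0, 0.0, 8.0])  # +1 sigma in blue

G_base = normalized_monochrome(base, noise, k)
G_red = normalized_monochrome(red_bumped, noise, k)
G_blue = normalized_monochrome(blue_bumped, noise, k)
```

  • Here G_base is 975, and both bumped pixels give 985, so noisy and quiet channels contribute on an equal statistical footing.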
  • For texture, a variety of edge energy measures are computed: e.g., a Sobel horizontal mask convolution, H, a Sobel vertical mask convolution, V, and a 3×3 (or other neighborhood size) center surround difference, C in block 26. The same measures may also be extracted at each pixel over a similarly normalized monochrome version of the background image (from block 8).
  • Differences in each of these texture “channels” are then noted in block 28. Separating the two Sobel masks means changes in the orientation of a texture can be detected even if the magnitude of the texture remains comparable. Including the center surround operator more clearly marks the presence of features such as corners and speckles, which generate only weak Sobel responses.
  • In block 30 the multiple edge measures are statistically combined, preferably using noise estimates similar to those described in block 16. For the Sobel mask differences H and V, the expected variation will be k*sqrt(3/4) assuming uncorrelated neighbors, whereas the variation observed for the corner detector C is dominated by the noise of the central pixel and is k*sqrt(9/8). These values can be used as before to generate normalized probabilities, surprise estimates (block 22), and ultimately the contribution Ej of edge differences to the overall saliency. As always, a variation of one standard deviation in any of the texture measures yields an Ej value of one.
    $$E_j = \frac{4}{3 k^2} \left[ V_i - V_s \right]^2 + \frac{4}{3 k^2} \left[ H_i - H_s \right]^2 + \frac{8}{9 k^2} \left[ C_i - C_s \right]^2$$
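  • The edge contribution can be sketched in plain NumPy. The mask normalizations are assumptions chosen so that the squared coefficients sum to the stated 3/4 and 9/8: Sobel masks divided by 4, and a center surround defined as the center pixel minus the mean of its 8 neighbors. The helper names `filter3x3` and `edge_saliency`, the edge-replicating border handling, and the assignment of "horizontal" and "vertical" to the two masks are likewise illustrative choices.

```python
import numpy as np

# Assumed normalizations: squared coefficients sum to 3/4, 3/4 and 9/8.
SOBEL_H = np.array([[-1.0, -2.0, -1.0], [0.0, 0.0, 0.0], [1.0, 2.0, 1.0]]) / 4.0
SOBEL_V = SOBEL_H.T
CENTER = -np.ones((3, 3)) / 8.0
CENTER[1, 1] = 1.0

def filter3x3(img, kernel):
    """Correlate a 2-D image with a 3x3 kernel, replicating border pixels."""
    h, w = img.shape
    padded = np.pad(img.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def edge_saliency(g_input, g_background, k=1.0):
    """Ej: squared texture differences, each divided by its noise variance."""
    e = np.zeros(g_input.shape)
    for kernel in (SOBEL_V, SOBEL_H, CENTER):
        variance = np.sum(kernel**2) * k**2  # propagated pixel noise
        diff = filter3x3(g_input, kernel) - filter3x3(g_background, kernel)
        e += diff**2 / variance
    return e

g_background = np.zeros((5, 5))
g_input = np.zeros((5, 5))
g_input[2, 2] = 3.0  # an isolated speckle
Ej = edge_saliency(g_input, g_background, k=1.0)
```

  • At the speckle itself both Sobel responses are zero, so only the center surround term fires, giving Ej = 3² / (9/8) = 8. This is the kind of feature the text says the center surround operator is included to catch.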
  • The same statistical noise-based weighting scheme can also be used to incorporate motion into the saliency image. In block 32, for motion, double differences are taken to confine the motion energy to the interior of an object (as opposed to including a trailing “hole” where it has just been). Let G0, G1, and G2 be the monochrome versions of three successive frames. The raw motion M1 at the middle frame is defined as:
    M(x, y)=M1=min(|G2−G1|,|G1−G0|)
  • Assuming the noise estimates remain the same across time, the k scaling factor for each monochrome frame is also the same. Thus, in block 34, the likelihood Bm(x, y) is assessed such that the observed temporal difference was generated by a pixel that really did belong to the background.
  • Once again, logarithms are used to convert this likelihood to a surprise value, then offset and scale the surprise value (block 22) to get a saliency contribution Em. Note that, once again, a motion difference of one standard deviation yields an Em value of one.
    $$B_m(x, y) = J \exp\left[ -\frac{M(x, y)^2}{2 k^2} \right]$$
    $$L_m(x, y) = -\log B_m(x, y) = (M/k)^2 / 2 - \log J$$
    $$E_m(x, y) = 2 L_m(x, y) + 2 \log J = (M/k)^2$$
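  • The double difference and the resulting motion saliency can be sketched on a one-row example. The function name `motion_saliency` and the frame values are illustrative assumptions.

```python
import numpy as np

def motion_saliency(g0, g1, g2, k=1.0):
    """Return (M, Em) for three successive monochrome frames."""
    # Taking the minimum keeps only pixels that changed on both sides
    # of the middle frame, which suppresses the trailing "hole".
    m = np.minimum(np.abs(g2 - g1), np.abs(g1 - g0))
    return m, (m / k) ** 2

# A bright one-pixel object moving one column to the right per frame.
g0 = np.array([[0.0, 9.0, 0.0, 0.0, 0.0]])
g1 = np.array([[0.0, 0.0, 9.0, 0.0, 0.0]])
g2 = np.array([[0.0, 0.0, 0.0, 9.0, 0.0]])
M, Em = motion_saliency(g0, g1, g2, k=3.0)
```

  • Each single difference lights up two columns (the old and new positions), but their minimum is nonzero only at column 2, where the object sits in the middle frame. With k = 3 the saliency there is (9/3)² = 9.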
  • In practice, it may be desirable to multiply the Ej and Em contributions by fractional fudge factors Fj and Fm (default value of one) relative to Ep in order to reduce the system sensitivity to various classes of phenomena in block 36. This yields altered contributions of Ej′=Fj*Ej and Em′=Fm*Em.
  • After this, in block 38, all three contributions are summed to generate an overall saliency E(x, y) at a pixel. Em, Ep, and Ej are added because summing these log-derived terms is equivalent to multiplying the probabilities that the observed pixel changes could occur (due to noise) despite the pixel truly being part of the background.
    E=Ep+Ej′+Em′
  • In block 40, the saliency value is then thresholded at some particular level of “surprise” to determine if a pixel is part of the foreground (e.g., not plausibly part of the background). This threshold may be user-defined or predetermined. This information may be used to make a plurality of different decisions regarding the status of pixels in an image.
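  • The weighting, summation and thresholding of blocks 36 to 40 can be sketched as follows. The function name `combine_and_threshold`, the sample contributions, and the threshold of 9.0 are hypothetical; the text leaves the threshold user-defined or predetermined.

```python
import numpy as np

def combine_and_threshold(ep, ej, em, fj=1.0, fm=1.0, threshold=9.0):
    """Return (E, foreground mask) from the three saliency contributions."""
    e = ep + fj * ej + fm * em  # E = Ep + Ej' + Em'
    return e, e > threshold

ep = np.array([0.5, 4.0, 12.0])
ej = np.array([0.2, 6.0, 1.0])
em = np.array([0.1, 2.0, 0.0])
E, foreground = combine_and_threshold(ep, ej, em, fj=0.5, fm=0.5)
```

  • With Fj = Fm = 0.5 the three pixels score 0.65, 8.0 and 12.5, so only the last exceeds the threshold and is marked as foreground.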
  • Referring to FIG. 2, a system 100 for detecting visual objects by employing multiple cues is illustratively shown in accordance with one embodiment. System 100 may include an acquisition device 102 for recording and storing video (e.g., a memory or repository) or visual information (e.g., a camera for real-time images). Device 102 may include a video recording device, such as a video camera, digital camera, telephone with a camera built in or any other device capable of receiving, recording and optionally storing image data. These images may be processed in real-time or stored and processed later.
  • A probability determination module 106 or a plurality thereof determines a probability for a plurality of cues based solely upon information available at a pixel location to determine if that pixel belongs to an object or a background. The cues may include brightness, color, motion, or any other relevant characteristic, which can be used to provide a difference from which a determination can be made about a pixel's status as an object. This is a pixel-by-pixel approach to probability determination.
  • In addition a probability determination module 108 or a plurality thereof determines a probability for a plurality of cues based upon available information in the spatial (or temporal) neighborhood of a pixel (multi-pixel approach) to determine if that pixel belongs to an object or a background. The cues may include texture, edge energy, optical flow, or any other relevant characteristic, which can be used to provide a difference from which a determination can be made about a pixel's status as an object. Probabilities are preferably based on noise estimates on each channel (from each cue or source).
  • Modules 106 and 108 are advantageously combined in system 100 to provide multiple cues to determine the pixel or group of pixels' status as foreground or background. The pixel-by-pixel approach (106) and the local neighborhood or multi-pixel approach (108) may each employ a plurality of different characteristics (hence a plurality of modules 106 and/or 108), such as color, texture, motion, etc. or combinations thereof in both approach and characteristic. The pixel-by-pixel approach considers characteristics of each pixel while the multi-pixel approach considers a neighborhood or group of pixels' characteristics.
  • A statistical combiner 110 combines the probabilities from each of the plurality of cues into a saliency map 112 such that statistically combined information is employed to make decisions with respect to foreground/background pixels. A noise estimator module 104 provides a measure of pixel noise in one or several pixel channels to aid in determining the true probabilities in cue detection modules 106 and 108, thereby adjusting the weighting of the cues input to module 110.
  • Combiner 110, cue modules 106 and 108, and noise estimator 104 may be part of the same computer or system 120, and may be implemented in software. The cues may be related to color, edge differences and motion in an image where an object is to be detected. The noise estimator 104 provides the estimates and functionality as described in FIG. 1 (see, e.g., blocks 10 and 16). The estimation of noise can be performed to provide the statistical probabilities for color, texture, motion or any other cue.
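One possible software arrangement of these modules is sketched below in Python with NumPy. It is a hypothetical illustration, not the embodiment's code. All function names are invented. The noise estimator is a stand-in based on the median absolute deviation rather than the estimator of FIG. 1, the per-pixel cue uses a plain brightness difference, and the neighborhood cue uses a gradient-magnitude difference in place of the edge measures described elsewhere. The synthetic images and the threshold of 25 are likewise assumptions.

```python
import numpy as np

def estimate_noise(diff):
    """Stand-in for noise estimator 104 (assumption): robust sigma via MAD."""
    med = np.median(diff)
    mad = np.median(np.abs(diff - med))
    return max(1.4826 * mad, 1e-6)

def pixel_cue(frame, background):
    """Per-pixel cue (cf. module 106): squared difference in noise units."""
    diff = frame - background
    return (diff / estimate_noise(diff)) ** 2

def neighborhood_cue(frame, background):
    """Neighborhood cue (cf. module 108): squared change in local gradient
    magnitude, again expressed in units of its own estimated noise."""
    def grad_mag(img):
        gy, gx = np.gradient(img)
        return np.hypot(gx, gy)
    diff = grad_mag(frame) - grad_mag(background)
    return (diff / estimate_noise(diff)) ** 2

def saliency_map(frame, background, cues, weights=None):
    """Cf. combiner 110: weighted sum of the per-cue surprise maps."""
    weights = weights or [1.0] * len(cues)
    return sum(w * cue(frame, background) for w, cue in zip(weights, cues))

# Synthetic test scene: noisy background plus a bright square "object".
rng = np.random.default_rng(0)
background = rng.normal(100.0, 2.0, size=(32, 32))
frame = background + rng.normal(0.0, 2.0, size=(32, 32))
frame[10:20, 10:20] += 60.0

E = saliency_map(frame, background, [pixel_cue, neighborhood_cue])
mask = E > 25.0  # hypothetical threshold
```

Each cue normalizes by its own noise estimate before the combiner sums the maps, so a noisier channel contributes less surprise for the same raw difference. That is the sense in which the noise estimates adjust the weighting of the cues.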
  • Statistically combining information from multiple sources such as differences, edges, and optionally motion, into a saliency map and then thresholding this combined evidence yields significant improvements in foreground/background pixel decisions. Because the system and methods employ edge information directly, a change of texture helps to contribute toward detection of the differences, edges, etc. Moreover, the computation and decision criterion require only simple operations, such as addition and thresholding.
  • By also merging motion into the saliency calculation, the system and methods can detect objects including objects with smooth intensity gradients that could be missed by the pixel difference or texture methods of the prior art.
  • Having described preferred embodiments of systems and methods for combining multiple cues in a visual object detection system (which are intended to be illustrative and not limiting), it is noted that modifications and variations can be made by persons skilled in the art in light of the above teachings. It is therefore to be understood that changes may be made in the particular embodiments disclosed which are within the scope and spirit of the invention as outlined by the appended claims. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.

Claims (32)

1. A method for detecting visual objects by employing multiple cues, comprising the steps of:
statistically combining pixel information into a saliency map by weighting multiple cues of a pixel's status using noise estimates as a basis for determining a statistical probability that a pixel is a foreground pixel or background pixel; and
thresholding the statistically combined pixel information to make decisions with respect to the foreground/background pixels.
2. The method as recited in claim 1, wherein the statistically combining includes determining statistical probabilities based on color to determine if a pixel belongs to a background image.
3. The method as recited in claim 2, wherein the determining statistical probabilities includes estimating noise energy in each of red, green and blue color channels for an image as a whole.
4. The method as recited in claim 3, further comprising estimating a gain correction for each pixel in an input image to correct for shadows.
5. The method as recited in claim 4, further comprising forming multiple differences relative to a stable background image to evaluate, relative to the noise energy, estimates for each channel to determine a standard deviation from a mean value.
6. The method as recited in claim 1, wherein the statistically combining includes determining probabilities based on motion to determine if a pixel belongs to a background image.
7. The method as recited in claim 6, further comprising confining motion energy to an interior of an object by taking differences between image frames in a monochrome version of the image.
8. The method as recited in claim 1, wherein the statistically combining includes determining probabilities based on texture to determine if a pixel belongs to a background image.
9. The method as recited in claim 8, further comprising computing multiple difference measures for edges by using a normalized monochrome image.
10. The method as recited in claim 9, wherein the difference measures include at least one of a Sobel horizontal mask convolution, H, a Sobel vertical mask convolution, V, and a center surround difference for neighboring pixels.
11. The method as recited in claim 1, wherein the statistically combining includes combining probabilities for color differences, texture differences and motion for each pixel in an image, and based upon the combined probability, determining if the pixel is background.
12. The method as recited in claim 1, wherein the statistically combining information includes the adjusting the statistical probabilities to permit combining the probabilities.
13. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for detecting visual objects by employing multiple cues, as recited in claim 1.
14. A method for detecting visual objects by employing multiple cues, comprising the steps of:
for each pixel, determining a probability for each of a plurality of information sources for making a determination as to a status of the pixel using noise estimates as a basis for determining the probability that a pixel is a foreground pixel or background pixel;
statistically combining the probabilities from all of the information sources to form a saliency map to determine whether a pixel belongs to a background image or an object; and
thresholding the statistically combined information to make decisions with respect to foreground/background pixels.
15. The method as recited in claim 14, wherein an information source includes pixel color and the determining probability is based on color to determine if a pixel belongs to a background image.
16. The method as recited in claim 15, wherein the determining probability includes estimating and accounting for noise energy in each of the red, green and blue color channels for an image as a whole.
17. The method as recited in claim 16, further comprising estimating a gain correction for each pixel in an input image to correct for shadows.
18. The method as recited in claim 17, further comprising forming multiple differences relative to a stable background image to evaluate, relative to noise energy, estimates for each channel.
19. The method as recited in claim 14, wherein the determining probability is based on motion to determine if a pixel belongs to a background image.
20. The method as recited in claim 19, further comprising the step of confining motion energy to an interior of an object by taking differences between image frames in a monochrome version of the image.
21. The method as recited in claim 14, wherein the determining a probability is based on texture to determine if a pixel belongs to a background image.
22. The method as recited in claim 21, further comprising computing multiple difference measures for edges by using a normalized monochrome image.
23. The method as recited in claim 22, wherein the difference measures include at least one of a Sobel horizontal mask convolution, H, a Sobel vertical mask convolution, V, and a center surround difference for neighboring pixels.
24. The method as recited in claim 14, wherein the statistically combining includes combining probabilities for color differences, texture differences and motion for each pixel in an image, and based upon the combined probability determining if the pixel is background.
25. The method as recited in claim 14, wherein the statistically combining includes the step of adjusting the probabilities to permit combining the probabilities.
26. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for detecting visual objects by employing multiple cues, as recited in claim 14.
27. A system for detecting visual objects by employing multiple cues, comprising:
a video source which provides images to be processed;
a probability determination module which determines a probability for a plurality of cues based upon available information to determine if a pixel belongs to an object or a background, wherein the cues include a combination of pixel-by-pixel cues and local neighborhood cues; and
a statistical combiner which combines the probabilities from each of the plurality of cues into a saliency map such that statistically combined information is employed to make decisions with respect to foreground or background pixels.
28. The system as recited in claim 27 wherein the cues are related to at least one of color, texture and motion in an image where an object is to be detected.
29. The system as recited in claim 27 wherein the probability determination module and the statistical combiner are included in a computer system.
30. The system as recited in claim 27 further comprising a noise estimator, which estimates noise, which is employed in deriving probabilities for the cues.
31. A system for detecting visual objects by employing multiple cues, comprising:
a video source which provides images to be processed;
a probability determination module which determines a probability for a plurality of cues based upon available information to determine if a pixel belongs to an object or a background;
a noise estimator, which estimates noise for each cue, wherein the noise estimate is employed in deriving probabilities for the cues; and
a statistical combiner which combines the probabilities from each of the plurality of cues into a saliency map such that statistically combined information is employed to make decisions with respect to foreground or background pixels.
32. The system as recited in claim 31, wherein the cues include a combination of pixel-by-pixel cues and local neighborhood cues.
US11/059,862 2005-02-17 2005-02-17 Combining multiple cues in a visual object detection system Abandoned US20060182339A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US11/059,862 US20060182339A1 (en) 2005-02-17 2005-02-17 Combining multiple cues in a visual object detection system
CNB2006100059335A CN100405828C (en) 2005-02-17 2006-01-19 Method and system for visual object detection
US12/196,732 US8600105B2 (en) 2005-02-17 2008-08-22 Combining multiple cues in a visual object detection system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US11/059,862 US20060182339A1 (en) 2005-02-17 2005-02-17 Combining multiple cues in a visual object detection system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US12/196,732 Continuation US8600105B2 (en) 2005-02-17 2008-08-22 Combining multiple cues in a visual object detection system

Publications (1)

Publication Number Publication Date
US20060182339A1 true US20060182339A1 (en) 2006-08-17

Family

ID=36815674

Family Applications (2)

Application Number Title Priority Date Filing Date
US11/059,862 Abandoned US20060182339A1 (en) 2005-02-17 2005-02-17 Combining multiple cues in a visual object detection system
US12/196,732 Active 2028-08-03 US8600105B2 (en) 2005-02-17 2008-08-22 Combining multiple cues in a visual object detection system

Country Status (2)

Country Link
US (2) US20060182339A1 (en)
CN (1) CN100405828C (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101324927B (en) * 2008-07-18 2011-06-29 北京中星微电子有限公司 Method and apparatus for detecting shadows
US9202137B2 (en) * 2008-11-13 2015-12-01 Google Inc. Foreground object detection from multiple images
US8175376B2 (en) * 2009-03-09 2012-05-08 Xerox Corporation Framework for image thumbnailing based on visual similarity
US8271871B2 (en) 2009-04-30 2012-09-18 Xerox Corporation Automated method for alignment of document objects
US8867820B2 (en) * 2009-10-07 2014-10-21 Microsoft Corporation Systems and methods for removing a background of an image
CN101847264B (en) * 2010-05-28 2012-07-25 北京大学 Image interested object automatic retrieving method and system based on complementary significant degree image
CN102271262B (en) * 2010-06-04 2015-05-13 三星电子株式会社 Multithread-based video processing method for 3D (Three-Dimensional) display
CN102222231B (en) * 2011-05-26 2015-04-08 厦门大学 Visual attention information computing device based on guidance of dorsal pathway and processing method thereof
US8873852B2 (en) * 2011-09-29 2014-10-28 Mediatek Singapore Pte. Ltd Method and apparatus for foreground object detection
EP2624173B1 (en) 2012-02-03 2016-12-14 Vestel Elektronik Sanayi ve Ticaret A.S. Permeability based saliency map extraction method
US9025880B2 (en) * 2012-08-29 2015-05-05 Disney Enterprises, Inc. Visual saliency estimation for images and video
US9165190B2 (en) * 2012-09-12 2015-10-20 Avigilon Fortress Corporation 3D human pose and shape modeling
WO2015028842A1 (en) * 2013-08-28 2015-03-05 Aselsan Elektronik Sanayi Ve Ticaret Anonim Sirketi A semi automatic target initialization method based on visual saliency
JP6330385B2 (en) * 2014-03-13 2018-05-30 オムロン株式会社 Image processing apparatus, image processing method, and program
CN104966286B (en) * 2015-06-04 2018-01-09 电子科技大学 A kind of 3D saliencies detection method
KR102707377B1 (en) * 2023-12-07 2024-09-19 주식회사 피씨티 Method for Managing a Vehicle Type Recognition Parking Lot Using an Edge Algorithm

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6332038B1 (en) * 1998-04-13 2001-12-18 Sharp Kabushiki Kaisha Image processing device
US20020090133A1 (en) * 2000-11-13 2002-07-11 Kim Sang-Kyun Method and apparatus for measuring color-texture distance, and method and apparatus for sectioning image into plurality of regions using measured color-texture distance
US20020164074A1 (en) * 1996-11-20 2002-11-07 Masakazu Matsugu Method of extracting image from input image using reference image
US20030099397A1 (en) * 1996-07-05 2003-05-29 Masakazu Matsugu Image extraction apparatus and method
US20030194131A1 (en) * 2002-04-11 2003-10-16 Bin Zhao Object extraction
US20040017930A1 (en) * 2002-07-19 2004-01-29 Samsung Electronics Co., Ltd. System and method for detecting and tracking a plurality of faces in real time by integrating visual ques
US20050078747A1 (en) * 2003-10-14 2005-04-14 Honeywell International Inc. Multi-stage moving object segmentation
US20050286764A1 (en) * 2002-10-17 2005-12-29 Anurag Mittal Method for scene modeling and change detection

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6470094B1 (en) * 2000-03-14 2002-10-22 Intel Corporation Generalized text localization in images
US20030123703A1 (en) * 2001-06-29 2003-07-03 Honeywell International Inc. Method for monitoring a moving object and system regarding same
US20030156759A1 (en) * 2002-02-19 2003-08-21 Koninklijke Philips Electronics N.V. Background-foreground segmentation using probability models that can provide pixel dependency and incremental training
US20040241670A1 (en) * 2003-06-02 2004-12-02 Srinka Ghosh Method and system for partitioning pixels in a scanned image of a microarray into a set of feature pixels and a set of background pixels

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8467570B2 (en) * 2006-06-14 2013-06-18 Honeywell International Inc. Tracking system with fused motion and object detection
US20080002856A1 (en) * 2006-06-14 2008-01-03 Honeywell International Inc. Tracking system with fused motion and object detection
US20080025639A1 (en) * 2006-07-31 2008-01-31 Simon Widdowson Image dominant line determination and use
US7751627B2 (en) * 2006-07-31 2010-07-06 Hewlett-Packard Development Company, L.P. Image dominant line determination and use
WO2008043204A1 (en) * 2006-10-10 2008-04-17 Thomson Licensing Device and method for generating a saliency map of a picture
US20080101761A1 (en) * 2006-10-31 2008-05-01 Simon Widdowson Weighted occlusion costing
US7940985B2 (en) 2007-06-06 2011-05-10 Microsoft Corporation Salient object detection
US20080304740A1 (en) * 2007-06-06 2008-12-11 Microsoft Corporation Salient Object Detection
US20090324088A1 (en) * 2008-06-30 2009-12-31 Christel Chamaret Method for detecting layout areas in a video image and method for generating an image of reduced size using the detection method
US8374436B2 (en) * 2008-06-30 2013-02-12 Thomson Licensing Method for detecting layout areas in a video image and method for generating an image of reduced size using the detection method
US20110044492A1 (en) * 2009-08-18 2011-02-24 Wesley Kenneth Cobb Adaptive voting experts for incremental segmentation of sequences with prediction in a video surveillance system
WO2011022277A3 (en) * 2009-08-18 2011-07-14 Behavioral Recognition Systems, Inc. Inter-trajectory anomaly detection using adaptive voting experts in a video surveillance system
US8295591B2 (en) 2009-08-18 2012-10-23 Behavioral Recognition Systems, Inc. Adaptive voting experts for incremental segmentation of sequences with prediction in a video surveillance system
US8340352B2 (en) 2009-08-18 2012-12-25 Behavioral Recognition Systems, Inc. Inter-trajectory anomaly detection using adaptive voting experts in a video surveillance system
US20110044499A1 (en) * 2009-08-18 2011-02-24 Wesley Kenneth Cobb Inter-trajectory anomaly detection using adaptive voting experts in a video surveillance system
WO2011022277A2 (en) * 2009-08-18 2011-02-24 Behavioral Recognition Systems, Inc. Inter-trajectory anomaly detection using adaptive voting experts in a video surveillance system
US20110052000A1 (en) * 2009-08-31 2011-03-03 Wesley Kenneth Cobb Detecting anomalous trajectories in a video surveillance system
WO2011025662A3 (en) * 2009-08-31 2011-07-14 Behavioral Recognition Systems, Inc. Detecting anomalous trajectories in a video surveillance system
US8285060B2 (en) 2009-08-31 2012-10-09 Behavioral Recognition Systems, Inc. Detecting anomalous trajectories in a video surveillance system
US9483702B2 (en) * 2011-12-14 2016-11-01 Canon Kabushiki Kaisha Method, apparatus and system for determining a saliency map for an input image
US20130156320A1 (en) * 2011-12-14 2013-06-20 Canon Kabushiki Kaisha Method, apparatus and system for determining a saliency map for an input image
CN103400382A (en) * 2013-07-24 2013-11-20 佳都新太科技股份有限公司 Abnormal panel detection algorithm based on ATM (Automatic Teller Machine) scene
US20150131858A1 (en) * 2013-11-13 2015-05-14 Fujitsu Limited Tracking device and tracking method
US9734395B2 (en) * 2013-11-13 2017-08-15 Fujitsu Limited Tracking device and tracking method
US9922266B2 (en) 2014-04-29 2018-03-20 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US9195903B2 (en) 2014-04-29 2015-11-24 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US9355331B2 (en) 2014-04-29 2016-05-31 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US11227180B2 (en) 2014-04-29 2022-01-18 International Business Machines Corporation Extracting motion saliency features from video using a neurosynaptic system
US10528843B2 (en) 2014-04-29 2020-01-07 International Business Machines Corporation Extracting motion saliency features from video using a neurosynaptic system
US10140551B2 (en) 2014-05-29 2018-11-27 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10043110B2 (en) 2014-05-29 2018-08-07 International Business Machines Corporation Scene understanding using a neurosynaptic system
US9536179B2 (en) 2014-05-29 2017-01-03 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10558892B2 (en) 2014-05-29 2020-02-11 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10846567B2 (en) 2014-05-29 2020-11-24 International Business Machines Corporation Scene understanding using a neurosynaptic system
US9373058B2 (en) 2014-05-29 2016-06-21 International Business Machines Corporation Scene understanding using a neurosynaptic system
US10115054B2 (en) 2014-07-02 2018-10-30 International Business Machines Corporation Classifying features using a neurosynaptic system
US9798972B2 (en) 2014-07-02 2017-10-24 International Business Machines Corporation Feature extraction using a neurosynaptic system for object classification
US11138495B2 (en) 2014-07-02 2021-10-05 International Business Machines Corporation Classifying features using a neurosynaptic system
US10528807B2 (en) * 2018-05-01 2020-01-07 Scribe Fusion, LLC System and method for processing and identifying content in form documents

Also Published As

Publication number Publication date
US8600105B2 (en) 2013-12-03
CN100405828C (en) 2008-07-23
CN1822646A (en) 2006-08-23
US20080304742A1 (en) 2008-12-11

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CONNELL, JONATHAN H.;REEL/FRAME:015844/0848

Effective date: 20050203

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION