WO2007004864A1 - Method and apparatus for visual object recognition - Google Patents
Method and apparatus for visual object recognition
- Publication number
- WO2007004864A1 (PCT/NL2005/000485)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- color
- density profile
- image
- invariant
- predefined
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/758—Involving statistics of pixels or of feature values, e.g. histogram matching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Image Analysis (AREA)
Abstract
Method of characterizing an object comprising: defining one or more image areas of the object; analyzing color and/or intensity transitions within the image area of a predefined color basis; creating a density profile of said transitions in said image area; and fitting said density profile to a predefined parametrization function. According to the method, said density profile is characteristic of an object and can be used for object recognition purposes.
Description
Title: Method and apparatus for visual object recognition
The invention relates to object recognition by visual inspection. In particular, the invention relates to a method of inspecting an object and associating the object with a predetermined characterization or category of objects. Object appearance is highly influenced by the imaging circumstances under which the object is viewed: illumination color, shading effects, and cast shadows all affect the appearance of the object. Local features have also received much attention in the field of object recognition. Promising methods include the local SIFT (scale invariant feature transform) features proposed by Lowe, for instance discussed in US6711293. The dependence on local features is crucial for these methods. The SIFT method is, however, not related to analysing colouring aspects of an object.
Especially the color characteristics of an object are highly sensitive to local illumination circumstances. For this class of appearance effects, color invariant features are known to be very effective in emphasizing the native object characteristics. One publication discussing color invariants is US2003099376. However, this publication is not related to characterizing an object on a local level, and its object recognition power is limited. It is an aspect of the invention to provide an apparatus and method wherein coloring aspects are taken into account in order to improve a recognition ratio of objects to be inspected. It is another aspect of the invention to provide a reliable classification of objects according to characteristics invariant to local imaging and lighting conditions. In another aspect, it is an object to provide a local characterization of imaging areas for producing and reproducing an image. Accordingly, the invention provides a method according to the features of claim 1. In another aspect, the invention provides an apparatus according to the features of claim 16. In particular, by defining one or more image areas of the object; analyzing color and/or intensity transitions within
the image area of a predefined color basis; creating a density profile of said transitions in said image area; and fitting said density profile to a predefined parametrization, a robust object recognition method is provided. This is in particular the case, when these color transitions are made invariant to local lighting conditions using color invariants. Further, in particular, the method conforms to a natural image statistics characterization. Examples of such characterizations are Weibull type distributions or integrated Weibull distribution, also known as Generalized Gaussian or Generalized Laplacian.
The invention will be further elucidated with reference to the figures. In the figures:
Fig 1 illustrates a score chart comparing the inventive method with a prior art recognition strategy;
Fig 2 illustrates another comparison between the inventive method and a prior art recognition strategy; Fig 3 illustrates a local density distribution for various Weibull parameter values;
Fig 4 shows an image to be analyzed;
Fig 5 shows a retina of image analysis kernels for image analysis; and Fig 6 shows an apparatus for visual object recognition according to the invention.
In Fig 1 a score chart is shown comparing the inventive method with a prior art recognition strategy, in particular the visual recognition strategy of the Lowe SIFT patent. It can be shown that where the prior art only has a high recognition score when the accepted fault tolerance is high, the method according to the invention shows a high recognition score at a much smaller fault tolerance. In particular, when a fault tolerance of 20% is accepted, the inventive method has a 95% recognition score, whereas the prior art score is then only 30%. It can be concluded that the method performs particularly well compared to the prior art method.
In Fig 2 another score chart is shown, showing a fault recognition ratio (of a total of 1000 samples) for differing illumination conditions. These conditions are standardized according to the ALOI conditions, as detailed in "The Amsterdam Library of Object Images", Jan-Mark Geusebroek et al., International Journal of Computer Vision 61(1), 103-112, 2005. Here the "l" condition (l1-l8) refers to differing illumination angles; the "r" condition refers to differing viewing angles of the object relative to the camera; and the "c" condition relates to a frontal angle of the camera and corresponding azimuth of the illumination direction. The "i" condition relates to an illumination color, ranging from reddish to white. It appears that the chart of the method of the invention yields a lower error score than the prior art method; for instance, a small rotation gives a dramatic increase of the error score for the prior art method, whereas the inventive method only shows a slight increase of the error score for rotations close to 90° (0° indicating a frontal view).
In Fig 3 a probability distribution is shown for a set of different gamma values (ranging from 0.5 to 2) of a Weibull distribution, to be elaborated further below. It shows that a larger gamma results in a broader distribution with less pronounced tails, corresponding to relatively small local textureness variations in the picture. A smaller gamma results in wilder and more broadly distributed inclinations for the color transitions in the picture.
In Fig 4 a schematic illustration is given of the analysis of an image 1 showing an object 2 using a mask area or retina 3. Here, the mask area is defined by a predefined number of image areas 4 having a predetermined position relative to each other. For each individual image area, an error matching parameter is calculated by fitting a density profile of color transitions in said image area to a predefined parametrization function. In particular, the image area 4 is a Gaussian kernel, given by eq. (9) herebelow.
Furthermore, the scale of the kernel 4 can be adjusted to conform with scaling properties of the object to be inspected.
In particular, the error matching parameter can be provided by eq. (16) and (17) further specified herebelow. An optimal recognition can be obtained by a total error matching parameter of the mask area defined as a product of error matching parameters of said image areas 4.
Fig 5 specifically shows a configuration of a retina or mask area 3. Here a total of 1+6+12+18 = 37 histograms are constructed (for convenience, only a few image areas 4 are referenced), while the kernels are positioned on a hexagonal grid having a spacing distance of roughly 2 σ (the kernel scale). Fig 6 finally shows an apparatus 5 for characterizing an object. The apparatus 5 comprises: an input 6 for receiving a digitized graphical image 7 and a circuit 8 arranged for defining one or more image areas of the object in said digitized graphical image. Accordingly, a number of preselected image areas are defined as explained with reference to Fig 4 and 5.
Furthermore, the apparatus 5 comprises a circuit 9 for receiving digitized input of the image area for analyzing color and/or intensity transitions within the image area of a predefined color basis. These color transitions result in a calculation of color invariant coefficients, as further exemplified below with respect to eqs. (4)-(7). Also, a circuit 10 is provided for creating a density profile based on the transitions calculated in circuit 9 and for fitting said density profile to a predefined parametrization function. The apparatus 5 further comprises an output 11 for providing the matching parameters of said density profile. In particular, for object recognition purposes the apparatus 5 is communicatively coupled to a database 12 of a set of objects comprising predetermined density profile characteristics; and matching circuitry 13 is provided for matching a measured density profile of said object, or characteristics thereof, to said predetermined density profile characteristics for outputting a response 14 in relation to recognizing said object. The matching circuitry 13 is arranged to provide an error matching parameter derived from the measured gamma and beta characteristics of a test density profile relative to a targeted Weibull distribution.
In the remainder, prior to introducing the histogram based invariants according to the invention, a short overview is given of Color Invariant features. The analysis is started by transforming each pixel's RGB value to an opponent color representation,
The rationale behind this transformation is that the RGB sensitivity curves of the camera are transformed to Gaussian basis functions, being the Gaussian and its first and second order derivative. Hence, the transformed values represent an opponent color system, measuring intensity, yellow versus blue, and red versus green.
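As an illustration, a minimal sketch of this decorrelating transform in Python follows. Equation (1) is not reproduced in this text; the matrix below is the one published for the Gaussian color model in the cited Geusebroek et al. "Color Invariance" work and should be read as an assumption, as should the helper name rgb_to_opponent.

```python
import numpy as np

# Gaussian color model matrix (assumed here, from the cited Geusebroek et al.
# work): rows map linear (R, G, B) to intensity E, yellow-blue opponent
# E_lambda, and red-green opponent E_lambdalambda.
M = np.array([[0.06,  0.63,  0.27],
              [0.30,  0.04, -0.35],
              [0.34, -0.60,  0.17]])

def rgb_to_opponent(rgb):
    """rgb: (H, W, 3) float array -> three (H, W) opponent channels."""
    opp = np.einsum('ij,hwj->ihw', M, rgb.astype(float))
    return opp[0], opp[1], opp[2]
```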
Spatial scale is incorporated by convolving the opponent color images with a Gaussian filter,
E(x, y, σ) = G(x, y, σ) * E(x, y)
Eλ(x, y, σ) = G(x, y, σ) * Eλ(x, y)
Eλλ(x, y, σ) = G(x, y, σ) * Eλλ(x, y)   (2)

where G(x, y, σ) is the spatial Gaussian kernel,

G(x, y, σ) = 1/(2 π σ^2) exp(-(x^2 + y^2)/(2 σ^2)).   (3)
Note that one now has a spatial Gaussian times a spectral Gaussian, yielding a combined Gaussian measurement in a spatio-spectral Hilbert space. The spatial derivatives result in the color-NJet, denoted by its components {E, Ex, Ey, ..., Eλ, Eλx, Eλy, ..., Eλλ, Eλλx, Eλλy, ...}.
Photometric invariance is now obtained by considering two non-linear transformations. The first one isolates intensity variation from chromatic variation, and is given by (leaving out parameters)
W_{x^n y^m} = E_{x^n y^m} / E,   n + m ≥ 1   (4)

That is, all spatial derivatives of the intensity image E are normalized by the intensity E. The invariant W measures all intensity fluctuations except for the overall intensity level, that is, edges due to shading, cast shadow, and albedo changes of the object surface. A more strict class of invariance is obtained by considering the chromatic invariant C,
C_{λ x^n y^m} = ∂^{n+m}/(∂x^n ∂y^m) [ Eλ(x, y, σ) / E(x, y, σ) ]   (5)

C_{λλ x^n y^m} = ∂^{n+m}/(∂x^n ∂y^m) [ Eλλ(x, y, σ) / E(x, y, σ) ]   (6)

for which the first order derivatives are given by (leaving out parameters)

Cλx = (Eλx E - Eλ Ex) / E^2,   Cλy = (Eλy E - Eλ Ey) / E^2
Cλλx = (Eλλx E - Eλλ Ex) / E^2,   Cλλy = (Eλλy E - Eλλ Ey) / E^2   (7)
Each of the invariants in C is composed of an algebraic combination of the color-NJet components. For example, Cλx is obtained by filtering the yellow-blue opponent color channel with a first order Gaussian derivative filter, resulting in Eλx. This is pixel-wise multiplied by the Gaussian smoothed version of the intensity channel, E, yielding Eλx · E. The second combination in the numerator of Cλx is obtained by smoothing the yellow-blue opponent channel, and multiplying with the Gaussian derivative of the intensity channel. The two parts are pixel-wise subtracted, and divided by the smoothed intensity squared, yielding the invariant under consideration. The invariant C measures all chromatic variation in the image, disregarding intensity variation; that is, all variation where the color of the pixels changes. These invariants measure point-properties of the scene, and are referred to as point-based invariants.
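A minimal sketch of these point-based invariants follows, using scipy's Gaussian derivative filters; the small regularizing constant eps is an addition of this sketch (to avoid division by zero in flat regions), not part of the definitions above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def point_invariants(E, El, sigma=4.0, eps=1e-8):
    """E: intensity channel; El: yellow-blue opponent channel (2-D arrays)."""
    Es  = gaussian_filter(E,  sigma)                 # smoothed intensity
    Ex  = gaussian_filter(E,  sigma, order=(0, 1))   # x-derivative of intensity
    Els = gaussian_filter(El, sigma)                 # smoothed opponent channel
    Elx = gaussian_filter(El, sigma, order=(0, 1))   # x-derivative of opponent
    Wx  = Ex / (Es + eps)                            # eq. (4): shading/shadow invariant
    Clx = (Elx * Es - Els * Ex) / (Es ** 2 + eps)    # eq. (7): chromatic invariant
    return Wx, Clx
```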
Point-based invariants, as provided above, are well known to be unstable and noise sensitive. Increasing the scale of the Gaussian filters overcomes this partially; however, robustness is traded for invariance. In this section, a new class of invariant features is derived which has high discriminative power, is robust to noise, and improves upon the invariant properties of point-based invariants. The main idea is to construct local histograms of responses for the color invariants given in the previous section. Localization is obtained by estimating the histogram under a kernel. Kernel based descriptors are known to be highly discriminative, and have been successfully applied in tracking applications.
An advantage of the use of an opponent color space, with additional photometric invariant transformations, is that the color values are decorrelated. Hence, for a distinctive image content descriptor, one may as well use the marginal, one-dimensional, distributions for each of the color channels. This is in contrast to the histogram of the full 2D chromatic or 3D color space. In the next sections, the one-dimensional channel histograms of the invariant gradients {Ww, Cλw, Cλλw}, or edge detectors {Wx, Wy, Cλx, Cλy, Cλλx, Cλλy}, are considered separately.
The resulting histograms may be described by parameterized density functions. The parameters act as a new class of photometric and geometric invariants.
Localization and spatial extent (scale) of local histograms are obtained by weighting the contribution of pixels by a kernel,

h(i) = Σ_{x,y} k(x - x0, y - y0) δ[r(x, y) - i]   (8)

where δ is the Kronecker delta function, and where r(x, y) is a discretized version of one of the invariant gradients {Ww, Cλw, Cλλw}, or edge detectors {Wx, Wy, Cλx, Cλy, Cλλx, Cλλy}. The histogram h(i) is constructed by taking all pixels with discretized value i, and adding their contribution, weighted by kernel k(·), to histogram bin i. The choice of kernel should be such that the contribution to the histogram for pixels far away from the origin (x0, y0) approaches zero. A suitable kernel choice is provided by the Gaussian kernel,

k(x, y) = exp(-(x^2 + y^2)/(2 σk^2)).   (9)
The parameter σk represents the size of the kernel, not to be mistaken for the scale σ of the Gaussian filters in the previous section. Hence, there is an "inner" scale at which point measurements are taken, which are accumulated over an "outer" scale into a local histogram. Besides spatial extent, a kernel may be introduced in the contrast direction. This boils down to the use of a kernel density estimator for the histogram of invariant edge responses. Next it will be shown that a known density function may be fitted through the histogram, effectively describing the data. In that case, the accuracy of histogram estimation is not of major concern.
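The local histogram of eq. (8) with the Gaussian kernel of eq. (9) can be sketched as follows; the bin count and the use of numpy's weighted histogram are illustrative choices of this sketch, not prescribed by the text.

```python
import numpy as np

def local_histogram(r, x0, y0, sigma_k, bins=32):
    """r: 2-D invariant response image; returns the kernel-weighted histogram."""
    rows, cols = np.mgrid[0:r.shape[0], 0:r.shape[1]]
    # Gaussian kernel of eq. (9), centred at (x0, y0)
    k = np.exp(-((cols - x0) ** 2 + (rows - y0) ** 2) / (2.0 * sigma_k ** 2))
    # eq. (8): each pixel adds weight k(x - x0, y - y0) to the bin of its value
    h, edges = np.histogram(r, bins=bins, weights=k)
    return h / h.sum(), edges
```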
From natural image statistics research, it is known that histograms of derivative filters can be well modeled by simple distributions. In previous work, we showed that histograms of Gaussian derivative filters in a large collection of images follow a Weibull type distribution. Furthermore, the gradient magnitude for the invariants W and C given above follows a Weibull distribution,
where r represents the response for one of the invariants {Ww, Cλw, Cλλw}. The local histogram of invariants of derivative filters can be well modelled by an integrated Weibull type distribution,

f(r) = γ/(2 β γ^(1/γ) Γ(1/γ)) exp(-(1/γ) |(r - μ)/β|^γ)   (11)

In this case, r represents the response for one of the invariants {Wx, Wy, Cλx, Cλy, Cλλx, Cλλy}. Furthermore, Γ represents the complete Gamma function. See Fig 3 for examples of the distribution. For
the Weibull distribution, an expression for the MLE (Maximum Likelihood Estimation) of β and γ is given by eq. (12), assuming zero-centered data. For the local invariant histograms, the histogram density was converted to Weibull form by first centering the data at the mode μ of the distribution, then multiplying each bin frequency p(ri) by its absolute response value |ri|, and normalizing the distribution. These transformations allow the estimation of the Weibull density parameters β and γ.
Note that the density estimation is only marginally sensitive to histogram quantization effects. Too small a number of bins will yield poor estimates of the Weibull parameters; too many bins will have no influence, the limit being one bin for each data point. In that case, the parameters may as well be estimated from the data directly. In general, this yields optimal estimates, but at the cost of considerable computing time (typically seconds). As a rule of thumb, choosing the number of bins in the order of the one-dimensional effective extent of the kernel K will yield a good estimate of the parameters, at low computational cost (in the order of milliseconds).
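As a sketch of the parameter estimation, scipy's generalized normal distribution (the form the text equates with the integrated Weibull) can stand in for the MLE of eq. (12); approximating the mode μ by the median, and mapping scipy's shape and scale onto γ and β, are assumptions of this sketch rather than the patent's exact estimator.

```python
import numpy as np
from scipy.stats import gennorm

def estimate_weibull_params(responses):
    """Estimate (gamma, beta) from an edge-response sample, discarding mu."""
    r = np.ravel(responses)
    r = r - np.median(r)   # centre at (approximately) the mode mu
    # shape plays the role of gamma, scale of beta (up to parametrization)
    gamma, _, beta = gennorm.fit(r, floc=0.0)
    return gamma, beta
```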
So far, Weibull parameters were estimated from either the gradient magnitude of invariant color edge filters, or directly from the derivative response of these filters. Regarding the former case, rotation invariance is trivially obtained by the rotational symmetry of the gradient operator and the rotational symmetry of the kernel K(·) of eq. (9). However, when assessing the response of derivative filters in the latter case, derivatives are taken in a particular direction. It was found previously that the Weibull parameters are close to elliptical when plotted as a function of orientation. This empirical finding may be explained by the steerable characteristic of the Gaussian derivative filter. If one takes a derivative filter in the x- and y-direction, a derivative in any other direction may be achieved by the linear combination [2]
Eθ = Ex cos θ + Ey sin θ   (14)

where Ex and Ey represent the response to the x- and y-derivative filter, respectively, and where Eθ is the resulting response of a derivative filter in the θ-direction. Each of the Ex and Ey responses is characterized by an integrated Weibull type probability density, although they may have different parameters.
Hence, their weighted sum results in a probability density which is given by the linear convolution of the two densities. As a consequence, the Weibull parameters span ellipses when plotted as a function of angle. The shortest and longest principal axes for β and γ, together with the orientation of the ellipse, indicate the directional structure in the underlying edge map. To achieve rotation invariance, one needs to estimate the longest and shortest principal axes of these ellipses, disregarding the ellipse's orientation. Many methods exist for elliptic fitting. As a simple example, one could estimate γ and β for 0°, 45°, 90°, and 135°, and use least-squares fitting to obtain the shortest and longest axes γs, γl, βs, and βl, which characterize the local histogram invariant to rotation of the original image.
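A sketch of this rotation-invariance step, steering the derivative responses with eq. (14); taking the minimum and maximum over the four orientations in place of a full least-squares ellipse fit, and the injected fit callable, are simplifying assumptions of this sketch.

```python
import numpy as np

def rotation_invariant_params(Ex, Ey, fit):
    """Ex, Ey: derivative responses; fit: callable returning (gamma, beta)."""
    params = []
    for theta in np.deg2rad([0.0, 45.0, 90.0, 135.0]):
        E_theta = Ex * np.cos(theta) + Ey * np.sin(theta)   # eq. (14)
        params.append(fit(E_theta))
    gammas, betas = zip(*params)
    # approximate short/long ellipse axes: gamma_s, gamma_l, beta_s, beta_l
    return min(gammas), max(gammas), min(betas), max(betas)
```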
For translational invariance, consider the two steps of the algorithm, being the edge detection and the local histogram formation. The edge detection boils down to a combination of convolution operators, which is inherently translation invariant. The local histogram formation is parameterized by the kernel K(x; y; σk) Eq. (9), fixed at its location (x; y). Translation invariance here is obtained by a dense sampling over all image locations, where "dense" implies a sampling such that kernel centers are typically σk apart. Hence,
translational invariance is ensured by the convolution operator, followed by densely sampling the local histograms. Regarding scale invariance, both the invariant edge detectors and the local histogram kernel have a scale parameter. Scale invariance is achieved by sampling over increasing scale. Scale selection methods can be applied to detect locally the optimal scale for describing image content [10].
Consider the parameters μ, β, and γ estimated from the local histograms of point-invariants. Recall that the point invariants W and C measure invariant edge strength. The local histogram of Ww represents the probability of finding an edge with contrast w within the local region described by a kernel K. Smooth transitions in the image, smooth compared to the size of the kernel K, will cause a shift in the edge histogram; hence, its mode μ will be shifted from zero. Such shifts are typically caused by uneven illumination, large scale shading effects, and, most prominently in the chromatic C invariant, by colored illumination. Hence, to achieve color constancy, the value of μ may be ignored. The remaining parameters β and γ indicate the (local) edge contrast and the (local) roughness or textureness, respectively [5].
To assess the difference between two local histograms, fitted by a Weibull distribution, a goodness-of-fit test is considered between the two respective distributions. A sensitive goodness-of-fit test is obtained when considering the integrated squared error between the cumulative distributions, which is obtained by the Cramér-von Mises statistic,

ω^2 = ∫ (F*(r) - F(r))^2 dF(r)   (15)
Here, F* represents the test distribution, and F the target cumulative distribution function under consideration. For the Weibull distribution with different parameters γ1, β1 and γ2, β2, a first order Taylor approximation boils down to the log difference between the parameters. Hence, the ratio between the parameter values is assessed,

εγ = 1 - γ-/γ+,   εβ = 1 - β-/β+   (16)

where γ+ and γ- are the largest and smallest of γ1 and γ2, respectively. Similarly, β+ and β- represent the corresponding values for β. In this way, the errors are normalized between zero and one. Due to the independence between the γ and β parameters, the total error is obtained by
ε = εγ εβ   (17)

Note that, for non-overlapping histograms taken at various locations, the histograms may be assumed to be independent, and errors may be multiplied to yield a total error.
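The error measure of eqs. (16) and (17), together with the multiplication over non-overlapping histograms, amounts to the following sketch:

```python
def match_error(gamma1, beta1, gamma2, beta2):
    e_gamma = 1.0 - min(gamma1, gamma2) / max(gamma1, gamma2)  # eq. (16)
    e_beta  = 1.0 - min(beta1, beta2) / max(beta1, beta2)      # eq. (16)
    return e_gamma * e_beta                                    # eq. (17)

def total_error(test_params, target_params):
    """Product of errors over independent, non-overlapping local histograms."""
    total = 1.0
    for (g1, b1), (g2, b2) in zip(test_params, target_params):
        total *= match_error(g1, b1, g2, b2)
    return total
```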
To assess the constancy of the proposed invariants under varying illumination conditions, the features have been applied to the ALOI collection. The collection consists of 1,000 objects recorded under various imaging circumstances. Specifically, viewing angle, illumination angle, and illumination color are systematically varied for each object, resulting in over a hundred images of each object. Color constancy, one of the hardest cases of illumination invariance, is tested by assessing the variation in the parameters of the Weibull fit as a function of illumination color. Therefore, the i110, ..., i250 recordings of the 1,000 objects in the ALOI collection are considered, yielding black-body illumination in the range of 2175 up to 3075 Kelvin. Parameters were fitted for the central local histogram of the collection, at a scale σk = 30 pixels. The scale for the derivative filters was set at σ = 4 pixels. To compare the proposed features with point-based invariants, the experiments were repeated for Gaussian point-based features at scale σ = 4 and at scale σ = 30, respectively. In this way, insight is gained in the behavior of the features on
which the local histograms are based. Furthermore, a comparison with point-based features having the same spatial extent as the local histograms can be made.
Discriminative power was tested by counting the number of patches which could not be distinguished within the 1,000 objects in the ALOI collection. Therefore, the local histogram at the centre of the 18c1 image was constructed, and values were compared against the same patch for the other objects. A patch was counted as indistinguishable if values were within the relative error at i = 110 in the previous graphs. If one only considers intensity W, 858 and 679 patches are indistinguishable for σ = 4 and σ = 30, respectively. For β, 672 patches are not distinguishable out of 920. For γ, 742 patches are indistinguishable. If one combines γ and β, only 29 objects are indistinguishable. For chromatic variation only, that is Cλ and Cλλ, 884 and 816 patches are indistinguishable for σ = 1 and σ = 7.5, respectively. For β, 636 patches are not distinguishable out of 920. For γ, 565 patches are indistinguishable. If one combines γ and β, only 91 objects are indistinguishable. Most discriminative power is obtained if one combines intensity and chromaticity by taking W, Cλ, and Cλλ into account. In that case, 629 and 170 patches are indistinguishable for the point-based invariants at σ = 4 and σ = 30, respectively. For β, 75 patches are not distinguishable out of 920. For γ, 28 patches are indistinguishable. If one combines γ and β, all 920 objects can be recognized. Hence, the newly proposed features outperform point-based invariants with respect to discriminative power.
To illustrate the effectiveness of the proposed features in capturing object properties, a simple algorithm for object recognition and localization is suggested. An object is characterized by learning the invariant Weibull parameters at fixed locations in the training image, representing a sort of fixed "retina" of receptive fields as discussed with reference to Figs 4 and 5. The kernels are positioned at a hexagonal grid, 2σ apart, on three rings from the
center of the image. Hence, a total of 1 + 6 + 12 + 18 = 37 histograms are constructed. For each histogram, the invariant Weibull parameters are estimated. The same retinal structure is swept over the target image, and values are compared (Eq. (16)) against the object under consideration (or a database). Hence, the example objects are searched within the composition. The proposed recognition algorithm runs at two frames per second, allowing close to real time recognition rates.
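The retina construction can be sketched as follows; placing the 6, 12, and 18 ring points at equal angles on circles of radius 2σk, 4σk, and 6σk is an approximation of the hexagonal grid described above, assumed for simplicity.

```python
import numpy as np

def retina_centres(cx, cy, sigma_k):
    """1 + 6 + 12 + 18 = 37 kernel centres on three rings around (cx, cy)."""
    centres = [(cx, cy)]
    for ring in (1, 2, 3):
        n = 6 * ring                    # points on this ring
        radius = 2.0 * sigma_k * ring   # rings spaced 2*sigma_k apart
        for j in range(n):
            a = 2.0 * np.pi * j / n
            centres.append((cx + radius * np.cos(a), cy + radius * np.sin(a)))
    return centres
```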
It was found that the algorithm is not too sensitive to differences in background. Even a cluttered background yields correct recognition, and despite varying illumination conditions, the proposed features are well able to capture the intrinsic texture of the material. Furthermore, tests were performed where a composite image was captured by a high-quality CMOS 3RGB camera (Sigma SD9) at night, with electronic flash. The sensitivity curves of the CMOS RGB sensors are known to be highly non-linear. The example objects were captured by a digital video camera (JVC GR-DX25), with digital video compressed output, under daylight/direct sunlight (the bear). Without any calibration, the change in equipment appears not to influence the results, demonstrating the robustness of the proposed features. The proposed features appear to be highly stable against compression artifacts.
Although the method has been explained with reference to a photometric reflectance model, other models can also be used for determining photometric invariants, for example, by deriving coloring coefficients from a transmitted light model (for instance, for the purposes of image analysis in light microscopy), a scattered, translucent, or diffused light model (for example, in the analysis of images with diffused light, such as translucent plastics), or a fluorescent light model (for instance, for purposes of cell classification methods in fluorescence microscopy/flow cytometry). Furthermore, while the method has been explained with reference to fitting to Weibull functions, generally, given the discretized probability density of the values occurring in the edge image, other functions can also be fitted through the histogram values in order to summarize the content of the histogram by the parameters of a simple function. Such functions should have a settable origin, a settable width, and optionally a settable peakedness. For example, consider a Gaussian function
G(r; m, σ) = 1/(sqrt(2 π) σ) exp(-(1/2)((r - m)/σ)^2), which has an origin given by its mean "m" and a width given by its standard deviation "σ".
Alternatively, consider a double exponential, e(r; m, σ) = 1/(2 σ) exp(-|r - m|/σ), with origin given by m and width given by σ. Furthermore, consider the integral form of the Weibull distribution (Weibull, 1951), f(r; m, b, g) = g/(2 b g^(1/g) Γ(1/g)) exp(-(1/g)|(r - m)/b|^g), where Γ(t) is the complete Gamma function, and where m denotes the origin, b the width, and g the peakedness of the distribution.
In the case of the gradient magnitude, which is an absolute positive quantity obtained by combining orthogonal filter responses, grad_mag = sqrt(Filter_x^2 + Filter_y^2), the response distribution is best characterized by a simple distribution limited to 0 at one extreme and extending to infinity at the other, having a settable width and optionally a skewness parameter.
Consider for example the Rayleigh distribution, f(x; b) = (2x/b^2) exp(-(x/b)^2), where b represents the width, or alternatively the Weibull distribution, f(x; b, g) = (g/b)(x/b)^(g-1) exp(-(x/b)^g), where b indicates its width and g its skewness. Of course, many additional alternatives can be constructed. Furthermore, the class of "simple" peaked functions is well approximated by combinations of other simple functions, for example by a Mixture of Gaussians.
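For reference, the candidate density functions named above, with the normalizations as reconstructed in this text, read as follows in plain Python:

```python
import numpy as np
from scipy.special import gamma as Gamma

def gaussian(r, m, s):                 # origin m, width s
    return np.exp(-0.5 * ((r - m) / s) ** 2) / (np.sqrt(2.0 * np.pi) * s)

def double_exponential(r, m, s):       # origin m, width s
    return np.exp(-np.abs(r - m) / s) / (2.0 * s)

def integrated_weibull(r, m, b, g):    # origin m, width b, peakedness g
    c = g / (2.0 * b * g ** (1.0 / g) * Gamma(1.0 / g))
    return c * np.exp(-np.abs((r - m) / b) ** g / g)

def rayleigh(x, b):                    # width b, support x >= 0
    return (2.0 * x / b ** 2) * np.exp(-((x / b) ** 2))

def weibull(x, b, g):                  # width b, skewness g, support x >= 0
    return (g / b) * (x / b) ** (g - 1) * np.exp(-((x / b) ** g))
```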
The parameters of the simple function can be optimized such that the parameterized function optimally or sub-optimally fits the histogram. Then, the values of the histogram are characterized by the chosen function and its parameters. Hence, the object representation is now coded by the few parameters of the simple function rather than by the original discretized values of the histogram. This results in a large data reduction, such that more objects can be stored in a small memory than with the original histogram representation. Consequently, more objects may be stored and faster search times are achieved. Additionally, invariant properties of the parameters can be characterized. Furthermore, while the method and apparatus have been discussed in the context of object recognition, more generally, object characterization is well within the scope of the invention. These aspects can be used in a broader context, for instance, in the context of rendering virtual reality images, compression technology, or image classification in search machines. While the invention has been described with reference to the drawings, it is not limited thereto; these embodiments are for illustrative purposes only, and variations and modifications to the basic inventive concept are possible without departing from the scope of the claims, as defined hereinafter.
Claims
1. Method of characterizing an object comprising:
- defining one or more image areas of the object;
- analyzing color and/or intensity transitions within the image area of a predefined color basis;
- creating a density profile of said transitions in said image area; and
- fitting said density profile to a predefined parametrization function.
2. Method according to claim 1, wherein said parametrization conforms to a natural image statistics characterization.
3. Method according to claim 1, wherein said parametrization conforms to a Weibull distribution or integrated Weibull distribution.
4. Method according to any of claims 1-3, further comprising defining an error matching parameter as a function of a gamma error and a beta error for gamma and beta derived from a test density profile relative to a targeted Weibull distribution.
5. Method according to claim 4, wherein said gamma error ε(γ) is given by: ε(γ) = 1 - γ-/γ+; wherein γ- is the smaller and γ+ is the larger of the γ parameters of the test and target distributions; and wherein said β error ε(β) is given by: ε(β) = 1 - β-/β+; wherein β- is the smaller and β+ is the larger of the β parameters of the test and target distributions.
6. Method according to claims 4 and 5, further comprising a step of analysing color transitions in at least two different directions; comparing the density distributions in said at least two directions; and deriving a rotation invariant feature from said at least two density distributions.
7. Method according to claim 6, wherein said rotation invariant feature is provided from the group of a long and a short axis of an ellipse describing a Weibull parameter curve as a function of rotation.
8. Method according to claim 1, further comprising a mask area comprising a predefined number of image areas having a predetermined position relative to each other, wherein a total error matching parameter of the mask area is defined as the product of error matching parameters of said image areas.
9. Method according to claim 1, wherein said image area is defined by a Gaussian kernel.
10. Method according to any of the preceding claims, wherein said method further comprises providing a database of a set of objects comprising predetermined density profile characteristics; and matching a measured density profile of said object, or characteristics thereof, to said predetermined density profile characteristics for recognizing said object.
11. Method according to claim 1, wherein said color basis is an invariant color basis providing point coloring characteristics of said image which are invariant to lighting and imaging conditions.
12. Method according to claim 11, wherein said invariant color basis is formed by coloring coefficients derived from a photometric reflectance model of imaging light reflected by said object.
13. Method according to claim 11, wherein said invariant color basis is formed by coloring coefficients derived from a transmitted light model, a scattered, translucent, or diffused light model, or a fluorescent light model.
14. Method according to claim 12, wherein said color invariant coefficients are expressed by spatial derivatives W of a normalized intensity value; and spatial derivatives C of a chromatic variation value of opponent color images transforming red, green and blue into intensity, yellow/blue contrast and red/green contrast.
15. Method according to claim 1, wherein a density profile is created from an image area by spatially integrating said point coloring characteristics by a weighted contribution in a predefined image area.
16. Apparatus for characterizing an object comprising:
- an input for receiving a digitized graphical image;
- a circuit arranged for defining one or more image areas of the object;
— analyzing color and/or intensity transitions within the image area of a predefined color basis;
— creating a density profile of said transitions in said image area; and
— fitting said density profile to a predefined parametrization function; and
- an output for providing the matching parameters of said density profile.
17. Apparatus according to claim 16, wherein the apparatus is further communicatively coupled to a database of a set of objects comprising predetermined density profile characteristics; and matching circuitry is provided for matching a measured density profile of said object, or characteristics thereof, to said predetermined density profile characteristics for recognizing said object.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/NL2005/000485 WO2007004864A1 (en) | 2005-07-06 | 2005-07-06 | Method and apparatus for visual object recognition |
PCT/NL2006/000328 WO2007004868A1 (en) | 2005-07-06 | 2006-07-03 | Method and apparatus for image characterization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/NL2005/000485 WO2007004864A1 (en) | 2005-07-06 | 2005-07-06 | Method and apparatus for visual object recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2007004864A1 (en) | 2007-01-11 |
Family
ID=34981303
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/NL2005/000485 WO2007004864A1 (en) | 2005-07-06 | 2005-07-06 | Method and apparatus for visual object recognition |
PCT/NL2006/000328 WO2007004868A1 (en) | 2005-07-06 | 2006-07-03 | Method and apparatus for image characterization |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/NL2006/000328 WO2007004868A1 (en) | 2005-07-06 | 2006-07-03 | Method and apparatus for image characterization |
Country Status (1)
Country | Link |
---|---|
WO (2) | WO2007004864A1 (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8015131B2 (en) | 2007-10-12 | 2011-09-06 | Microsoft Corporation | Learning tradeoffs between discriminative power and invariance of classifiers |
CN106682157A (en) * | 2016-12-24 | 2017-05-17 | 辽宁师范大学 | Image retrieval method based on Weibull distribution parameters |
CN108447058A (en) * | 2018-03-30 | 2018-08-24 | 北京理工大学 | A kind of image quality evaluating method and system |
CN118097289A (en) * | 2024-03-15 | 2024-05-28 | 华南理工大学 | Open world target detection method based on visual large model enhancement |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2914984B1 (en) | 2012-11-02 | 2019-02-20 | Exxonmobil Upstream Research Company | Analyzing seismic data |
WO2015089115A1 (en) | 2013-12-09 | 2015-06-18 | Nant Holdings Ip, Llc | Feature density object classification, systems and methods |
US11386636B2 (en) | 2019-04-04 | 2022-07-12 | Datalogic Usa, Inc. | Image preprocessing for optical character recognition |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1018700A2 (en) * | 1999-01-08 | 2000-07-12 | Omron Corporation | An image recognition device using pattern elements |
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1018700A2 (en) * | 1999-01-08 | 2000-07-12 | Omron Corporation | An image recognition device using pattern elements |
Non-Patent Citations (5)
Title |
---|
CHANG C-C ET AL: "A Color Image Retrieval Method Based on Local Histogram", LECTURE NOTES IN COMPUTER SCIENCE, SPRINGER VERLAG, NEW YORK, NY, US, vol. 2195, 2001, pages 831 - 836, XP002319297, ISSN: 0302-9743 * |
GEUSEBROEK J -M ET AL: "A six-stimulus theory for stochastic texture", INTERNATIONAL JOURNAL OF COMPUTER VISION KLUWER ACADEMIC PUBLISHERS NETHERLANDS, vol. 62, no. 1-2, April 2005 (2005-04-01), pages 7 - 16, XP002347919, ISSN: 0920-5691 * |
GEUSEBROEK J-M ET AL: "COLOR INVARIANCE", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, IEEE SERVICE CENTER, LOS ALAMITOS, CA, US, vol. 23, no. 12, December 2001 (2001-12-01), pages 1338 - 1350, XP001141668, ISSN: 0162-8828 *
GEVERS T ET AL: "Robust histogram construction from color invariants for object recognition", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE IEEE COMPUT. SOC USA, vol. 26, no. 1, January 2004 (2004-01-01), pages 113 - 118, XP002347920, ISSN: 0162-8828 * |
HEALEY G ET AL: "USING ILLUMINATION INVARIANT COLOR HISTOGRAM DESCRIPTORS FOR RECOGNITION", PROCEEDINGS OF THE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION. SEATTLE, JUNE 21 - 23, 1994, LOS ALAMITOS, IEEE COMP. SOC. PRESS, US, 21 June 1994 (1994-06-21), pages 355 - 360, XP000515863, ISBN: 0-8186-5827-4 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8015131B2 (en) | 2007-10-12 | 2011-09-06 | Microsoft Corporation | Learning tradeoffs between discriminative power and invariance of classifiers |
CN106682157A (en) * | 2016-12-24 | 2017-05-17 | 辽宁师范大学 | Image retrieval method based on Weibull distribution parameters |
CN108447058A (en) * | 2018-03-30 | 2018-08-24 | 北京理工大学 | A kind of image quality evaluating method and system |
CN108447058B (en) * | 2018-03-30 | 2020-07-14 | 北京理工大学 | Image quality evaluation method and system |
CN118097289A (en) * | 2024-03-15 | 2024-05-28 | 华南理工大学 | Open world target detection method based on visual large model enhancement |
Also Published As
Publication number | Publication date |
---|---|
WO2007004868A1 (en) | 2007-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Jagalingam et al. | A review of quality metrics for fused image | |
Finlayson et al. | On the removal of shadows from images | |
US8478040B2 (en) | Identification apparatus and method for identifying properties of an object detected by a video surveillance camera | |
JP4997252B2 (en) | How to identify the illumination area in an image | |
Finlayson et al. | Entropy minimization for shadow removal | |
Riess et al. | Scene illumination as an indicator of image manipulation | |
US7760942B2 (en) | Methods for discriminating moving objects in motion image sequences | |
CN102308306B (en) | A constraint generator for use in image segregation | |
US8144975B2 (en) | Method for using image depth information | |
Klinker et al. | Image segmentation and reflection analysis through color | |
Tan et al. | Separation of highlight reflections on textured surfaces | |
WO2007004864A1 (en) | Method and apparatus for visual object recognition | |
US20130342694A1 (en) | Method and system for use of intrinsic images in an automotive driver-vehicle-assistance device | |
WO2018223267A1 (en) | Method and system for hyperspectral light field imaging | |
Kagarlitsky et al. | Piecewise-consistent color mappings of images acquired under various conditions | |
US20140294296A1 (en) | Spatially varying log-chromaticity normals for use in an image process | |
US9754155B2 (en) | Method and system for generating intrinsic images using a single reflectance technique | |
CN110321869A (en) | Personnel's detection and extracting method based on Multiscale Fusion network | |
Erener et al. | A methodology for land use change detection of high resolution pan images based on texture analysis | |
Zhou et al. | Recognizing black point in wheat kernels and determining its extent using multidimensional feature extraction and a naive Bayes classifier | |
US8934735B2 (en) | Oriented, spatio-spectral illumination constraints for use in an image progress | |
Jacq et al. | Structure-from-motion, multi-view stereo photogrammetry applied to line-scan sediment core images | |
Tatar et al. | a New Object-Based Framework to Detect Shodows in High-Resolution Satellite Imagery Over Urban Areas | |
Lauziere et al. | Autonomous physics-based color learning under daylight | |
Panetta et al. | Techniques for detection and classification of edges in color images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 122 | Ep: pct application non-entry in european phase | Ref document number: 05759862; Country of ref document: EP; Kind code of ref document: A1 |