WO2006090313A2 - Object recognition using ADRC (adaptive dynamic range coding)

Info

Publication number
WO2006090313A2
Authority
WO
WIPO (PCT)
Prior art keywords
type
digital data
digital
representation
data representation
Prior art date
Application number
PCT/IB2006/050518
Other languages
French (fr)
Other versions
WO2006090313A3 (en)
Inventor
Ahmet Ekin
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
Filing date
Publication date
Application filed by Koninklijke Philips Electronics N.V.
Publication of WO2006090313A2
Publication of WO2006090313A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507 Summing image-intensity values; Histogram projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/758 Involving statistics of pixels or of feature values, e.g. histogram matching

Definitions

  • the invention relates to identification of a type of one or more digital objects.
  • the invention relates to type identification by means of texture-based features.
  • Object detection in digital image data is interesting for a number of applications, such as a surveillance system, a face recognition system, a video-based computer/machine interface, etc. It has proven a difficult task to provide a machine, e.g. a computer-based machine with the ability of recognizing a type of an object in an image with a high certainty, i.e. assigning an abstract type to a concrete ensemble of image elements.
  • Methods of identifying a type of a digital object include differentiating one object from the other by means of texture features.
  • texture features such as Gabor filter coefficients
  • the problem with using color-based features is the variability of object color features, such as due to changes in illumination, image capturing parameters, and the employed color spaces.
  • the inventor of the present invention has appreciated that an improved method for type identification of a digital object is of benefit, and has in consequence devised the present invention.
  • the present invention seeks to provide improved means for identifying one or more objects in a digital data representation.
  • the invention alleviates or mitigates one or more of the above or other disadvantages singly or in any combination.
  • a method for identifying a type of one or more digital objects in a digital data representation comprising the steps of: transforming at least a part of the digital data representation from a first form to a second form, and identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in a type repository, wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
  • ADRC Adaptive Dynamic Range Coding
  • a digital data representation may be an image, such as a 2D image made up of pixels, it may be a 3D image made up of voxels, it may be a stream of images, such as a video stream, etc.
  • the format of the image may be any type of format, such as standard image and video stream formats.
  • An object in the image may be any kind of image object, such as a graphical object defined by a selection of image elements.
  • the digital data representation may initially be in a first form, and at least a part of the digital data representation may be transformed into a second form.
  • the second form may be obtained by means of running one or more algorithms, one or more mathematical transformations, etc. on the data to obtain the digital data in the second form.
  • the data may in the transformation process be present in one or more intermediate forms, e.g. in connection with running a number of algorithms.
  • the dimensionality of the data may be altered in the transformation, e.g. 2D and/or 3D image data may be transformed into 1D data.
  • the transformation to the second form may include statistical data analysis.
  • a repository of type data may be consulted, and based on a probabilistic comparison between the data in the second form and the type data in the repository, a type of the object may be identified.
  • the probabilistic comparison may include a likelihood analysis assessing a statistical likelihood between the data in the second form and the type data to determine whether or not an object of a specific type is represented in the type repository.
  • the type repository may comprise one or more data sets, each data set corresponding to data of a specific type or data specifically not corresponding to a given type.
  • An ADRC process is a method to efficiently extract texture characteristics of image data. See e.g. US 5,241,381 and US 5,825,313.
  • a range of advantages may be attributed to identifying a type by means of an ADRC process.
  • ADRC processing offers a means for fast extraction of texture features in a digital image. Texture-based features bring about large descriptive power for object detection; extracting them, however, is typically costly. The present invention circumvents this problem by describing objects by easily computable ADRC features.
  • ADRC features are in widespread use for spatial image up-scaling and temporal video up-conversion.
  • ADRC features may therefore, in a number of applications, such as TV-sets, DVD-players etc., already be computed for the spatial and/or temporal up-sampling applications, and the current invention may immediately benefit from this, since after only a minor extension, available temporal and spatial up-sampling architectures can be provided with detection capability. Even if such ADRC features are not already provided for other purposes, the features may be extracted at low cost.
  • the features as defined in claim 2 have the advantage that a histogram representation facilitates fast and easy extraction of texture feature statistics, fast and easy as compared to standard texture feature analytical tools, such as texture features based on Gabor filter responses.
  • the features as defined in claim 3 have the advantage that a pattern of characteristics, such as texture features of the image elements located near the image element of interest, may be chosen based on a type of the object. This is advantageous since texture features may differ between object types, and a pattern may be chosen which is known to provide a high certainty in type identification. Alternatively, in a situation where a type may not be identified with a sufficiently high certainty, different patterns may be tried to improve the certainty. Another advantage may be that the complexity of the pattern, e.g. the size of the pattern and/or the inclusion of different weights for different pixel elements in the pattern, may be altered in accordance with the type of digital data representation, resolution of the digital data representation, computational power, accepted calculation time, etc.
  • the feature as defined in claim 4 has the advantage that a mathematical formalism describing the probability density function (pdf) is available. The probability density function formalism is a generic object detection framework applicable to any type of object, ensuring robust and trustworthy analysis. Furthermore, the construction of a non-parametric histogram pdf is faster than the construction of parametric pdfs, such as the ones based on Gaussian mixture models (GMMs).
  • the feature as defined in claim 5 has the advantage that by comparison to both type data and type data different from the type, e.g. non-type data, a more efficient type identification may be provided.
  • the feature as defined in claim 6 has the advantage that type data may be generated at a specified identification resolution, or at a number of predefined identification resolutions. This is advantageous since the type data may be present in a ready-to-use form which may be easily accessed, e.g. in look-up tables.
  • an object detector for identifying a type of one or more digital objects in a digital data representation, the object detector comprising: a transformer for transforming at least a part of the digital data representation from a first form to a second form, and an analyzer for identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in a type repository, wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
  • the method according to the first aspect of the invention may be implemented in a device, such as a stand-alone device or a module suitable for implementation in a device for providing object identification capability to the device.
  • the implementation may be provided by means of software implementation or hardware implementation, e.g. in an implementation comprising one or more ICs, or any other suitable way of implementation.
  • IC for identification of a type of one or more digital objects in a digital data representation, the IC being adapted to identify a digital object according to the first aspect of the invention.
  • the IC may be a single chip, a group of chips or an electronic circuit comprising a variety of electronic components. In particular, the IC may be incorporated as a part of the object detector according to the second aspect of the invention.
  • a computer readable code for identification of a type of one or more digital objects in a digital data representation, the code being adapted to implement the method according to the first aspect of the invention.
  • the computer readable code may be implemented in an object detector according to the second aspect of the invention and/or in an IC according to the third aspect of the invention.
  • a system for identification of a type of one or more digital objects in a digital data representation comprising: an input module for inputting at least a part of the digital data representation, a transforming module for transforming the digital data representation accessed from the input module from a first form to a second form, a repository for storing type data, an identification module for identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in the type repository, and an output module for outputting a type of the identified one or more digital objects, wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
  • the input module may be a software application or a hardware section, e.g. an interface means for interfacing one or more signals, such as data streams, to the transforming module comprising a processing means.
  • the input module may be any type of means provided for feeding or providing one or more signals or data to the transforming module.
  • the input signal may be an output signal from a given unit, e.g. an input signal may be a signal provided by a processing means storing digital data for visual or other purposes.
  • the transforming module and the identification module may comprise separate or shared processing means.
  • the processing means may be any type of processing means, either dedicated processing means or processing means that are part of a general purpose computer, e.g. in the form of a computer program.
  • the output module may be a storage means enabling access to the result or the output module, e.g. as an intermediate step in connection with showing the result graphically.
  • the system may be a system for implementing the method according to the first aspect of the invention and/or a system including the object detector according to the second aspect of the invention. Furthermore, the system may implement IC means according to the third aspect of the invention, as well as computer readable code according to the fourth aspect of the invention. In general, the various aspects of the invention may be combined and coupled in any way possible within the scope of the invention.
  • Fig. 1 shows a general scheme illustrating an embodiment of the present invention
  • Fig. 2 shows an example of a digital data representation in the first form
  • Fig. 3 shows a schematic illustration of a selection of image elements located near an image element of interest
  • Fig. 4 shows plots of histograms computed from a collection of face and non-face images.
  • A general scheme illustrating an embodiment of the present invention for identifying a type of one or more digital objects in a digital data representation is presented in Fig. 1.
  • a specific type of an object refers to, for example, whether an object is a car, a face, a text region, etc.
  • a digital data representation is in connection with the description of Figs. 1 to 4 exemplified by a 2D image. It is however to be understood that a digital data representation is not limited to a 2D image.
  • In Fig. 1, a digital object is provided as a digital data representation in a first form in a first step 10.
  • the digital data representation of the object is transformed into a second form in another step 11.
  • In step 12, the digital data representation in the second form is compared to type data in a type repository 13, resulting in a type identification 14 of the object present in the digital data representation in the first form.
  • the digital data representation in the first form may be an image represented in a bitmap type representation, such as a standard bitmap format, e.g. a jpeg, gif, bmp, etc. format.
  • the image in the first form may be transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process, where ADRC features are extracted from the image in the first form, and used to generate the second form.
  • An example of an image 20 in the first form is provided in Fig. 2.
  • a part 21 of the image, i.e. a selection of image elements (pixels 22), may be analyzed in a pixel-by-pixel manner, where a value (ADRC value) may be assigned to each pixel in the selection of image elements.
  • the image shown in Fig. 2 shows only one object 23, being a face, however it is to be understood that more than one object may be present in an image.
  • the present invention may be applied to more objects, e.g. by selecting one object to be identified or by sequential (or parallel) identification of more than one object in an image.
  • ADRC features are computed for a plurality of image elements, typically a local window with a given aperture size (aperture sizes of 3x3, 5x5 and 7x7 are common), which is shifted pixel by pixel to span the whole image.
  • A schematic illustration of an area 30 containing 5x5 pixels is shown in Fig. 3.
  • the local area being a part of the selection of image elements.
  • the ADRC value of a pixel (here illustrated for the pixel designated 0) is computed for the given window (aperture) size 31.
  • the ADRC value is based on the computation of the average (I_avg) of the pixel intensity values in the window and the assignment of the pixels in the same window to a level based on their differences from that average value, referred to as L-levels.
  • In the two-level case, a pixel whose intensity is above the average value is assigned to one, and otherwise zero.
  • 511 (2^(3x3) - 1) possible patterns for a window (all zero or all one cases, depending on where the equality is assigned; if the all one case is impossible to realize due to the definition of ADRC, two is subtracted).
  • the resulting patterns of the local windows are classified into a set of classes.
  • the patterns are thus a pattern of texture characteristics (here whether the individual pixels are darker or lighter than the average) of the image elements located near the image element of interest, i.e. the center pixel 0.
  • the local window is in the present embodiment a square including nearest and next nearest neighbors, however other shapes may be employed, such as shapes with an envelope shape of a rectangle, a triangle, a polygon, a circle, an ellipse, etc.
  • For the window designated 31 in Fig. 3A, the average intensity is calculated and the intensities of the pixels 0-8 are evaluated according to whether they possess a greater or a smaller intensity than the average value.
  • This is exemplified in Fig. 3B, where pixels 1, 7 and 8 are found (by way of example) to be darker whereas the pixels 0 and 2-6 are found to be brighter.
  • a class of the pattern is computed and assigned to the center pixel of the window, i.e. to pixel 0.
  • the class may e.g. be a corresponding binary number; for instance, the class of the window illustrated in Fig. 3B may be the binary number 011111100, which is assigned to pixel 0.
  • each pixel can be assigned a value in the range [0, L^(MxM) - 1]. These values are binned in a histogram, where each bin counts the number of times a given pattern appears over the pixels in the image or in a subsection of the image.
  • the ADRC values are used to obtain the digital data representation in the second form. In a given embodiment, the histogram of the ADRC values, provided as explained above, may be taken as the data representation in the second form.
  • the ADRC values can be described in terms of object probability density functions (pdfs).
  • a pdf can be described either in parametric form or in non-parametric form.
  • Parametric pdfs are powerful if the assumed distribution is true. If it is not, however, the estimated pdf will be very different from the true one.
  • non-parametric descriptors do not depend on such assumptions. As a result, the use of a non-parametric descriptor (a histogram) may be preferable to represent the pdf of an object.
  • identification of a type of an object is exemplified in terms of face recognition, i.e. the type of the object is taken to be a face.
  • a type repository i.e. a face repository
  • ADRC statistics are collected for a number of face images to compute a face histogram (H_face).
  • H_face face histogram
  • a face histogram 40 is illustrated in Fig. 4A. It may be more efficient to also model non-faces, by collecting images that do not contain faces and building a non-face histogram (H_non-face) from those images.
  • a non-face histogram 41 is illustrated in Fig. 4B.
  • the ADRC statistics are collected to compute the histogram of the unknown object, H_unknown.
  • the statistics may be computed from a block of pixels with a given identification resolution (e.g. a fixed block size of 24x24). Larger objects can be detected by down-sampling the image so that the object size gets close to the predefined block size, and vice versa, smaller objects can be up-sampled.
  • the size-independence of the object detector can be guaranteed by running the detector at multiple scales.
  • a probabilistic-based comparison may be made, in this embodiment by computing the similarity of H_unknown to H_face and H_non-face.
  • the similarities to the face histogram (S_face) and to the non-face histogram (S_non-face) are found by employing a popular histogram similarity metric.
  • an embodiment utilizing histogram intersection as defined below for histograms H_1 and H_2 is explained.
  • Detection decision may in a given embodiment be based on the similarity values and several thresholds.
  • a type, i.e. a face, may be set as detected if the following conditions are satisfied:
  • condition may be employed, and a routine may be present for adjusting the detection condition; likewise, other thresholds may be used.
  • the values of 0.75 and 0.1 are mentioned only for illustrative purposes, and should not be taken as a limitation. Any suitable threshold value may be assumed, and the threshold values may depend upon the type of object to be identified, the nature of the digital data representation, etc.
  • a routine may be present for adjusting the threshold values in a situation of use.
  • Figs. 4A and 4B show the plots of histograms H_face and H_non-face that are computed from a collection of face and non-face images, respectively.
  • the attainable correct type identification rate may in a situation of use depend upon a number of features, such as upon the type of object to be identified, the nature of the digital data representation, nature and quality of the type data etc.
  • certain bins, i.e. certain patterns, may be weighted. For example, a certain type of object may be very sensitive to the occurrence of certain patterns, and large weights may be given to these bins:
  • Object data and non-object data may be provided in a variety of ways.
  • An application may be provided with given data sets from the outset, i.e. given object histograms and non-object histograms.
  • histograms may be provided in a training process.
  • the weights assigned to bins, as explained above, may be learned in a training process.
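
The similarity computation and detection decision described in the list above can be sketched as follows. This is a minimal sketch under stated assumptions, not the formulation of the application: histogram intersection is taken in its common form (the sum of bin-wise minima of two normalized histograms), and the function names, the optional per-bin `weights` argument, and the way the illustrative thresholds 0.75 and 0.1 are combined (a minimum similarity to the type histogram plus a minimum margin over the non-type histogram) are hypothetical.

```python
def histogram_intersection(h1, h2, weights=None):
    """Sum of (optionally weighted) bin-wise minima of two histograms.

    With unit weights and normalized inputs the result lies in [0, 1]
    and equals 1.0 when the two histograms are identical.
    """
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    if weights is None:
        weights = [1.0] * len(h1)  # assumption: unweighted by default
    return sum(w * min(a, b) for w, a, b in zip(weights, h1, h2))


def is_type(h_unknown, h_type, h_non_type, t_sim=0.75, t_margin=0.1):
    """Hypothetical decision rule, assumed for illustration only.

    Declares a detection when the unknown histogram is similar enough to
    the type histogram and clearly more similar to it than to the
    non-type histogram.
    """
    s_type = histogram_intersection(h_unknown, h_type)
    s_non = histogram_intersection(h_unknown, h_non_type)
    return s_type > t_sim and (s_type - s_non) > t_margin
```

In this sketch, larger per-bin weights make a pattern count more toward the similarity; how such weights would be learned is left open here.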

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to type identification of one or more digital objects by means of texture-based features using an Adaptive Dynamic Range Coding (ADRC) process. A method for identifying a type of one or more digital objects in a digital data representation, such as in image data, an object detector, a system for object detection, as well as applications of the method are disclosed. A type may be identified by transforming at least a part of the digital data representation from a first form, such as an image form, to a second form by means of an Adaptive Dynamic Range Coding (ADRC) process. The second form is a histogram representation of a probability density function computed from texture feature statistics. The type identification is obtained from a probabilistic-based comparison between the digital data in the second form and type data in a type repository.

Description

Type identification of digital object(s)
The invention relates to identification of a type of one or more digital objects. In particular the invention relates to type identification by means of texture-based features.
Object detection in digital image data is interesting for a number of applications, such as a surveillance system, a face recognition system, a video-based computer/machine interface, etc. It has proven a difficult task to provide a machine, e.g. a computer-based machine, with the ability of recognizing a type of an object in an image with a high certainty, i.e. assigning an abstract type to a concrete ensemble of image elements. Methods of identifying a type of a digital object include differentiating one object from the other by means of texture features. In texture-based object detection, however, extraction of texture features, such as Gabor filter coefficients, is usually computationally heavy. Because of that, instead of texture features, easily computable color-based features may be used to detect objects. The problem with using color-based features is the variability of object color features, such as due to changes in illumination, image capturing parameters, and the employed color spaces.
The publication "Face Detection in Still Gray Images", B. Heisele, T. Poggio and M. Pontil, A.I. Memo 1687, Massachusetts Institute of Technology, 2000, describes a trainable system for detecting frontal and near-frontal views of faces in still grey images by use of Support Vector Machines (SVMs). Object analysis based on SVM is, however, computationally heavy.
The inventor of the present invention has appreciated that an improved method for type identification of a digital object is of benefit, and has in consequence devised the present invention.
The present invention seeks to provide improved means for identifying one or more objects in a digital data representation. Preferably, the invention alleviates or mitigates one or more of the above or other disadvantages singly or in any combination. Accordingly there is provided, in a first aspect, a method for identifying a type of one or more digital objects in a digital data representation, the method comprising the steps of: transforming at least a part of the digital data representation from a first form to a second form, and identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in a type repository, wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
A digital data representation may be an image, such as a 2D image made up of pixels, it may be a 3D image made up of voxels, it may be a stream of images, such as a video stream, etc. The format of the image may be any type of format, such as standard image and video stream formats. An object in the image may be any kind of image object, such as a graphical object defined by a selection of image elements.
The digital data representation may initially be in a first form, and at least a part of the digital data representation may be transformed into a second form. The second form may be obtained by means of running one or more algorithms, one or more mathematical transformations, etc. on the data to obtain the digital data in the second form. The data may in the transformation process be present in one or more intermediate forms, e.g. in connection with running a number of algorithms. The dimensionality of the data may be altered in the transformation, e.g. 2D and/or 3D image data may be transformed into 1D data. The transformation to the second form may include statistical data analysis.
A repository of type data may be consulted, and based on a probabilistic comparison between the data in the second form and the type data in the repository, a type of the object may be identified. The probabilistic comparison may include a likelihood analysis assessing a statistical likelihood between the data in the second form and the type data to determine whether or not an object of a specific type is represented in the type repository. The type repository may comprise one or more data sets, each data set corresponding to data of a specific type or data specifically not corresponding to a given type.
The transformation of the data from the first form to the second form may be done by means of an Adaptive Dynamic Range Coding (ADRC) process. An ADRC process is a method to efficiently extract texture characteristics of image data. See e.g. US 5,241,381 and US 5,825,313. A range of advantages may be attributed to identifying a type by means of an ADRC process. ADRC processing offers a means for fast extraction of texture features in a digital image. Texture-based features bring about large descriptive power for object detection; extracting them, however, is typically costly. The present invention circumvents this problem by describing objects by easily computable ADRC features. Furthermore, ADRC features are in widespread use for spatial image up-scaling and temporal video up-conversion. ADRC features may therefore, in a number of applications, such as TV-sets, DVD-players etc., already be computed for the spatial and/or temporal up-sampling applications, and the current invention may immediately benefit from this, since after only a minor extension, available temporal and spatial up-sampling architectures can be provided with detection capability. Even if such ADRC features are not already provided for other purposes, the features may be extracted at low cost.
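
To illustrate why ADRC features are cheap to compute, the two-level variant, in which each pixel of a local window is compared with the window average, may be sketched as below. This is a minimal sketch and not the implementation of the cited patents: the function names, the choice that a pixel equal to the average maps to 1, and the raster-order bit packing (top-left pixel first) are assumptions made for illustration.

```python
def adrc_code(window):
    """Two-level ADRC code of one square window given as a list of rows.

    Assumptions (not fixed by the text): a pixel equal to the window
    average maps to 1, and bits are packed in raster order with the
    top-left pixel as the most significant bit.
    """
    pixels = [p for row in window for p in row]
    mean = sum(pixels) / len(pixels)
    code = 0
    for p in pixels:
        # Shift in one bit per pixel: 1 if at or above the average, else 0.
        code = (code << 1) | (1 if p >= mean else 0)
    return code


def adrc_codes(image, size=3):
    """Slide a size x size window over the image, one code per position.

    Border pixels without a full window are skipped in this sketch.
    """
    rows, cols = len(image), len(image[0])
    return [
        [adrc_code([r[x:x + size] for r in image[y:y + size]])
         for x in range(cols - size + 1)]
        for y in range(rows - size + 1)
    ]
```

Each code needs only one average and one comparison per window pixel, which is considerably less work than evaluating a bank of filters at every position.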
The features as defined in claim 2 have the advantage that a histogram representation facilitates fast and easy extraction of texture feature statistics, fast and easy as compared to standard texture feature analytical tools, such as texture features based on Gabor filter responses.
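
A histogram representation of this kind may be sketched as follows, assuming the texture features are already available as integer pattern codes. The function name, the default of 512 bins (corresponding to two levels and a 3x3 window) and the normalization to unit sum are assumptions for illustration.

```python
from collections import Counter


def adrc_histogram(codes, num_bins=512, normalize=True):
    """Bin integer pattern codes in [0, num_bins - 1] into a histogram.

    Codes outside the range are ignored. With normalize=True the bins
    sum to 1, so the histogram can be read as a non-parametric pdf.
    """
    counts = Counter(codes)
    hist = [counts.get(b, 0) for b in range(num_bins)]
    if normalize:
        total = sum(hist)
        if total:  # leave an all-zero histogram untouched
            hist = [h / total for h in hist]
    return hist
```

Building the histogram is a single counting pass over the codes, with no parameter fitting involved.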
The features as defined in claim 3 have the advantage that a pattern of characteristics, such as texture features of the image elements located near the image element of interest, may be chosen based on a type of the object. This is advantageous since texture features may differ between object types, and a pattern may be chosen which is known to provide a high certainty in type identification. Alternatively, in a situation where a type may not be identified with a sufficiently high certainty, different patterns may be tried to improve the certainty. Another advantage may be that the complexity of the pattern, e.g. the size of the pattern and/or the inclusion of different weights for different pixel elements in the pattern, may be altered in accordance with the type of digital data representation, resolution of the digital data representation, computational power, accepted calculation time, etc.
The feature as defined in claim 4 has the advantage that a mathematical formalism describing the probability density function (pdf) is available. The probability density function formalism is a generic object detection framework applicable to any type of object, ensuring robust and trustworthy analysis. Furthermore, the construction of a non-parametric histogram pdf is faster than the construction of parametric pdfs, such as the ones based on Gaussian mixture models (GMMs).
The feature as defined in claim 5 has the advantage that by comparison to both type data and type data different from the type, e.g. non-type data, a more efficient type identification may be provided.
The feature as defined in claim 6 has the advantage that type data may be generated at a specified identification resolution, or at a number of predefined identification resolutions. This is advantageous since the type data may be present in a ready-to-use form which may be easily accessed, e.g. in look-up tables.
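
Bringing a candidate region to a predefined identification resolution (for example a 24x24 block) may be sketched with simple block averaging. This is only one possible resampling scheme, chosen here for brevity; the function name and the restriction to integer down-sampling factors are assumptions.

```python
def downsample(image, factor):
    """Average non-overlapping factor x factor blocks of a 2D image.

    The image is a list of rows; rows and columns that do not fill a
    complete block are dropped in this sketch.
    """
    rows = len(image) // factor
    cols = len(image[0]) // factor
    return [
        [sum(image[y * factor + dy][x * factor + dx]
             for dy in range(factor) for dx in range(factor)) / factor ** 2
         for x in range(cols)]
        for y in range(rows)
    ]
```

A 48x48 region, for instance, would be reduced to 24x24 with `downsample(region, 2)` before its statistics are compared with type data stored for that resolution.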
According to a second aspect of the invention is provided an object detector for identifying a type of one or more digital objects in a digital data representation, the object detector comprising:
- a transformer for transforming at least a part of the digital data representation from a first form to a second form, and
- an analyzer for identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in a type repository,
wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
The method according to the first aspect of the invention may be implemented in a device, such as a stand-alone device or a module suitable for implementation in a device for providing object identification capability to the device. The implementation may be provided by means of software implementation or hardware implementation, e.g. in an implementation comprising one or more ICs, or any other suitable way of implementation.
It is an advantage to provide an object detector since such a device may be part of, or may easily be made part of, a device where object detection is desirable.
According to a third aspect of the invention is provided an integrated circuit (IC) for identification of a type of one or more digital objects in a digital data representation, the IC being adapted to identify a digital object according to the first aspect of the invention.
The IC may be a single chip, a group of chips or an electronic circuit comprising a variety of electronic components. In particular, the IC may be incorporated as a part of the object detector according to the second aspect of the invention.
According to a fourth aspect of the invention is provided a computer readable code for identification of a type of one or more digital objects in a digital data representation, the code being adapted to implement the method according to the first aspect of the invention. The computer readable code may be implemented in an object detector according to the second aspect of the invention and/or in an IC according to the third aspect of the invention.
According to a fifth aspect of the invention is provided a system for identification of a type of one or more digital objects in a digital data representation, the system comprising:
- an input module for inputting at least a part of the digital data representation,
- a transforming module for transforming the digital data representation accessed from the input module from a first form to a second form,
- a repository for storing type data,
- an identification module for identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in the type repository, and
- an output module for outputting a type of the identified one or more digital objects,
wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
The input module may be a software application or a hardware section, e.g. an interface means for interfacing one or more signals, such as data streams, to the transforming module comprising a processing means. In general, however, the input module may be any type of means provided for feeding or providing one or more signals or data to the transforming module. The input signal may be an output signal from a given unit, e.g. a signal provided by a processing means storing digital data for visual or other purposes. The transforming module and the identification module may comprise separate or shared processing means. The processing means may be any type of processing means, e.g. dedicated processing means, or the processing means may be part of a general purpose computer, such as a computer program. The output module may be a storage means enabling access to the result, e.g. as an intermediate step in connection with showing the result graphically.
The system may be a system for implementing the method according to the first aspect of the invention and/or a system including the object detector according to the second aspect of the invention. Furthermore, the system may implement IC means according to the third aspect of the invention, as well as computer readable code according to the fourth aspect of the invention. In general, the various aspects of the invention may be combined and coupled in any way possible within the scope of the invention.
These and other aspects, features and/or advantages of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.
Embodiments of the invention will be described, by way of example only, with reference to the drawings, in which:
Fig. 1 shows a general scheme illustrating an embodiment of the present invention,
Fig. 2 shows an example of a digital data representation in the first form,
Fig. 3 shows a schematic illustration of a selection of image elements located near an image element of interest, and
Fig. 4 shows plots of histograms computed from a collection of face and non- face images.
A general scheme illustrating an embodiment of the present invention for identifying a type of one or more digital objects in a digital data representation is presented in Fig. 1. A specific type of an object refers to, for example, whether an object is a car, a face, a text region, etc. A digital data representation is in connection with the description of Figs. 1 to 4 exemplified by a 2D image. It is however to be understood that a digital data representation is not limited to a 2D image.
In Fig. 1, a digital object is provided in a first step 10 as a digital data representation in a first form. In another step 11, the digital data representation of the object is transformed into a second form. In a further step 12, the digital data representation in the second form is compared to type data in a type repository 13, resulting in a type identification 14 of the object present in the digital data representation in the first form.
The digital data representation in the first form may be an image represented in a bitmap type representation, such as a standard bitmap format, e.g. a jpeg, gif, bmp, etc. format. The image in the first form may be transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process, where ADRC features are extracted from the image in the first form and used to generate the second form. The description of an object by ADRC features according to an embodiment of the present invention is presented first.
An example of an image 20 in the first form is provided in Fig. 2. A part 21 of the image, i.e. a selection of image elements (pixels 22), may be analyzed in a pixel-by-pixel manner, where a value (ADRC value) may be assigned to each pixel in the selection of image elements. The image shown in Fig. 2 shows only one object 23, being a face; however, it is to be understood that more than one object may be present in an image. The present invention may be applied to more objects, e.g. by selecting one object to be identified or by sequential (or parallel) identification of more than one object in an image. For each pixel in the selection of image elements, ADRC features are computed from a plurality of image elements, typically a local window with a given aperture size (aperture sizes of 3x3, 5x5 and 7x7 are common), which is shifted pixel by pixel to span the whole image.
A schematic illustration of an area 30 containing 5x5 pixels is shown in Fig. 3. The local area is a part of the selection of image elements. The ADRC value of a pixel (here illustrated for the pixel designated 0) is computed for the given window (aperture) size 31. The ADRC value is based on the computation of the average (Iavg) of the pixel intensity values in the window and the assignment of the pixels in the same window to one of L levels based on their differences from that average value. In the embodiment illustrated here, an aperture size 31 of 3x3 is used, and the case where L=2 is explained; it is however to be understood that L may be larger, resulting in a larger number of possible patterns. In the L=2 case, if a pixel intensity value in the window is greater than Iavg, the pixel is assigned one, and otherwise zero. These settings result in 511 (2^(3x3) - 1) possible patterns for a window (the all-zero or the all-one case, depending on where the equality is assigned, is impossible to realize due to the definition of ADRC). The resulting patterns of the local windows are classified into a set of classes. The patterns are thus patterns of texture characteristics (here whether the individual pixels are darker or lighter than the average) of the image elements located near the image element of interest, i.e. the center pixel 0. The local window is in the present embodiment a square including nearest and next nearest neighbors; however, other shapes may be employed, such as shapes with an envelope in the form of a rectangle, a triangle, a polygon, a circle, an ellipse, etc.
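The L=2 assignment and the pattern count can be checked with a minimal sketch. It assumes the "greater than the average gives one, otherwise zero" rule stated above; the function name `adrc_bits`, the flattened window layout and the sample intensities are illustrative assumptions, not taken from the application.

```python
from itertools import product

def adrc_bits(window):
    """Return one bit per pixel: 1 if its intensity exceeds the window average, else 0."""
    avg = sum(window) / len(window)
    return [1 if value > avg else 0 for value in window]

# A made-up 3x3 window, flattened into a list of nine intensities.
window = [90, 40, 95, 120, 110, 100, 105, 30, 20]
bits = adrc_bits(window)

# Enumerate every two-valued 3x3 window and collect the patterns that occur.
# The all-one pattern never appears, because not every pixel can exceed
# the average of the window it belongs to.
patterns = {tuple(adrc_bits(w)) for w in product((0, 255), repeat=9)}
print(bits, len(patterns))
```

With strict "greater than" the all-one pattern is the unreachable one; assigning the equality case to one instead would make the all-zero pattern unreachable.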
Thus, for the window designated 31 in Fig. 3A, the average intensity is calculated and the intensities of the pixels 0-8 are evaluated according to whether they possess a greater or a smaller intensity than the average value. This is exemplified in Fig. 3B, where pixels 1, 7 and 8 are found (by way of example) to be darker whereas the pixels 0, 2-6 are found to be brighter. Based on all the pixel assignments in the window, a class of the pattern is computed and assigned to the center pixel of the window, i.e. to pixel 0. The class may e.g. be a corresponding binary number; for example, the class of the window illustrated in Fig. 3B may be the binary number 011111100, which is assigned to pixel 0. In this way, each pixel can be assigned a value in the range [0, L^(MxM) - 1] for an MxM window. These values are binned in a histogram, where each bin counts the number of times a given pattern appears among the pixels of the image or of a subsection of the image. The ADRC values are used to obtain the digital data representation in the second form. In a given embodiment, the histogram of the ADRC values, provided as explained above, may be taken as the data representation in the second form.
The ADRC values can be described in terms of object probability density functions (pdfs). A pdf can be described either in parametric form or in non-parametric form. Parametric pdfs are powerful if the assumed distribution is true. However, if it is not, the estimated pdf will be very different from the true one. In contrast to this drawback of parametric density estimation methods, non-parametric descriptors do not depend on such assumptions. As a result, the use of a non-parametric descriptor (a histogram) may be preferable to represent the pdf of an object.
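The per-pixel class assignment and the histogram binning described above can be sketched as follows. This is an illustration under stated assumptions: bits are packed in row-major window order (the application does not fix a bit order), border pixels without a full window are skipped, and the names `adrc_code` and `adrc_histogram` are hypothetical.

```python
def adrc_code(window):
    """Pack the L=2 ADRC bits of a flattened window into one integer class."""
    avg = sum(window) / len(window)
    code = 0
    for value in window:
        # Assumed bit order: first window element becomes the most significant bit.
        code = (code << 1) | (1 if value > avg else 0)
    return code

def adrc_histogram(image, m=3):
    """Normalized histogram of ADRC classes over all pixels that have a full m x m window."""
    r = m // 2
    counts = [0] * (2 ** (m * m))
    for y in range(r, len(image) - r):
        for x in range(r, len(image[0]) - r):
            window = [image[y + dy][x + dx]
                      for dy in range(-r, r + 1)
                      for dx in range(-r, r + 1)]
            counts[adrc_code(window)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts

# Made-up 4x4 image with a vertical dark-to-bright edge.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
hist = adrc_histogram(image)
```

For a 3x3 window the histogram has 2^9 = 512 bins, matching the range [0, L^(MxM) - 1] with L=2 and M=3.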
In the following, identification of a type of an object is exemplified in terms of face recognition, i.e. the type of the object is taken to be a face. In order to obtain the pdf of a face class, i.e. in order to obtain a type repository (here a face repository), ADRC statistics are collected for a number of face images to compute a face histogram (Hface). Note that all the histograms are normalized, so that the sum of their bins is equal to one. A face histogram 40 is illustrated in Fig. 4A. It may be more efficient to also model non-faces, by collecting images that do not contain faces and building a non-face histogram (Hnon-face) from those images. A non-face histogram 41 is illustrated in Fig. 4B. By using both face and non-face data, or generally first and second data referring to object and non-object data, the type identification may be based on comparison to both the first and the second data.
When a new image is presented for detection of a particular type of object, ADRC statistics are collected to compute the histogram of the unknown object, Hunknown. In an embodiment of the present invention, the statistics may be computed from a block of pixels with a given identification resolution (e.g. a fixed block size of 24x24). Larger objects can be detected by down-sampling the image so that the object size gets close to the predefined block size; conversely, smaller objects can be up-sampled. The size-independence of the object detector can be guaranteed by running the detector at multiple scales. In order to verify that the histogram computed from the image is that of a face (or a known object), a probabilistic-based comparison may be made, in this embodiment by computing the similarity of Hunknown to Hface and Hnon-face. The similarities to the face histogram (Sface) and the non-face histogram (Snon-face) are found by employing a popular histogram similarity metric. Without limiting the scope of the invention, an embodiment utilizing histogram intersection, as defined below for histograms H1 and H2, is explained.
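Building the type repository from training images can be sketched as below. The sketch assumes the ADRC class values of each training image have already been computed, and that counts are pooled over all images before normalizing; the application does not say whether pooling or per-image averaging is used, so that choice, the function name `class_histogram` and the sample values are assumptions.

```python
def class_histogram(code_lists, n_bins=512):
    """Pool ADRC class values from many images into one normalized histogram."""
    counts = [0] * n_bins
    for codes in code_lists:
        for code in codes:
            counts[code] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no ADRC values were supplied")
    # Normalize so that the bins sum to one, as required for all histograms.
    return [c / total for c in counts]

# Hypothetical ADRC class values from three face and two non-face images.
face_codes = [[73, 73, 219], [73, 219, 219], [73, 5]]
non_face_codes = [[0, 0, 5], [0, 510, 5]]

h_face = class_histogram(face_codes)
h_non_face = class_histogram(non_face_codes)
```

Both histograms can then be stored, e.g. in a look-up table, and reused for every image that is later presented to the detector.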
$$S(H_1, H_2) = \frac{\sum_{i} \min\big(H_1(i), H_2(i)\big)}{\sum_{i} H_2(i)} \qquad (1)$$
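Histogram intersection can be sketched in a few lines. The sketch assumes both histograms are normalized to sum to one, as stated earlier for all histograms, so any normalizing denominator equals one and is omitted; the function name `histogram_intersection` and the four-bin sample histograms are illustrative.

```python
def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; lies in [0, 1] when both histograms sum to one."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))

# Made-up normalized histograms with four bins each.
h_a = [0.5, 0.25, 0.25, 0.0]
h_b = [0.25, 0.25, 0.0, 0.5]

s_ab = histogram_intersection(h_a, h_b)  # partial overlap
s_aa = histogram_intersection(h_a, h_a)  # identical histograms
```

Identical histograms give a similarity of one and histograms with no common non-empty bins give zero, which is what makes fixed thresholds on the similarity meaningful.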
In a given embodiment, the detection decision may be based on the similarity values and several thresholds. A type, i.e. a face, may be declared detected if the following conditions are satisfied:
$$S_{\text{face}} > Thr_1 \quad \text{and} \quad S_{\text{face}} - S_{\text{non-face}} > Thr_2 \qquad (2)$$
where we set Thr1 = 0.75 and Thr2 = 0.1.
Other conditions may be employed, and other thresholds may be used. The values of 0.75 and 0.1 are mentioned only for illustrative purposes, and should not be taken as a limitation. Any suitable threshold value may be assumed, and the threshold values may depend upon the type of object to be identified, the nature of the digital data representation, etc. A routine may be present for adjusting the detection condition or the threshold values in a situation of use.
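The two-threshold decision can be sketched as follows, using the illustrative values 0.75 and 0.1 as defaults. The function name `is_type_detected` and the sample similarity values are assumptions made for the example.

```python
def is_type_detected(s_type, s_non_type, thr1=0.75, thr2=0.1):
    """Apply both conditions: high similarity to the type, and a clear margin over non-type."""
    return s_type > thr1 and (s_type - s_non_type) > thr2

# Made-up similarity values for three candidate blocks.
clear_face = is_type_detected(0.82, 0.60)   # both conditions hold
ambiguous = is_type_detected(0.82, 0.78)    # margin over non-face too small
weak_match = is_type_detected(0.70, 0.30)   # similarity to face too low
```

Keeping the thresholds as parameters reflects the remark that they may be adjusted for the object type and the nature of the data.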
Experiments with face detection have been conducted. Figs. 4A and 4B show plots of the histograms Hface and Hnon-face computed from a collection of face and non-face images, respectively. Using 100 face/non-face images, including difficult sets of face images with different orientations, eye-glasses, and faces of people across different age groups, a correct type identification rate of 85% was achieved, with 5% false identifications; in 10% of the cases, a face/non-face type identification could not be made.
These results are mentioned only for illustrative purposes and should not be taken as a limitation of the present invention. The attainable correct type identification rate may in a situation of use depend upon a number of factors, such as the type of object to be identified, the nature of the digital data representation, the nature and quality of the type data, etc.
In another embodiment, certain bins, i.e. certain patterns, may be weighted. For example, a certain type of object may be very sensitive to the occurrence of certain patterns, and large weights may be given to these bins:
[Equation image: imgf000011_0001]
Object data and non-object data may be provided in a variety of ways. An application may be supplied with given data sets, i.e. given object histograms and non-object histograms. Alternatively, histograms may be provided in a training process. The weights assigned to bins as explained above may also be learned in a training process.
Although the present invention has been described in connection with preferred embodiments, it is not intended to be limited to the specific form set forth herein. Rather, the scope of the present invention is limited only by the accompanying claims.
In this section, certain specific details of the disclosed embodiment such as number of process steps, algorithm details, object type, etc., are set forth for purposes of explanation rather than limitation, so as to provide a clear and thorough understanding of the present invention. However, it should be understood readily by those skilled in this art, that the present invention may be practiced in other embodiments which do not conform exactly to the details set forth herein, without departing significantly from the spirit and scope of this disclosure. Further, in this context, and for the purposes of brevity and clarity, detailed descriptions of well-known apparatus, circuits and methodology have been omitted so as to avoid unnecessary detail and possible confusion. Reference signs are included in the claims, however the inclusion of the reference signs is only for clarity reasons and should not be construed as limiting the scope of the claims.

Claims

CLAIMS:
1. Method for identifying a type (14) of one or more digital objects (10,23) in a digital data representation (20), the method comprising the steps of:
- transforming at least a part (21) of the digital data representation (20) from a first form to a second form (40,41), and
- identifying the type (14) of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form (11) and type data in a type repository (13),
wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
2. Method according to claim 1, wherein the first form is an image form (20) and wherein the transformation to the second form, for at least a selection of image elements in the image form, includes the step of assigning a value to an image element of interest (0) in at least the selection of image elements (31), the value being assigned based on a plurality of image elements located near the image element of interest, and wherein the second form is a histogram representation (40,41) of the assigned image element values.
3. Method according to claim 2, wherein the assigned value of the image element of interest is assigned in accordance with a pattern of the plurality of image elements located near the image element of interest, the pattern being a pattern of characteristics of the image elements located near the image element of interest.
4. Method according to claim 1, wherein the second form is a representation of texture features of the one or more digital objects, and wherein the representation is a probability density function (pdf) computed from texture feature statistics.
5. Method according to claim 1, wherein the type repository (13) comprises first data (40) describing the type of the selected digital object and second data (41) describing a type different from the type of the selected digital object, and wherein the probabilistic-based comparison includes comparison to both the first and the second data.
6. Method according to claim 1, wherein the identification of the type of one or more objects in a digital data representation is performed using a given identification resolution of the digital data representation, so that if a resolution of the digital data representation is smaller than the identification resolution, the digital data representation is up-sampled to fit the identification resolution, and if the resolution of the digital data representation is larger than the identification resolution, the digital data representation is down-sampled to fit the identification resolution.
7. Object detector for identifying a type (14) of one or more digital objects (10,23) in a digital data representation (20), the object detector comprising:
- a transformer for transforming at least a part (21) of the digital data representation (20) from a first form to a second form (40,41), and
- an analyzer (12) for identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form (11) and type data in a type repository (13),
wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
8. Integrated circuit (IC) for identification of a type of one or more digital objects in a digital data representation, the IC being adapted to identify a digital object according to the method of claim 1.
9. Computer readable code for identification of a type of one or more digital objects in a digital data representation, the code being adapted to conduct the steps:
- transforming at least a part of the digital data representation from a first form to a second form, and
- identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in a type repository,
wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
10. System for identification of a type of one or more digital objects in a digital data representation, the system comprising:
- an input module for inputting at least a part of the digital data representation,
- a transforming module for transforming the digital data representation accessed from the input module from a first form to a second form,
- a repository for storing type data,
- an identification module for identifying the type of the one or more digital objects from a probabilistic-based comparison between the digital data in the second form and type data in the type repository, and
- an output module for outputting a type of the identified one or more digital objects,
wherein the digital data is transformed into the second form by means of an Adaptive Dynamic Range Coding (ADRC) process.
11. Use of an Adaptive Dynamic Range Coding (ADRC) process for identification of a type of one or more digital objects in a digital data representation.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP05101403.3 2005-02-24
EP05101403 2005-02-24

Publications (2)

Publication Number Publication Date
WO2006090313A2 true WO2006090313A2 (en) 2006-08-31
WO2006090313A3 WO2006090313A3 (en) 2006-12-07

Family

ID=36910840

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2006/050518 WO2006090313A2 (en) 2005-02-24 2006-02-17 Object recognition using adrc (adaptive dynamic range coding)

Country Status (1)

Country Link
WO (1) WO2006090313A2 (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0895191A2 (en) * 1997-07-31 1999-02-03 Sony Corporation Hierarchical image processing apparatus and method
US20040066966A1 (en) * 2002-10-07 2004-04-08 Henry Schneiderman Object finder for two-dimensional images, and system for determining a set of sub-classifiers composing an object finder

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
RIE HONDA; SHUAI WAN; TOKIO KIKUCHI; OSAMU KONISHI: "Mining of Moving Objects from Time-Series Images and its Application to Satellite Weather Imagery" JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 11 February 2002 (2002-02-11), XP002398131 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013030104A (en) * 2011-07-29 2013-02-07 Panasonic Corp Feature amount extraction apparatus and feature amount extraction method
WO2013018816A3 (en) * 2011-07-29 2013-03-28 Panasonic Corporation Feature value extraction apparatus and feature value extraction method
US9367567B2 (en) 2011-07-29 2016-06-14 Panasonic Intellectual Property Management Co., Ltd. Feature value extraction apparatus and feature value extraction method

Also Published As

Publication number Publication date
WO2006090313A3 (en) 2006-12-07

Similar Documents

Publication Publication Date Title
JP4429370B2 (en) Human detection by pause
US9547800B2 (en) System and a method for the detection of multiple number-plates of moving cars in a series of 2-D images
US7840037B2 (en) Adaptive scanning for performance enhancement in image detection systems
CN109241985B (en) Image identification method and device
US9008365B2 (en) Systems and methods for pedestrian detection in images
US8103058B2 (en) Detecting and tracking objects in digital images
JP4479478B2 (en) Pattern recognition method and apparatus
EP2434431A1 (en) Method and device for classifying image
US20120213422A1 (en) Face recognition in digital images
US7983480B2 (en) Two-level scanning for memory saving in image detection systems
Han et al. Real‐time license plate detection in high‐resolution videos using fastest available cascade classifier and core patterns
JP2013161126A (en) Image recognition device, image recognition method, and image recognition program
JP2013533998A (en) Object detection in images using self-similarity
JP2008117391A (en) Method and apparatus for detecting faces in digital images
US20210124977A1 (en) System and Method for Multimedia Analytic Processing and Display
US20080175447A1 (en) Face view determining apparatus and method, and face detection apparatus and method employing the same
Nercessian et al. Automatic detection of potential threat objects in X-ray luggage scan images
US8913782B2 (en) Object detection apparatus and method therefor
CN112364873A (en) Character recognition method and device for curved text image and computer equipment
KR101268520B1 (en) The apparatus and method for recognizing image
CN109726621B (en) Pedestrian detection method, device and equipment
Fang et al. 1-D barcode localization in complex background
KR101681233B1 (en) Method and apparatus for detecting face with low energy or low resolution
JP6377214B2 (en) Text detection method and apparatus
Hassan et al. Facial image detection based on the Viola-Jones algorithm for gender recognition

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 06727626

Country of ref document: EP

Kind code of ref document: A2

122 Ep: pct application non-entry in european phase

Ref document number: 06727626

Country of ref document: EP

Kind code of ref document: A2