EP2817762A1 - Feature detection filter using orientation fields - Google Patents
Feature detection filter using orientation fields
- Publication number
- EP2817762A1 (application number EP13751817.1A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- orientation
- orientation field
- orientations
- image
- field
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
Definitions
- SIFT scale invariant feature transform
- Objects can be detected in images using feature description algorithms such as the scale invariant feature transform, or SIFT.
- SIFT is described for example in US patent number 6,711,293.
- SIFT finds interesting parts in a digital image and defines them according to SIFT feature descriptors, also called key points.
- The descriptors are stored in a database. Objects can then be recognized in new images by comparing the features from the new image to the database.
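- For illustration only (this sketch is not part of the patent), a key-point matching pipeline of the kind summarized above might look as follows in Python with OpenCV; cv2.SIFT_create is assumed to be available (OpenCV 4.4 or later), and the ratio-test value of 0.75 is an arbitrary illustrative choice.

```python
# Hedged sketch of generic key-point matching (SIFT descriptors compared
# against a stored model); not taken from the patent.  Requires OpenCV >= 4.4.
import cv2

def match_with_sift(model_path, target_path, ratio=0.75):
    model = cv2.imread(model_path, cv2.IMREAD_GRAYSCALE)
    target = cv2.imread(target_path, cv2.IMREAD_GRAYSCALE)

    sift = cv2.SIFT_create()
    _, model_desc = sift.detectAndCompute(model, None)    # key points -> descriptors
    _, target_desc = sift.detectAndCompute(target, None)

    # Compare features from the new (target) image against the stored model
    # descriptors, keeping only unambiguous matches (Lowe's ratio test).
    matcher = cv2.BFMatcher()
    pairs = matcher.knnMatch(model_desc, target_desc, k=2)
    return [m for m, n in pairs if m.distance < ratio * n.distance]
```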
- The SIFT algorithm teaches different ways of matching that are invariant to scale, orientation, distortion, and illumination changes.
Summary
- the present application describes a technique for finding image parts, e.g. logos, in images.
- An embodiment describes using a Matched Orientation Field Filter (MOFF) algorithm for object detection.
- MOFF Matched Orientation Field Filter
- Figures 1A, 1B and 1C show logos in images;
- Figure 2 shows an exemplary logo on the top, and the orientation field of that logo on the bottom;
- Figures 3A, 3B and 3C show logos, and orientation fields of those logos;
- Figure 4 shows a logo in the top portion, the orientation field of the logo in the middle portion, and the thresholded orientation field of the logo in the bottom portion;
- Figure 5 shows orientation fields for a logo found in the actual image;
- Figure 6 shows a flowchart of operation of the matching technique according to the present application.
- Image processing techniques that rely on key-points, such as SIFT, have been very successful for certain types of object detection tasks. Such tasks include matching objects in several overlapping images for the purpose of "stitching" images together. It is also quite successful for detecting certain types of logos.
- The SIFT algorithm works quite well for detecting objects with a large amount of detail and texture, which suffices for many object recognition tasks, such as image stitching and the like.
- The current inventors have found, however, that the SIFT algorithm is not well-suited for detecting simple objects that have little detail or texture.
- SIFT was found to recognize, for example, large logos that were a major part of the picture, but not to find smaller, less detailed logos.
- Figures 1A and 1B show different kinds of images. While Figure 1A has a lot of detail and could be usable with SIFT, Figure 1B is a low-detail, textureless logo, and the inventors found that SIFT does not work well on this object.
- Another example is shown in Figure 1C, where the logo is a very small part of the overall image.
- The inventors believe that SIFT does not work well for textureless objects because it relies on detecting local key points in images. The key points are selected such that they contain as much unique information as possible. However, for an object such as the Nike logo in Figure 1B, there are few unique details in the image.
- The key points that the SIFT algorithm may find may also be present in a wide variety of other objects in the target image. That is, there is not enough uniqueness in the features of simple logos like the logo of Figure 1B.
- The inventors have therefore developed a technique that differs from SIFT and that is particularly useful for detecting textureless objects, such as the Nike logo in Figure 1B.
- The technique described herein attempts to capture most of the shape attributes of the object. It uses global information in the model, as opposed to the highly localized key points used by SIFT.
- the global attributes of the object are described using descriptors that are insensitive to color and light variations, shading, compression artifacts, and other image distortions.
- SIFT-like methods perform well for objects of good quality containing a large amount of details.
- MOFF will work on all these types of objects, but is particularly suitable for "difficult" objects, such as objects with little detail, and objects perturbed by artifacts caused by e.g., low resolution or compression.
- An embodiment describes feature detection in an image using orientation fields.
- A special application is described for detecting "textureless" objects, referring to objects with little or no texture, such as certain corporate logos.
- An embodiment describes finding an item in one or more images, where the item is described herein as being a "model image".
- A so-called target image is then analyzed to detect occurrences of the model image in the target image.
- the embodiment describes assigning a "reliability weight" to each detected object.
- the reliability weight indicates a calculated probability that the match is valid.
- the model image in this embodiment can be an image of a logo, and the target image is a (large) image or a movie frame.
- Key-point based algorithms typically include the following main steps: detecting key points in the image, computing a descriptor for each key point, and matching the descriptors against those stored for the model images.
- The MOFF algorithm generates an orientation field at all pixels in the image. The technique then measures the alignment and auto-correlation of the orientation field with respect to a database of models.
- A logo often comes in many varieties, including different colors and shadings, so it is important to be able to detect all varieties of a logo without having to include every conceivable variation of the logo in the model database.
- a metallic logo may include reflections from the surroundings in the target image, and consequently a logo detection technique should look for and take into account such reflections.
- An orientation field is described herein for carrying out the feature detection.
- the orientation field may describe the orientations that are estimated at discrete positions of the image being analyzed. Each element in the image, for example, can be characterized according to its orientation and location.
- An embodiment uses orientation fields, as described herein, as a way of describing textureless objects.
- A scale-invariant orientation field is generated representing each logo. After this orientation field has been generated, a matching orientation field is searched for in the target image.
- Let I(x) denote the intensity at location x of the image I.
- F(x) = {w(x); θ(x)}, where w is a scalar that denotes a weight for the angle θ.
- The angle θ is an angle between 0 and 180 degrees (the interval [0, π)), and the weight w is a real number such that the size of w measures the importance or reliability of the orientation.
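- As a minimal sketch of one possible in-memory layout for such a field (an assumption, not a structure specified by the patent), the weight w(x) and angle θ(x) can be held as two parallel arrays indexed by pixel position:

```python
import numpy as np

def empty_orientation_field(height, width):
    """Orientation field F(x) = {w(x); theta(x)} stored as two parallel arrays.

    weights[y, x] is the reliability w of the orientation at pixel (x, y);
    angles[y, x] is the orientation angle theta in the interval [0, pi).
    Illustrative layout only.
    """
    weights = np.zeros((height, width))
    angles = np.zeros((height, width))
    return weights, angles
```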
- Orientations typically used by SIFT-like methods are vectors.
- the orientations defined here need not be vectors.
- the gradient vectors typically used by SIFT-like algorithms have a periodicity of 360 degrees when rotated around a point.
- the orientations used by MOFF do not have a notion of "forward and backward", and have a periodicity of 180 degrees when rotated around a point.
- a useful example of a kernel well suited for MOFF is the cosine kernel, defined as
- Orientations have a periodicity of 180 degrees when rotated around a point, as opposed to a periodicity of 360 degrees for traditional gradient-based vectors.
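- The 180-degree periodicity can be built directly into the measure used to compare two orientations. The patent's cosine kernel is not reproduced in this excerpt, so the sketch below uses cos(2·Δθ) purely as an assumed stand-in: its value is unchanged when either orientation is rotated by 180 degrees.

```python
import numpy as np

def orientation_alignment(theta_a, theta_b):
    """Similarity of two orientation angles given in radians.

    cos(2 * delta) is +1 for parallel orientations, -1 for perpendicular ones,
    and is invariant when either angle is shifted by pi (180 degrees), which
    matches the periodicity described for MOFF orientations.  Assumed kernel;
    the patent's exact cosine kernel is not reproduced here.
    """
    return np.cos(2.0 * (theta_a - theta_b))

# 10 degrees and 190 degrees describe the same orientation.
assert np.isclose(orientation_alignment(np.radians(10), np.radians(190)), 1.0)
```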
- The weights for MOFF orientations do not rely on just intensity differences, but should be thought of more as a reliability measure of the orientation.
- gradient based orientations rely on the intensity difference between nearby pixels.
- orientations computed above have the unique property of pointing towards an edge when located outside an object, and along an edge when located inside an object. This is a major difference compared to gradient based orientations, which always point perpendicular with respect to an edge.
- The MOFF orientations are particularly well-suited for describing thin, line-like features.
- Gradient-based orientations are well-suited for defining edges between larger regions.
- An orientation computed using gradients has a notion of "forward and backward". This is often illustrated by adding an arrow tip to the line representing the orientation.
- the MOFF orientation does not have a notion of "forward and backwards", and is therefore more suitably depicted by a line segment without an arrow tip.
- a second important property of the orientations presented in this document is that MOFF orientations stress geometry rather than intensity difference when assessing the weight of an orientation. As a consequence, MOFF orientations will generate similar (strong) weights for two objects of the same orientation, even when they have different intensities.
- MOFF orientations point parallel to an object if located inside the object, and perpendicular to the edge if located outside the object. This happens regardless of the thickness of the object, and this property remains stable even for very thin, curve-like objects that gradient based orientations have problems with.
- FIG. 2 shows the original model at the top, and the orientation field at the bottom. Note that the direction of the orientation depends on whether it is inside or outside the object.
- While some of the problems with gradient-based orientations can be overcome by introducing blurring and averaging techniques, such methods have their own problems.
- One drawback with such techniques is that blurring will decrease the intensity difference of legitimate structures as well, which makes the gradient computation less robust.
- A thresholding step removes orientations which are deemed too unreliable. Typically, orientations with a reliability below 3% are removed this way. All remaining orientations are then given the same weight (which we set to 1). This removes all references to the intensity of the objects we want to detect: the intensity can vary greatly from image to image, whereas the underlying geometry, which is described by the orientation angles, is remarkably stable with respect to artifacts.
- The orientation fields are thresholded such that weights that are almost zero (below about 3% of the max weight in the image) are set to zero. All other orientation weights are set to 1.
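- A minimal sketch of this thresholding step, assuming the weights are held in a NumPy array with one entry per pixel:

```python
import numpy as np

def threshold_orientation_field(weights, rel_threshold=0.03):
    """Set weights below ~3% of the maximum weight in the image to zero and
    all surviving weights to 1, as described above.  Sketch only."""
    max_w = weights.max()
    if max_w == 0:
        return np.zeros_like(weights, dtype=float)
    return np.where(weights < rel_threshold * max_w, 0.0, 1.0)
```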
- The thresholded model orientation field is now matched with the thresholded target orientation field in a convolution-style algorithm.
- modelArea is a normalization constant which is set to the number of non-zero weights in the model orientation field.
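- Equation (2) itself is not reproduced in this excerpt, so the following is only a plausible brute-force sketch of a convolution-style match under stated assumptions: at each offset the orientation agreement (using the assumed cos(2·Δθ) kernel from the earlier sketch) is summed over the pixels where both thresholded fields are non-zero, and the sum is divided by modelArea, the number of non-zero weights in the model orientation field.

```python
import numpy as np

def match_map(model_theta, model_w, target_theta, target_w):
    """Slide the model orientation field over the target and score each offset.

    model_theta / target_theta : orientation angles in [0, pi)
    model_w / target_w         : thresholded weights (0 or 1)
    Returns C, where C[i, j] is the normalized alignment score of the model
    placed with its top-left corner at target pixel (i, j).  Assumed form of
    the match; the patent's Equation (2) is not reproduced here.
    """
    M, N = model_theta.shape
    H, W = target_theta.shape
    model_area = float(max(np.count_nonzero(model_w), 1))   # normalization constant
    C = np.zeros((H - M + 1, W - N + 1))

    for i in range(C.shape[0]):
        for j in range(C.shape[1]):
            t_theta = target_theta[i:i + M, j:j + N]
            t_w = target_w[i:i + M, j:j + N]
            # cos(2 * delta) agreement, counted only where both fields kept
            # a reliable orientation after thresholding.
            agree = np.cos(2.0 * (model_theta - t_theta)) * model_w * t_w
            C[i, j] = agree.sum() / model_area
    return C
```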
- the auto-correlation matrix can then be stored along with the pre-computed orientation field for each model.
- the auto-correlation can now be used as follows.
- C(i,j) is computed from the equation above;
- A is the auto-correlation for the model, computed from the equation above;
- B(i,j) is the sub-matrix extracted around the neighborhood of C(i,j) as described above.
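- The excerpt names C(i,j), the model auto-correlation A and the local sub-matrix B(i,j), but not the exact formula that combines them into a reliability weight. One plausible reading, offered purely as an assumption, is to compare the local shape of the match map around a candidate with the model's auto-correlation, e.g. by normalized correlation; a peak caused by the real object should reproduce the model's auto-correlation pattern around it.

```python
import numpy as np

def reliability(B, A):
    """Assumed reliability measure: normalized correlation between the local
    neighborhood B of the match map and the model auto-correlation A.
    B and A must have the same shape; the result lies in [-1, 1]."""
    b = B - B.mean()
    a = A - A.mean()
    denom = np.linalg.norm(b) * np.linalg.norm(a)
    return float((b * a).sum() / denom) if denom > 0 else 0.0
```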
- At 610, for each model in the database, initialize a final match map as a matrix M with the same size as the original image, with zeros everywhere.
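- In line with this flowchart step and with the thresholding of match values described in the abstract, accumulating detections into the final match map might be sketched as below; the threshold value is a hypothetical parameter, not one taken from the patent.

```python
import numpy as np

def final_match_map(C, image_shape, threshold=0.8):
    """Initialize M (same size as the original image, zeros everywhere) and
    record the offsets whose match value exceeds the threshold.
    Returns M and the number of detections.  Illustrative sketch only."""
    M = np.zeros(image_shape)
    hits = np.argwhere(C > threshold)
    for i, j in hits:
        M[i, j] = C[i, j]
    return M, len(hits)
```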
- Orientation fields computed with respect to larger scales are effectively downsampled to a coarser grid.
- By looking for matches on a coarser grid, one can get a rough idea of where potential matches are located, and whether they are present in the image at all. Also, by looking at the coarser scales first, one can often determine which objects are, with high certainty, not in the image, and remove these from further consideration.
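- A hedged sketch of this coarse-to-fine idea, reusing the match_map sketch above: the match is first run on orientation fields downsampled to a coarser grid, only offsets with a promising coarse score are kept for refinement, and a model with no promising coarse score can be dropped for this image. The downsampling factor and the coarse threshold are illustrative assumptions.

```python
import numpy as np

def coarse_candidates(model_theta, model_w, target_theta, target_w,
                      factor=4, coarse_threshold=0.5):
    """Run the assumed match on a grid downsampled by `factor` and return the
    full-resolution offsets worth refining; an empty list suggests the model
    is, with high certainty, not in the image."""
    mt, mw = model_theta[::factor, ::factor], model_w[::factor, ::factor]
    tt, tw = target_theta[::factor, ::factor], target_w[::factor, ::factor]
    C_coarse = match_map(mt, mw, tt, tw)     # match_map: sketch defined earlier
    return [(int(factor * i), int(factor * j))
            for i, j in np.argwhere(C_coarse > coarse_threshold)]
```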
- Instead of immediately computing the alignment for all pixels (k,l) in the summation in Equation (2) described above, we introduce the notion of nested subsets.
- We start by defining a set consisting of all pixels summed over by Equation (2), which we define as
- M and N denote the vertical and horizontal size of the model.
- We can compute Cm(i,j) from Cm-1(i,j) by computing the sum in Equation (5) only over the indices given by the difference of the two consecutive subsets m and m-1.
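- The exact nested subsets used by the patent (and Equations (2) and (5) themselves) are not reproduced in this excerpt, so the following only sketches the incremental idea under assumptions: the model's non-zero pixels are split into consecutive chunks whose cumulative unions play the role of the nested subsets, the partial score for subset m is obtained from subset m-1 by adding only the new chunk's contribution, and a candidate offset is abandoned early once its partial score can no longer reach the acceptance threshold.

```python
import numpy as np

def incremental_score(model_theta, model_w, target_theta, target_w,
                      offset, n_chunks=4, threshold=0.8):
    """Evaluate one candidate offset by summing over nested subsets of the
    model's non-zero pixels, stopping early when the normalized score can no
    longer reach `threshold`.  Illustrative assumption only; the offset is
    assumed to keep the model fully inside the target."""
    i, j = offset
    pixels = np.argwhere(model_w > 0)              # all pixels summed over
    model_area = float(max(len(pixels), 1))
    chunks = np.array_split(pixels, n_chunks)      # cumulative unions = nested subsets

    partial, seen = 0.0, 0
    for chunk in chunks:
        for r, c in chunk:
            if target_w[i + r, j + c] > 0:
                partial += np.cos(2.0 * (model_theta[r, c] - target_theta[i + r, j + c]))
        seen += len(chunk)
        # Even if every remaining pixel aligned perfectly (+1 each), could the
        # final normalized score still reach the threshold?  If not, give up.
        if (partial + (model_area - seen)) / model_area < threshold:
            return None
    return partial / model_area
```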
- DSP Digital Signal Processor
- ASIC Application Specific Integrated Circuit
- FPGA Field Programmable Gate Array
- GPU Graphical Processing Units
- a general purpose processor may be a microprocessor, but in the alternative, the processor may be any combination thereof designed to perform the functions described herein.
- the processor can be part of a computer system that also has a user interface port that communicates with a user interface, and which receives commands entered by a user, has at least one memory (e.g., hard drive or other comparable storage, and random access memory) that stores electronic information including a program that operates under control of the processor and with communication via the user interface port, and a video output that produces its output via any kind of video output format, e.g., VGA, DVI, HDMI, displayport, or any other form.
- This may include laptop or desktop computers, and may also include portable computers, including cell phones, tablets such as the IPAD™, and all other kinds of computers and computing platforms.
- a processor may also be implemented as a combination of computing devices, e.g., a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration. These devices may also be used to select values for devices as described herein.
- RAM Random Access Memory
- ROM Read Only Memory
- EPROM Electrically Programmable ROM
- EEPROM Electrically erasable ROM
- registers, a hard disk, a removable disk, a CD-ROM, or any other form of tangible storage medium that stores tangible, non-transitory computer-based instructions.
- An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium.
- the storage medium may be integral to the processor.
- the processor and the storage medium may reside in reconfigurable logic of any type.
- the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium.
- Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another.
- a storage media may be any available media that can be accessed by a computer.
- such computer-readable media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
- the memory storage can also be rotating magnetic hard disk drives, optical disk drives, or flash memory based storage drives or other such solid state, magnetic, or optical storage devices.
- any connection is properly termed a computer-readable medium.
- the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
- DSL digital subscriber line
- Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
- the computer readable media can be an article comprising a machine-readable non-transitory tangible medium embodying information indicative of instructions that when performed by one or more machines result in computer implemented operations comprising the actions described throughout this specification.
- Operations as described herein can be carried out on or over a website.
- the website can be operated on a server computer, or operated locally, e.g., by being downloaded to the client computer, or operated via a server farm.
- the website can be accessed over a mobile phone or a PDA, or on any other client.
- the website can use HTML code in any form, e.g., MHTML, or XML, and via any form such as cascading style sheets (“CSS”) or other.
- the computers described herein may be any kind of computer, either general purpose, or some specific purpose computer such as a workstation.
- The programs may be written in C, Java, Brew, or any other programming language.
- the programs may be resident on a storage medium, e.g., magnetic or optical, e.g. the computer hard drive, a removable disk or media such as a memory stick or SD media, or other removable medium.
- the programs may also be run over a network, for example, with a server or other machine sending signals to the local machine, which allows the local machine to carry out the operations described herein.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Image Analysis (AREA)
Abstract
According to the invention, a target object is found in a target image by using a computer to determine an orientation field of at least a plurality of pixels of the target image, where the orientation field describes pixels at discrete positions of the target image being analyzed according to an orientation and a location. The orientation field is matched against an orientation field of model images in a database to compute match values between the orientation field of the target image and the orientation field of the model images in the database. A threshold is determined for the match values, and match values that exceed the threshold are counted to determine a match between the target and the model.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201261603087P | 2012-02-24 | 2012-02-24 | |
PCT/US2013/027696 WO2013126914A1 (fr) | 2012-02-24 | 2013-02-25 | Filtre de détection des traits au moyen de champs d'orientation |
Publications (1)
Publication Number | Publication Date |
---|---|
EP2817762A1 true EP2817762A1 (fr) | 2014-12-31 |
Family
ID=51947626
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP13751817.1A Withdrawn EP2817762A1 (fr) | 2012-02-24 | 2013-02-25 | Filtre de détection des traits au moyen de champs d'orientation |
Country Status (1)
Country | Link |
---|---|
EP (1) | EP2817762A1 (fr) |
-
2013
- 2013-02-25 EP EP13751817.1A patent/EP2817762A1/fr not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
See references of WO2013126914A1 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11210797B2 (en) | Systems, methods, and devices for image matching and object recognition in images using textures | |
US10360689B2 (en) | Detecting specified image identifiers on objects | |
JP5594852B2 (ja) | 物体認識用のヒストグラム方法及びシステム | |
US10699146B2 (en) | Mobile document detection and orientation based on reference object characteristics | |
CN102667810B (zh) | 数字图像中的面部识别 | |
US9754164B2 (en) | Systems and methods for classifying objects in digital images captured using mobile devices | |
US10127636B2 (en) | Content-based detection and three dimensional geometric reconstruction of objects in image and video data | |
US10410087B2 (en) | Automated methods and systems for locating document subimages in images to facilitate extraction of information from the located document subimages | |
US9508151B2 (en) | Systems, methods, and devices for image matching and object recognition in images using image regions | |
Mikolajczyk et al. | A comparison of affine region detectors | |
US8879796B2 (en) | Region refocusing for data-driven object localization | |
US9076056B2 (en) | Text detection in natural images | |
US8811751B1 (en) | Method and system for correcting projective distortions with elimination steps on multiple levels | |
US9679354B2 (en) | Duplicate check image resolution | |
CN107368829B (zh) | 确定输入图像中的矩形目标区域的方法和设备 | |
CN112445926B (zh) | 一种图像检索方法以及装置 | |
Yu et al. | Robust image hashing with saliency map and sparse model | |
EP3436865A1 (fr) | Détection à base de contenu et reconstruction géométrique tridimensionnelle d'objets dans des données d'image et vidéo | |
Huan et al. | Camera model identification based on dual-path enhanced ConvNeXt network and patches selected by uniform local binary pattern | |
US20130236054A1 (en) | Feature Detection Filter Using Orientation Fields | |
EP2817762A1 (fr) | Filtre de détection des traits au moyen de champs d'orientation | |
Lee et al. | An identification framework for print-scan books in a large database | |
Lee et al. | A restoration method for distorted comics to improve comic contents identification | |
Liu | Digits Recognition on Medical Device | |
Kakar | Passive approaches for digital image forgery detection |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20140924 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR |
|
AX | Request for extension of the european patent |
Extension state: BA ME |
|
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20150901 |