US20040234159A1 - Descriptors adjustment when using steerable pyramid to extract features for content based search - Google Patents

Descriptors adjustment when using steerable pyramid to extract features for content based search

Info

Publication number
US20040234159A1
US20040234159A1
Authority
US
United States
Prior art keywords
image
descriptors
images
laplacian
orientation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/693,369
Inventor
Lizhi Wang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to US10/693,369 priority Critical patent/US20040234159A1/en
Publication of US20040234159A1 publication Critical patent/US20040234159A1/en
Abandoned legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/52Scale-space analysis, e.g. wavelet analysis

Definitions

  • FIG. 2B illustrates an embodiment of a method for generating a multi-element image descriptor. Particularly, FIG. 2B illustrates, in block flow diagram format, one embodiment of a method of generating an image descriptor which is representative of a multi-band image for use in image processing.
  • Image features extracted from the output of spatial filters are often used for image representation.
  • the application of multi-band images to spatial filters enables the construction of feature sets which contain a wide range of spectral and spatial properties.
  • One such type of oriented spatial filter is the steerable filter.
  • Steerable filters are a class of filters in which a filter of arbitrary orientation is synthesized as a linear combination of a set of “basis filters”. Steerable filters obtain the response of a filter at any orientation from this small set of basis filters.
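As an illustrative sketch of this basis-filter idea (not the patent's own implementation), the x- and y-derivatives of a Gaussian form a classic basis pair, and the derivative filter at any angle θ is cos(θ)·Gx + sin(θ)·Gy:

```python
import numpy as np

def gaussian_derivative_basis(size=7, sigma=1.0):
    """Basis pair for first-derivative-of-Gaussian steerable filters:
    the x-derivative (0 degrees) and the y-derivative (90 degrees)."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    g = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return -x * g / sigma ** 2, -y * g / sigma ** 2

def steer(basis_x, basis_y, theta_deg):
    """Synthesize the filter at an arbitrary orientation as a linear
    combination of the two basis filters."""
    t = np.deg2rad(theta_deg)
    return np.cos(t) * basis_x + np.sin(t) * basis_y
```

Steering at 0 and 90 degrees reproduces the two basis filters themselves; intermediate angles interpolate between them.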
  • an image [I(x,y)] 100 is applied to the steerable filter [Filter fθ(x,y)] 105, which provides two different matrices for each image: an orientation matrix 110 and an energy matrix 115.
  • the energy matrix 115, also referred to as the Energy Map E(I(x,y)) 115, gives the energy at the dominant orientation at each pixel position (x,y) in accordance with equation (2):
  • E(x, y) = E(0°) + E(60°) + E(120°) + 2·√[E²(0°) + E²(60°) + E²(120°) − E(0°)·(E(60°) + E(120°)) − E(60°)·E(120°)]  (2)
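Equation (2) can be transcribed directly; this sketch assumes the three oriented energy responses are available as arrays (the clamp to zero guards against tiny negative radicands from floating-point error):

```python
import numpy as np

def energy_map(e0, e60, e120):
    """Per-pixel energy at the dominant orientation, per equation (2),
    from the oriented energy responses E(0 deg), E(60 deg), E(120 deg)."""
    radicand = (e0 ** 2 + e60 ** 2 + e120 ** 2
                - e0 * (e60 + e120) - e60 * e120)
    return e0 + e60 + e120 + 2.0 * np.sqrt(np.maximum(radicand, 0.0))
```

For equal responses at all three orientations the radicand vanishes (no dominant direction) and the map reduces to the plain sum of the three responses.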
  • For the Orientation Map Θ(I(x,y)) 110 and the Energy Map E(I(x,y)) 115, a corresponding histogram or set of histograms is used to represent global information, along with a set of co-occurrence matrices which are used to represent local information.
  • the Orientation Map Θ(I(x,y)) 110 is represented as a corresponding orientation histogram H(Θ) 120 and a set of orientation co-occurrence matrices CΘ 125.
  • each image 100 is represented by a corresponding orientation histogram H(Θ) 120, a set of orientation co-occurrence matrices CΘ 125, a corresponding energy histogram H(E) 130, and a set of energy co-occurrence matrices CE 135.
  • the descriptors extracted from the orientation histogram H(Θ) 120 of the Orientation Map Θ(I(x,y)) 110 are peak descriptors (PD) 140 and statistic descriptors (SD1) 145.
  • the peak descriptors (PD) 140 comprise position, value, and shape data associated with the orientation histogram H(Θ) 120.
  • the statistic descriptors (SD1) 145 indicate the mean, standard deviation, and third- and fourth-order moments associated with the orientation histogram H(Θ) 120. Select elements within the peak descriptors (PD) 140 are used to classify images into different categories, whereas the statistic descriptors (SD1) 145 are used to describe the shape of the orientation histogram H(Θ) 120.
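A minimal sketch of such statistic descriptors computed from a histogram (the function name and bin-center handling are assumptions, not the patent's code):

```python
import numpy as np

def statistic_descriptors(hist, bin_centers):
    """Mean, standard deviation, and third and fourth central moments
    of the distribution a histogram describes."""
    p = hist / hist.sum()                 # normalize to probabilities
    mean = float(np.sum(bin_centers * p))
    d = bin_centers - mean
    std = float(np.sqrt(np.sum(d ** 2 * p)))
    m3 = float(np.sum(d ** 3 * p))        # third central moment
    m4 = float(np.sum(d ** 4 * p))        # fourth central moment
    return mean, std, m3, m4
```

A symmetric histogram yields a zero third moment, matching the intuition that these values describe the histogram's shape.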
  • Descriptors extracted from the orientation co-occurrence matrices CΘ 125 of the Orientation Map Θ(I(x,y)) 110 are co-occurrence descriptors (COD1) 150.
  • the co-occurrence descriptors (COD1) 150 comprise maximum probability, entropy, uniformity, mean, correlation, and difference moments.
  • the co-occurrence descriptors (COD1) 150 in the present embodiment are computed in four different orientations (−45 Degrees, 0 Degrees, 45 Degrees, and 90 Degrees).
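A sketch of how one such matrix and a subset of these descriptors might be computed from a quantized orientation map (the displacement convention and the function names are assumptions, not the patent's code):

```python
import numpy as np

def cooccurrence_matrix(quantized, dy, dx, levels):
    """Normalized co-occurrence matrix for one displacement; offsets
    such as (0, 1), (-1, 1), (-1, 0), (-1, -1) correspond to the
    0, 45, 90, and -45 degree directions."""
    h, w = quantized.shape
    C = np.zeros((levels, levels))
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                C[quantized[y, x], quantized[y2, x2]] += 1.0
    return C / C.sum()

def cooccurrence_descriptors(C):
    """Maximum probability, entropy, and uniformity -- a subset of the
    co-occurrence descriptors listed above."""
    nz = C[C > 0]
    return {"max_probability": float(C.max()),
            "entropy": float(-np.sum(nz * np.log2(nz))),
            "uniformity": float(np.sum(C * C))}
```

A perfectly uniform region concentrates all probability mass in one cell, giving maximum probability and uniformity of 1 and entropy of 0.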
  • the descriptors extracted from the energy histogram H(E) 130 of the Energy Map E(I(x,y)) 115 are statistic descriptors (SD2) 155.
  • the statistic descriptors (SD2) 155 indicate the mean, standard deviation, and third- and fourth-order moments associated with the energy histogram H(E) 130.
  • the statistic descriptors (SD2) 155 associated with the energy histogram H(E) 130 are used to describe the shape of the energy histogram H(E) 130.
  • the descriptors extracted from the energy co-occurrence matrices CE 135 of the Energy Map E(I(x,y)) 115 are co-occurrence descriptors (COD2) 160.
  • the co-occurrence descriptors (COD2) 160 comprise maximum probability, entropy, uniformity, mean, correlation, and difference moments.
  • the co-occurrence descriptors (COD2) 160 in the present embodiment are computed in four different orientations (−45 Degrees, 0 Degrees, 45 Degrees, and 90 Degrees).
  • each individual descriptor 170 associated with an image comprises peak descriptors (PD) 140, statistic descriptors (SD1) 145, co-occurrence descriptors (COD1) 150, statistic descriptors (SD2) 155, and co-occurrence descriptors (COD2) 160, which are combined to form an image descriptor 165.
  • the image descriptor 165 is a full representation of each image which may be used for image processing.
  • an image descriptor 165 is generated for each information band comprising the multi-band image; as such, each information band associated with each multi-band image has a corresponding image descriptor 165.
  • a multi-band image using the RGB color spectrum would have an individual image descriptor 165 for each information or color band (RGB) of the multi-band image.
  • the input image [I(x,y)] 100 may carry a negative value when applied to the steerable filter [Filter fθ(x,y)] 105 and thus needs to be adjusted.
  • the subtraction of the two levels of the Gaussian images may produce a Laplacian image or the [I(x,y)] 100 image having a negative value.
  • the Laplacian image having a negative intensity value cannot be used as input to the steerable filter since the output orientation data and energy data will not be meaningful. Consequently, the respective orientation map 110 and energy map 115 cannot be used to create image descriptors.
  • the negativity of the input image [I(x,y)] 100 is determined by measuring its intensity value.
  • Gaussian image 23 may possess a higher intensity value than Gaussian image 22.
  • Laplacian image 25 is generated by the subtraction of the two Gaussian images at the corresponding levels, in this example causing Laplacian image 25 to possess a negative intensity value.
  • an adjustment is made by adding a constant value C to the input image [I(x,y)] 100, or the Laplacian image, such that the adjusted Laplacian input image I′(x,y) = I(x,y) + C is non-negative, where:
  • E(θ) is the output of the steerable filter when the steerable filter is applied to the image I(x,y)
  • C is the resulting adjustment to the output of the steerable filter when the adjustment value C is added to the input image I(x,y).
  • E′(x, y) = E(0°)(x, y) + E(60°)(x, y) + E(120°)(x, y) + 2·√[E²(0°) + E²(60°) + E²(120°) − E(0°)·(E(60°) + E(120°)) − E(60°)·E(120°)] + 3C
  • the orientation histogram H(Θ) 120 and the orientation co-occurrence matrices CΘ 125 do not change.
  • the energy histogram H(E) 130 and the energy co-occurrence matrices CE 135 retain their shapes but are spatially shifted by a value of 3C.
  • the centered moments are used as descriptors to eliminate the translation caused by the DC component C.
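Because central moments are taken about the mean, a constant offset such as the 3C shift cancels out. This small sketch (illustrative, not the patent's code) checks that property:

```python
import numpy as np

def central_moments(values, orders=(2, 3, 4)):
    """Central moments of a sample; computed about the mean, so they
    are invariant to adding a constant (DC) offset."""
    m = values.mean()
    return [float(np.mean((values - m) ** k)) for k in orders]

energies = np.array([0.5, 1.5, 2.0, 4.0])
C = 7.0
# The 3C shift in the energy data leaves the central moments unchanged.
assert central_moments(energies) == central_moments(energies + 3 * C)
```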
  • although the intensity value is used in this embodiment to detect negativity, other image characteristics may be used.
  • FIG. 3A and FIG. 3B illustrate a multi-band image along with the corresponding respective orientation map and energy map associated with the multi-band image.
  • FIG. 3A and FIG. 3B also illustrate the corresponding histograms (0 Degrees to 181 Degrees) and matrices, in addition to the corresponding image descriptors, associated with the multi-band image.
  • the multi-band image has an associated image descriptor which describes the corresponding multi-band image, each image descriptor comprising a peak descriptor (PD), a statistic descriptor (SD1), a co-occurrence descriptor (COD1), a statistic descriptor (SD2), and a co-occurrence descriptor (COD2).
  • In FIG. 3A and FIG. 3B, a representative orientation histogram H(Θ) 305 and other related information (histograms/matrices) are illustrated for each information band 310 (e.g., RGB color bands) of the multi-band images 300.
  • the corresponding image descriptor 320 provided for each information band 310 of the multi-band image 300 contains data corresponding to each representative orientation histogram H(Θ) 305 and other related information associated with each information band 310.
  • the representative orientation histograms H(Θ) 305 corresponding to the multi-band images 300 of FIG. 3A exhibit large peaks 315 along the different coordinates of the corresponding orientation histograms H(Θ) 305.
  • the large peaks 315 represented in each representative orientation histogram H(Θ) 305 are likewise represented in the corresponding image descriptors 320 associated with each corresponding information band 310 as large peak representative data. Accordingly, those image descriptors 320 which contain large peak representative data, corresponding to the large peaks 315 represented in each representative orientation histogram H(Θ) 305, are classified into the large peak category.
  • For instance, where the orientation histogram H(Θ) 305 associated with the (G) information band (e.g., green color band) of FIG. 3A and FIG. 3B exhibits a series of large peaks 315 along different coordinates (1 Degree, 89 Degrees, and 179 Degrees) of the corresponding orientation histogram H(Θ) 305, the series of large peaks 315 are likewise reflected in the corresponding image descriptor 320 at 1 Degree, 89 Degrees, and 179 Degrees.
  • The same applies to the orientation histograms H(Θ) 305 associated with the (B) information band (e.g., blue color band) and the (R) information band (e.g., red color band) of FIG. 3A and FIG. 3B.
  • FIG. 4 illustrates an embodiment of an exemplary computer system that can be used with the present invention.
  • the various components shown in FIG. 4 are provided by way of example. Certain components of the computer in FIG. 4 can be deleted from the addressing system for a particular implementation of the invention.
  • the computer shown in FIG. 4 may be any type of computer including a general purpose computer.
  • FIG. 4 illustrates a system bus 400 to which various components are coupled.
  • a processor 402 performs the processing tasks required by the computer.
  • Processor 402 may be any type of processing device capable of implementing the steps necessary to perform the addressing and delivery operations discussed above.
  • An input/output (I/O) device 404 is coupled to bus 400 and provides a mechanism for communicating with other devices coupled to the computer.
  • a read-only memory (ROM) 406 and a random access memory (RAM) 408 are coupled to bus 400 and provide a storage mechanism for various data and information used by the computer. Although ROM 406 and RAM 408 are shown coupled to bus 400 , in alternate embodiments, ROM 406 and RAM 408 are coupled directly to processor 402 or coupled to a dedicated memory bus (not shown).
  • a video display 410 is coupled to bus 400 and displays various information and data to the user of the computer.
  • a disk drive 412 is coupled to bus 400 and provides for the long-term mass storage of information. Disk drive 412 may be used to store various profile data-sets and other data generated by and used by the addressing and delivery system.
  • a keyboard 414 and pointing device 416 are also coupled to bus 400 and provide mechanisms for entering information and commands to the computer.
  • a printer 418 is coupled to bus 400 and is capable of creating a hard copy of information generated by or used by the computer.
  • FIG. 5 illustrates an embodiment of an exemplary computer-readable medium 500 containing various sets of instructions, code sequences, configuration information, and other data used by a computer or other processing device.
  • the embodiment illustrated in FIG. 5 is suitable, for example, to use with the peak determination method described above.
  • the various information stored on medium 500 is used to perform various data processing operations.
  • Computer-readable medium 500 is also referred to as a processor-readable medium.
  • Computer-readable medium 500 can be any type of magnetic, optical, or electrical storage medium including a diskette, magnetic tape, CD-ROM, memory device, or other storage medium.
  • Computer-readable medium 500 includes interface code 502 that controls the flow of information between various devices or components in the computer system. Interface code 502 may control the transfer of information within a device (e.g., between the processor and a memory device), or between an input/output port and a storage device. Additionally, interface code 502 may control the transfer of information from one device to another. Computer-readable medium 500 may also include other programs working with one another to produce a result in accordance with the present invention. For example, computer-readable medium 500 may include a program that accepts an original image as input and applies appropriate Gaussian filters to generate Gaussian images, as shown in block 504.
  • a Laplacian image generation program 506 may be responsible for generating Laplacian images by using the Gaussian images of program 504 as its input. Prior to executing steerable filter program 510 , the intensity value of the Laplacian image is tested for negativity by program 508 . Furthermore, in the process of finding image descriptors, program 512 may be executed for peak determination.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A method of adjusting image descriptors when using a steerable pyramid to extract features for content-based search is disclosed. Using a steerable pyramid, the original image is filtered to produce Gaussian images and Laplacian images. The image descriptors are formed by filtering the Laplacian images. The filtering of the Laplacian images provides orientation data and energy data. From the orientation data, a first set of global information and local information is generated. From the energy data, a second set of global information and local information is generated. The descriptors are extracted from the respective sets of global information and local information associated with the orientation data and the energy data. The descriptors associated with the orientation data and the descriptors associated with the energy data are combined to form an image descriptor. The Laplacian image may be adjusted prior to the application of the filter. The adjustment is necessary since the value of the Laplacian image may not always be positive. When the adjustment is made to the Laplacian image, a spatial shift occurs in the corresponding energy data.

Description

    FIELD OF THE INVENTION
  • The present invention is in the field of image descriptors and image processing. More specifically, the present invention relates to a technique for image descriptor adjustment when using steerable pyramids to extract image features for content-based search. [0001]
  • BACKGROUND
  • As image processing applications become more complex, an image search engine needs to be able to search and retrieve information about images effectively and efficiently. Images are often retrieved from a database by similarity of image features. Image processing allows for the comparison of a reference image against another image or multiple images in order to determine a “match” or correlation between the respective images. Accordingly, a variety of different image-matching techniques have been employed to determine such a match or correlation between images. [0002]
  • One such image matching technique is known as object classification. The object classification technique operates by segmenting the original image into a series of discrete objects. These discrete objects are then measured using a variety of shape measurement identifications, such as shape dimensions and statistics, to identify each discrete object. Accordingly, each of the discrete objects is then classified into different categories by comparing the shape measurement identifications associated with each of the discrete objects against known shape measurement identifications of known reference objects, in order to determine a correlation or match between the images. [0003]
  • Another image matching technique utilized in determining a match between images is a process known as match filtering. Match filtering utilizes a pixel-by-pixel or image mask comparison of an area of interest associated with the proffered image against a corresponding interest area contained in the reference image. Accordingly, provided the area of interest associated with the proffered image matches the corresponding interest area of the reference image, via comparison, an area or pixel match between the images is accomplished and the images are considered to match. [0004]
  • Yet another technique utilizes a series of textual descriptors which are associated with different reference images. The textual descriptors describe the image with textual descriptions, such as shape (e.g., round), color (e.g., green), and item (e.g., ball). Accordingly, when a proffered image is received for comparison, the textual descriptor of the proffered image is compared against the textual descriptors associated with the reference images. As such, the textual descriptors associated with the respective images under comparison are compared to each other in order to determine a best match between the textual descriptions associated with each image, and therefore a match between the respective images. [0005]
  • Each of the aforementioned image matching techniques uses different types of data or partial image data to describe the images. However, these techniques typically may not use the actual full image data associated with each image. Accordingly, these techniques do not provide for an optimally accurate image comparison since they are limited to the usage of only a small or fractional portion of the full representative image data. Thus, when a search for similar images is conducted against a basis image, these techniques often result in the matching of very different images as having a correlation to one another. This partial-matching result is due in part to the limited amount or type of data used in the image comparison process. [0006]
  • SUMMARY OF THE INVENTION
  • A method of creating image descriptors by applying steerable filter to Laplacian images of a steerable pyramid is disclosed. The Laplacian images are generated by two corresponding Gaussian images in the steerable pyramid. If the Laplacian images possess negativity, they are adjusted accordingly to eliminate the negativity. Steerable filters are applied to the non-negative Laplacian images to generate orientation data and energy data. The adjustment made to the Laplacian images are correspondingly removed from the orientation data and the energy data. [0007]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The present invention is illustrated by way of example in the following drawings in which like references indicate similar elements. The following drawings disclose various embodiments of the present invention for purposes of illustration only and are not intended to limit the scope of the invention. [0008]
  • FIG. 1 illustrates an exemplary simplified flow diagram showing generation of image descriptors from an input image. [0009]
  • FIG. 2A illustrates one exemplary steerable pyramid. [0010]
  • FIG. 2B illustrates an exemplary block diagram showing construction of an image descriptor from an original image. [0011]
  • FIG. 3A illustrates an exemplary multi-band image along with its corresponding orientation map, energy map, histogram, and co-occurrence matrix. [0012]
  • FIG. 3B illustrates an exemplary block diagram of the image descriptor corresponding to the multi-band image of FIG. 3A. [0013]
  • FIG. 4 illustrates an exemplary computer system that can be used in accordance with the present invention. [0014]
  • FIG. 5 illustrates an exemplary computer-readable medium that can be used in accordance with the present invention. [0015]
  • DETAILED DESCRIPTION
  • The following detailed description sets forth numerous specific details to provide a thorough understanding of the invention. However, those of ordinary skill in the art will appreciate that the invention may be practiced without these specific details. In other instances, well-known methods, procedures, protocols, components, algorithms, and circuits have not been described in detail so as not to obscure the invention. [0016]
  • FIG. 1 illustrates an exemplary simplified flow diagram of one embodiment of the present invention. At step 11, an original image is received. At step 12, using Gaussian filters in the Gaussian pyramid, the original image is filtered to produce lower-resolution images of the original image. These filtered images are then used to create images in the Laplacian pyramid in step 13. At step 14, the Laplacian images are adjusted, if necessary, to keep them from taking negative values. This ensures that the input to the steerable filters of step 15 is always positive. The steerable filters generate an orientation map and an energy map for the corresponding input images. Finally, in step 16, the image descriptors of the original image are created and may subsequently be processed. [0017]
  • A steerable pyramid is a multi-scale, multi-oriented linear image decomposition that provides a useful front-end for many computer image applications. In one embodiment of the steerable pyramid, the original image is decomposed into sets of low-pass and band-pass components via Gaussian and Laplacian pyramids, respectively. [0018]
  • The Gaussian pyramid consists of low-pass filtered (LPF) versions of the input image at decreasing resolutions. The LPF versions of the input image are the Gaussian filter responses. FIG. 2A illustrates an exemplary representation of the steerable pyramid as used in one embodiment of the present invention. Each stage of the Gaussian pyramid is computed by low-pass filtering and sub-sampling the previous stage. For example, starting with the [0019] original image 21, the lower-resolution images 22 and 23 in the Gaussian pyramid are generated by applying Gaussian filters to the image in the previous stage.
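  • As an illustrative sketch (not the patented implementation), the Gaussian pyramid construction described above can be expressed in Python with NumPy; the 5-tap binomial kernel, the edge padding, and the number of levels are assumptions made for this sketch:

```python
import numpy as np

# 5-tap binomial kernel, a common approximation of a Gaussian low-pass filter
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(image):
    """Separable low-pass filter: convolve rows, then columns."""
    padded = np.pad(image, 2, mode='edge')
    rows = np.apply_along_axis(lambda r: np.convolve(r, KERNEL, 'valid'), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, KERNEL, 'valid'), 0, rows)

def gaussian_pyramid(image, levels):
    """Each stage is the previous stage low-pass filtered and sub-sampled by 2."""
    pyramid = [image.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid
```

Each successive level is half the resolution of the previous one, which matches the decreasing-resolution structure described for images 21, 22, and 23.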
  • Once the appropriate Gaussian pyramid images have been generated, the Laplacian pyramid images can be generated. The Laplacian pyramid consists of band-pass filtered (BPF) versions of the input image. Each stage of the Laplacian pyramid is formed by subtracting the two neighboring Gaussian pyramid levels. In FIG. 2A, the Laplacian pyramid includes [0020] images 24, 25, and 26. In this example, Laplacian image 25 is derived from Gaussian image 22 and Gaussian image 23. The bottom level of the Laplacian pyramid 26 is merely the bottom level of the Gaussian pyramid 23, as there is nothing left to subtract. The pyramid algorithm breaks the image down into a series of frequency bands which, when summed together, reproduce the original image without any distortion.
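  • In the same illustrative spirit (a sketch, not the patent's exact filters), each Laplacian level can be formed by subtracting the next, coarser Gaussian level from the current one, and summing the bands reconstructs the original exactly; nearest-neighbor upsampling of the coarser level before subtraction is an assumption made here for brevity:

```python
import numpy as np

def upsample(image, shape):
    """Nearest-neighbor upsampling to the finer level's shape (an assumption
    made for brevity; smoother interpolation is typical in practice)."""
    ri = (np.arange(shape[0]) * image.shape[0]) // shape[0]
    ci = (np.arange(shape[1]) * image.shape[1]) // shape[1]
    return image[np.ix_(ri, ci)]

def laplacian_pyramid(gaussian_levels):
    """Each band-pass level is a Gaussian level minus the next (coarser) one,
    upsampled back to the finer resolution.  The bottom level is kept as-is."""
    laplacian = []
    for fine, coarse in zip(gaussian_levels, gaussian_levels[1:]):
        laplacian.append(fine - upsample(coarse, fine.shape))
    laplacian.append(gaussian_levels[-1])  # nothing left to subtract
    return laplacian

def reconstruct(laplacian_levels):
    """Summing the bands back together reproduces the original image."""
    image = laplacian_levels[-1]
    for band in reversed(laplacian_levels[:-1]):
        image = band + upsample(image, band.shape)
    return image
```

Because each band stores exactly what the subtraction removed, the reconstruction is distortion-free regardless of which low-pass filter produced the Gaussian levels.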
  • One aspect of this pyramid algorithm is that the result of the subtraction of the two Gaussian images is not always positive. According to the example shown in FIG. 2A, the output image of a Laplacian filter, or, in other words, the subtraction of the two corresponding Gaussian images, might be negative. In one embodiment, prior to applying the steerable filters to the Laplacian image, the subtraction result is verified. If the result is negative, an adjustment is added to keep it from being negative. The adjustment is later removed without introducing any variation to the image. The output of the steerable filter is an orientation map and an energy map. FIG. 2A illustrates the [0021] output orientation map 27 and the output energy map 28 corresponding to the input Laplacian image 24.
  • FIG. 2B illustrates an embodiment of a method for generating a multi-element image descriptor. Particularly, FIG. 2B illustrates, in block flow diagram format, one embodiment of a method of generating an image descriptor which is representative of a multi-band image for use in image processing. [0022]
  • Image features extracted from the output of spatial filters are often used for image representation. The application of multi-band images to spatial filters enables the construction of feature sets which contain a wide range of spectral and spatial properties. One such type of oriented spatial filter is the steerable filter. Steerable filters are a class of filters in which a filter of arbitrary orientation is synthesized as a linear combination of a set of “basis filters”. Steerable filters obtain information about the response of a filter at any orientation using this small set of basis filters. [0023]
  • In one embodiment, [0024]

    x^2 exp(-(x^2 + y^2) / (2σ^2))
  • is chosen as the kernel of the steerable filter. Accordingly, for this kernel, a steerable filter at an arbitrary orientation θ can be synthesized using a linear combination of three basis filters according to: [0025]

    h_θ(x, y) = k1(θ) h0(x, y) + k2(θ) h60(x, y) + k3(θ) h120(x, y),

    where

    h0(x, y) = x^2 exp(-(x^2 + y^2) / (2σ^2)),
    h60(x, y) = ((1/2)x + (√3/2)y)^2 exp(-(x^2 + y^2) / (2σ^2)),
    h120(x, y) = (-(1/2)x + (√3/2)y)^2 exp(-(x^2 + y^2) / (2σ^2)),

    and

    k1(θ) = 1 + 2 cos 2θ,
    k2(θ) = 1 - cos 2θ + √3 sin 2θ,
    k3(θ) = 1 - cos 2θ - √3 sin 2θ.
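  • As a numerical check of the synthesis above, the following sketch builds the three basis kernels on a small grid and steers them with the interpolation functions k1, k2, and k3; the grid size and σ are assumptions of this sketch. With these constants the synthesized kernel comes out equal to three times the correspondingly rotated basis kernel, a constant scale factor that does not affect the dominant-orientation computation:

```python
import numpy as np

SIGMA = 1.0  # filter scale; the value is an assumption for this sketch

def _grid(size=9):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    return x, y

def basis_kernel(angle_deg, size=9):
    """Basis kernel (x cos a + y sin a)^2 * exp(-(x^2 + y^2) / (2 sigma^2));
    angle 0 gives h0, 60 gives h60, 120 gives h120."""
    x, y = _grid(size)
    a = np.deg2rad(angle_deg)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * SIGMA**2))
    return (x * np.cos(a) + y * np.sin(a))**2 * envelope

def steer(theta, size=9):
    """Synthesize a kernel at orientation theta (radians) from the three
    basis kernels at 0, 60, and 120 degrees via k1, k2, k3."""
    k1 = 1.0 + 2.0 * np.cos(2.0 * theta)
    k2 = 1.0 - np.cos(2.0 * theta) + np.sqrt(3.0) * np.sin(2.0 * theta)
    k3 = 1.0 - np.cos(2.0 * theta) - np.sqrt(3.0) * np.sin(2.0 * theta)
    return (k1 * basis_kernel(0, size)
            + k2 * basis_kernel(60, size)
            + k3 * basis_kernel(120, size))
```

Expanding the quadratic forms shows the identity holds exactly for every θ, which is the defining property of a steerable basis.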
  • As illustrated in the embodiment of FIG. 2B, an image [I(x,y)] [0026] 100 is applied to the steerable filter [Filter f θ (x,y)] 105 which provides two different matrices for each image, an orientation matrix 110 and an energy matrix 115. The orientation matrix 110, also referred to as an Orientation Map Θ (I(x,y)) 110, is derived by computing the dominant orientation at each pixel position (x,y) by using equation (1):

    θ(x, y) = (1/2) arctan( √3 (E(60°)(x, y) - E(120°)(x, y)) / (2E(0°)(x, y) - E(60°)(x, y) - E(120°)(x, y)) )    (1)
  • The [0027] energy matrix 115, also referred to as an Energy Map E (I(x,y)) 115, corresponds to the dominant orientation at each pixel position (x,y) in accordance with equation (2):

    E(x, y) = E(0°) + E(60°) + E(120°) + 2√( E(0°)^2 + E(60°)^2 + E(120°)^2 - E(0°)(E(60°) + E(120°)) - E(60°)E(120°) )    (2)
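  • Equations (1) and (2) can be sketched directly, assuming the three oriented energy responses E(0°), E(60°), and E(120°) are available as per-pixel arrays; the use of arctan2 rather than arctan, to keep the result defined when the denominator vanishes, is an implementation choice of this sketch:

```python
import numpy as np

def orientation_map(e0, e60, e120):
    """Dominant orientation per pixel, per equation (1)."""
    num = np.sqrt(3.0) * (e60 - e120)
    den = 2.0 * e0 - e60 - e120
    return 0.5 * np.arctan2(num, den)

def energy_map(e0, e60, e120):
    """Energy along the dominant orientation per pixel, per equation (2).
    The radicand equals half the sum of squared pairwise differences of the
    three responses, so it is never negative."""
    root = np.sqrt(e0**2 + e60**2 + e120**2
                   - e0 * (e60 + e120) - e60 * e120)
    return e0 + e60 + e120 + 2.0 * root
```

For example, responses (3, 1, 1) at a pixel give a dominant orientation of 0 and an energy of 3 + 1 + 1 + 2·√4 = 9.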
  • Accordingly, for each map, the Orientation Map Θ (I(x,y)) [0028] 110 and the Energy Map E (I(x,y)) 115, a corresponding histogram or set of histograms is used to represent global information, along with a set of co-occurrence matrices used to represent local information. As such, the Orientation Map Θ (I(x,y)) 110 is represented by a corresponding orientation histogram H(θ) 120 and a set of orientation co-occurrence matrices C Θ 125. Similarly, the Energy Map E (I(x,y)) 115 is represented by a corresponding energy histogram H(E) 130 and a set of energy co-occurrence matrices CE 135. Therefore, each image 100 is represented by a corresponding orientation histogram H(θ) 120, a set of orientation co-occurrence matrices C Θ 125, a corresponding energy histogram H(E) 130, and a set of energy co-occurrence matrices CE 135.
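  • A minimal sketch of these two representations, assuming the orientation map has already been computed and is quantized into discrete levels for the co-occurrence counts (the bin count and pixel offsets are illustrative choices, not the patent's):

```python
import numpy as np

def orientation_histogram(theta_map, bins=180):
    """Global information: histogram of orientations quantized into bins."""
    q = np.floor((theta_map % np.pi) / np.pi * bins).astype(int) % bins
    return np.bincount(q.ravel(), minlength=bins)

def cooccurrence(quantized, dr, dc, levels):
    """Local information: counts of level pairs at a fixed pixel offset
    (dr, dc), e.g. (0, 1) for 0 degrees or (-1, 1) for 45 degrees."""
    h, w = quantized.shape
    m = np.zeros((levels, levels), dtype=int)
    for r in range(max(0, -dr), min(h, h - dr)):
        for c in range(max(0, -dc), min(w, w - dc)):
            m[quantized[r, c], quantized[r + dr, c + dc]] += 1
    return m
```

One such matrix per offset direction yields the set of co-occurrence matrices; the same two routines apply to the energy map after quantizing its values.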
  • Next, a series of descriptors is extracted from each of the corresponding histograms and co-occurrence matrices. In one embodiment, the descriptors extracted from the orientation histogram H(θ) [0029] 120 of the Orientation Map Θ (I(x,y)) 110 are peak descriptors (PD) 140 and statistic descriptors (SD1) 145. The peak descriptors (PD) 140 comprise position, value, and shape data associated with the orientation histogram H(θ) 120. The statistic descriptors (SD1) 145 indicate the mean, standard deviation, and third- and fourth-order moments associated with the orientation histogram H(θ) 120. Select elements within the peak descriptors (PD) 140 are used to classify images into different categories, whereas the statistic descriptors (SD1) 145 are used to describe the shape of the orientation histogram H(θ) 120.
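  • The statistic and peak descriptors named above might be computed from a histogram as in the following sketch; reducing the peak's shape data to the position and value of the single largest peak is a simplification made here for illustration:

```python
import numpy as np

def statistic_descriptors(hist):
    """Mean, standard deviation, and third- and fourth-order central moments,
    treating bin indices as the variable and normalized counts as
    probabilities."""
    p = hist / hist.sum()
    x = np.arange(len(hist))
    mean = np.sum(x * p)
    var = np.sum((x - mean)**2 * p)
    m3 = np.sum((x - mean)**3 * p)
    m4 = np.sum((x - mean)**4 * p)
    return mean, np.sqrt(var), m3, m4

def peak_descriptor(hist):
    """Position and value of the largest histogram peak; fuller shape data
    (e.g. peak width) could be extracted similarly."""
    i = int(np.argmax(hist))
    return i, hist[i]
```

A symmetric histogram, for instance, yields a third-order moment of zero, which is the kind of shape information the statistic descriptors capture.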
  • Descriptors extracted from the orientation co-occurrence [0030] matrices C Θ 125 of the Orientation Map Θ (I(x,y)) 110 are co-occurrence descriptors (COD1) 150. The co-occurrence descriptors (COD1) 150 comprise maximum probability, entropy, uniformity, mean, correlation, and difference moments. The co-occurrence descriptors (COD1) 150 in the present embodiment are computed in four different orientations (-45 Degrees, 0 Degrees, 45 Degrees, and 90 Degrees).
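  • A sketch of several of these co-occurrence descriptors, computed from one normalized co-occurrence matrix (mean and correlation follow the same pattern and are omitted here for brevity):

```python
import numpy as np

def cooccurrence_descriptors(m):
    """Maximum probability, entropy, uniformity (energy), and a difference
    moment of a co-occurrence matrix, after normalizing it to probabilities."""
    p = m / m.sum()
    i, j = np.indices(p.shape)
    nonzero = p[p > 0]
    return {
        'max_probability': p.max(),
        'entropy': -np.sum(nonzero * np.log2(nonzero)),
        'uniformity': np.sum(p**2),
        'difference_moment': np.sum((i - j)**2 * p),
    }
```

Running this over the matrices for each of the four offsets (-45, 0, 45, and 90 degrees) would yield the full COD set for one map.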
  • Correspondingly, the descriptors extracted from the energy histogram H(E) [0031] 130 of the Energy Map E (I(x,y)) 115 are statistic descriptors (SD2) 155. The statistic descriptors (SD2) 155 indicate the mean, standard deviation, and third- and fourth-order moments associated with the energy histogram H(E) 130. The statistic descriptors (SD2) 155 are used to describe the shape of the energy histogram H(E) 130.
  • Likewise, the descriptors extracted from the energy co-occurrence [0032] matrices CE 135 of the Energy Map E (I(x,y)) 115 are co-occurrence descriptors (COD2) 160. The co-occurrence descriptors (COD2) 160 comprise maximum probability, entropy, uniformity, mean, correlation, and difference moments. The co-occurrence descriptors (COD2) 160 in the present embodiment are computed in four different orientations (-45 Degrees, 0 Degrees, 45 Degrees, and 90 Degrees).
  • The descriptors associated with an image are combined in order to form a feature vector or [0033] image descriptor 165. As such, in one embodiment, each individual descriptor set 170 associated with an image comprises peak descriptors (PD) 140, statistic descriptors (SD1) 145, co-occurrence descriptors (COD1) 150, statistic descriptors (SD2) 155, and co-occurrence descriptors (COD2) 160, which are combined to form an image descriptor 165. As such, the image descriptor 165 is a full representation of each image which may be used for image processing. For multi-band applications, an image descriptor 165 is generated for each information band comprising the multi-band image; thus, each information band associated with each multi-band image has a corresponding image descriptor 165. For instance, a multi-band image using the RGB color spectrum would have an individual image descriptor 165 for each information or color band (RGB) of the multi-band image.
  • One aspect of the present invention, which will be appreciated from the above discussion with reference to FIG. 2B, is that the input image [I(x,y)] [0034] 100 may carry a negative value when applied to the steerable filter [Filter f θ (x,y)] 105 and thus needs to be adjusted. For example, under certain circumstances, the subtraction of the two levels of the Gaussian images may produce a Laplacian image, or [I(x,y)] 100 image, having a negative value. A Laplacian image having a negative intensity value cannot be used as input to the steerable filter since the output orientation data and energy data will not be meaningful. Consequently, the respective orientation map 110 and energy map 115 cannot be used to create image descriptors.
  • In one embodiment of the present invention, the negativity of the input image [I(x,y)] [0035] 100 is determined by measuring its intensity value. For example, as shown in FIG. 2A, Gaussian image 23 may possess a higher intensity value than Gaussian image 22. As previously described, Laplacian image 25 is generated by the subtraction of the two Gaussian images at the corresponding levels, in this example causing Laplacian image 25 to possess a negative intensity value. Referring back to FIG. 2B, based on this intensity value, an adjustment is made by adding a constant value C to the input image [I(x,y)] 100, or the Laplacian image, such that the adjusted Laplacian input image I′(x,y) is positive and is of the form:
  • I′(x,y)=I(x,y)+C
  • Accordingly, with the adjusted input image I′(x,y), the output of the steerable basis filter can be expressed by the equation: [0036]
  • E′(θ)=E(θ)+C
  • where E(θ) is the output of the steerable filter when the steerable filter is applied to the image I(x,y), and where C is the resulting adjustment to the output of the steerable filter when the adjustment value C is added to the input image I(x,y). [0037]
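  • The verification-and-adjustment step might be sketched as follows; choosing C as the magnitude of the most negative value is an assumption of this sketch, since the text only requires that the adjusted image not be negative, and returning C alongside the image allows the adjustment to be removed later:

```python
import numpy as np

def adjust_laplacian(image):
    """Verify the Laplacian image for negative values; if any are present,
    add a constant C so the adjusted image I'(x,y) = I(x,y) + C is
    non-negative.  Return both the adjusted image and C so the resulting
    adjustment can be removed after filtering."""
    minimum = image.min()
    c = float(-minimum) if minimum < 0 else 0.0
    return image + c, c
```

Subtracting the returned C restores the original values exactly, which is what makes the later removal of the adjustment lossless.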
  • Using the same adjusted Laplacian image I′(x,y) as the input image of the steerable filter, the Orientation map Θ′ (I′(x,y)) can be represented by the following equation: [0038]

    θ′(x, y) = (1/2) arctan( √3 (E(60°)(x, y) + C - E(120°)(x, y) - C) / (2E(0°)(x, y) + 2C - E(60°)(x, y) - C - E(120°)(x, y) - C) )
  • The C terms in θ′(x,y) cancel one another, leaving Θ′(I′(x,y))=Θ(I(x,y)). Thus the Orientation map is invariant to the DC component change. [0039]
  • Similarly, using the same adjusted Laplacian image I′(x,y) as the input image of the steerable filter, the energy map E′(I′(x,y)) can be represented by the following equation: [0040]

    E′(x, y) = E(0°)(x, y) + E(60°)(x, y) + E(120°)(x, y) + 2√( E(0°)^2 + E(60°)^2 + E(120°)^2 - E(0°)(E(60°) + E(120°)) - E(60°)E(120°) ) + 3C
  • Thus, the orientation histogram H(θ) [0041] 120 and the orientation co-occurrence matrices C Θ 125 do not change. The energy histogram H(E) 130 and the energy co-occurrence matrices CE 135 retain their shapes but are spatially shifted by a value of 3C. Centered moments are used as descriptors to remove the translation caused by the DC component C. Although the intensity value is used in this embodiment to detect negativity, other image characteristics may be used.
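  • Because centered moments are computed about the mean, a constant shift of every energy value (the 3C translation described above) leaves them unchanged, as this short sketch illustrates:

```python
import numpy as np

def centered_moments(values, orders=(2, 3, 4)):
    """Central moments about the mean; a constant shift of every value
    moves the mean by the same constant, so the deviations, and hence
    the moments, are unchanged."""
    mean = values.mean()
    return tuple(np.mean((values - mean)**k) for k in orders)
```

This is why centered moments make suitable descriptors for the shifted energy distribution: the translation by 3C drops out entirely.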
  • FIG. 3A and FIG. 3B illustrate a multi-band image along with the corresponding orientation map and energy map associated with the multi-band image. FIG. 3A and FIG. 3B also illustrate the corresponding histograms (0 Degrees to 181 Degrees) and matrices, in addition to the corresponding image descriptors, associated with the multi-band image. The multi-band image has an associated image descriptor which describes it, each image descriptor comprising a peak descriptor (PD), a statistic descriptor (SD[0042]1), a co-occurrence descriptor (COD1), a statistic descriptor (SD2), and a co-occurrence descriptor (COD2). Each image descriptor describes a particular image in accordance with the attributes associated with that particular image.
  • In FIG. 3A and FIG. 3B, a representative orientation histogram H(θ) [0043] 305 and other related information (histograms/matrices) are illustrated for each information band 310 (e.g., RGB color bands) of the multi-band images 300. Likewise, the corresponding image descriptor 320 provided for each information band 310 of the multi-band image 300 contains data corresponding to each representative orientation histogram H(θ) 305 and the other related information associated with that information band 310. The representative orientation histograms H(θ) 305 corresponding to the multi-band images 300 of FIG. 3A exhibit large peaks 315 along different coordinates of the corresponding orientation histograms H(θ) 305. These large peaks 315 are likewise represented in the corresponding image descriptors 320 associated with each information band 310 as large-peak representative data. Accordingly, the image descriptors 320 which contain large-peak representative data are classified into the large peak category. For instance, the orientation histogram H(θ) 305 associated with the (B) information band (e.g., blue color band) of FIG. 3A and FIG. 3B exhibits a series of large peaks 315 at 2 Degrees, 90 Degrees, and 179 Degrees; the series of large peaks 315 is likewise reflected in the corresponding image descriptor 320 at 2 Degrees, 90 Degrees, and 179 Degrees. Likewise, the orientation histogram H(θ) 305 associated with the (G) information band (e.g., green color band) exhibits a series of large peaks 315 at 1 Degree, 89 Degrees, and 179 Degrees, which are likewise reflected in the corresponding image descriptor 320. Further, the orientation histogram H(θ) 305 associated with the (R) information band (e.g., red color band) exhibits a series of large peaks 315 at 1 Degree, 90 Degrees, and 179 Degrees, which are likewise reflected in the corresponding image descriptor 320. Although the large peak is used as a category in this example, it is understood that a variety of different category types could be derived or instituted depending upon a specific or desired implementation of any categorization or image comparison scheme.
  • FIG. 4 illustrates an embodiment of an exemplary computer system that can be used with the present invention. The various components shown in FIG. 4 are provided by way of example. Certain components of the computer in FIG. 4 can be deleted from the addressing system for a particular implementation of the invention. The computer shown in FIG. 4 may be any type of computer including a general purpose computer. [0044]
  • FIG. 4 illustrates a [0045] system bus 400 to which various components are coupled. A processor 402 performs the processing tasks required by the computer. Processor 402 may be any type of processing device capable of implementing the steps necessary to perform the addressing and delivery operations discussed above. An input/output (I/O) device 404 is coupled to bus 400 and provides a mechanism for communicating with other devices coupled to the computer. A read-only memory (ROM) 406 and a random access memory (RAM) 408 are coupled to bus 400 and provide a storage mechanism for various data and information used by the computer. Although ROM 406 and RAM 408 are shown coupled to bus 400, in alternate embodiments, ROM 406 and RAM 408 are coupled directly to processor 402 or coupled to a dedicated memory bus (not shown).
  • A [0046] video display 410 is coupled to bus 400 and displays various information and data to the user of the computer. A disk drive 412 is coupled to bus 400 and provides for the long-term mass storage of information. Disk drive 412 may be used to store various profile data-sets and other data generated by and used by the addressing and delivery system. A keyboard 414 and pointing device 416 are also coupled to bus 400 and provide mechanisms for entering information and commands to the computer. A printer 418 is coupled to bus 400 and is capable of creating a hard copy of information generated by or used by the computer.
  • FIG. 5 illustrates an embodiment of an exemplary computer-[0047] readable medium 500 containing various sets of instructions, code sequences, configuration information, and other data used by a computer or other processing device. The embodiment illustrated in FIG. 5 is suitable, for example, to use with the peak determination method described above. The various information stored on medium 500 is used to perform various data processing operations. Computer-readable medium 500 is also referred to as a processor-readable medium. Computer-readable medium 500 can be any type of magnetic, optical, or electrical storage medium including a diskette, magnetic tape, CD-ROM, memory device, or other storage medium.
  • Computer-[0048] readable medium 500 includes interface code 502 that controls the flow of information between various devices or components in the computer system. Interface code 502 may control the transfer of information within a device (e.g., between the processor and a memory device), or between an input/output port and a storage device. Additionally, interface code 502 may control the transfer of information from one device to another. Computer-readable medium 500 may also include other programs working with one another to produce a result in accordance with the present invention. For example, computer-readable medium 500 may include a program that accepts an original image as input and applies appropriate Gaussian filters to generate Gaussian images, as shown in block 504. A Laplacian image generation program 506 may be responsible for generating Laplacian images by using the Gaussian images of program 504 as its input. Prior to executing steerable filter program 510, the intensity value of the Laplacian image is tested for negativity by program 508. Furthermore, in the process of finding image descriptors, program 512 may be executed for peak determination.
  • From the above description and drawings, it will be understood by those of ordinary skill in the art that the particular embodiments shown and described are for purposes of illustration only and are not intended to limit the scope of the invention. Those of ordinary skill in the art will recognize that the invention may be embodied in other specific forms without departing from its spirit or essential characteristics. References to details of particular embodiments are not intended to limit the scope of the claims. [0049]

Claims (2)

1. A method of applying steerable filter to Laplacian images of a steerable pyramid, comprising:
getting a Laplacian image from corresponding Gaussian images in a steerable pyramid;
verifying the Laplacian image for negative value;
adjusting the Laplacian image to eliminate the negative value;
applying a steerable filter to the adjusted Laplacian image to generate orientation data and energy data; and
removing resulting adjustment.
2-31. Cancelled
US10/693,369 1999-10-07 2003-10-23 Descriptors adjustment when using steerable pyramid to extract features for content based search Abandoned US20040234159A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/693,369 US20040234159A1 (en) 1999-10-07 2003-10-23 Descriptors adjustment when using steerable pyramid to extract features for content based search

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US15833599P 1999-10-07 1999-10-07
US09/440,491 US6674915B1 (en) 1999-10-07 1999-11-15 Descriptors adjustment when using steerable pyramid to extract features for content based search
US10/693,369 US20040234159A1 (en) 1999-10-07 2003-10-23 Descriptors adjustment when using steerable pyramid to extract features for content based search

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/440,491 Continuation US6674915B1 (en) 1999-10-07 1999-11-15 Descriptors adjustment when using steerable pyramid to extract features for content based search

Publications (1)

Publication Number Publication Date
US20040234159A1 true US20040234159A1 (en) 2004-11-25

Family

ID=29738900

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/440,491 Expired - Fee Related US6674915B1 (en) 1999-10-07 1999-11-15 Descriptors adjustment when using steerable pyramid to extract features for content based search
US10/693,369 Abandoned US20040234159A1 (en) 1999-10-07 2003-10-23 Descriptors adjustment when using steerable pyramid to extract features for content based search

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/440,491 Expired - Fee Related US6674915B1 (en) 1999-10-07 1999-11-15 Descriptors adjustment when using steerable pyramid to extract features for content based search

Country Status (1)

Country Link
US (2) US6674915B1 (en)

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070002067A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Magnification of indirection textures
US20070002071A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Parallel texture synthesis by upsampling pixel coordinates
US20070003152A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Multi-level image stack of filtered images
US20070002069A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Parallel texture synthesis having controllable jitter
US20070296730A1 (en) * 2006-06-26 2007-12-27 Microsoft Corporation Texture synthesis using dimensionality-reduced appearance space
US20080001963A1 (en) * 2006-06-30 2008-01-03 Microsoft Corporation Synthesis of advecting texture using adaptive regeneration
US20080001962A1 (en) * 2006-06-30 2008-01-03 Microsoft Corporation Anisometric texture synthesis
US20080317370A1 (en) * 2005-11-08 2008-12-25 Koninklijke Philips Electronics, N.V. Method and System for Filtering Elongated Features
US20100034465A1 (en) * 2008-08-08 2010-02-11 Kabushiki Kaisha Toshiba Method and apparatus for calculating pixel features of image data
US20100080469A1 (en) * 2008-10-01 2010-04-01 Fuji Xerox Co., Ltd. Novel descriptor for image corresponding point matching
US20100246969A1 (en) * 2009-03-25 2010-09-30 Microsoft Corporation Computationally efficient local image descriptors
US7817160B2 (en) 2005-06-30 2010-10-19 Microsoft Corporation Sub-pass correction using neighborhood matching
US8340415B2 (en) 2010-04-05 2012-12-25 Microsoft Corporation Generation of multi-resolution image pyramids
US8542942B2 (en) 2010-12-17 2013-09-24 Sony Corporation Tunable gaussian filters
US8547389B2 (en) 2010-04-05 2013-10-01 Microsoft Corporation Capturing image structure detail from a first image and color from a second image
US8606031B2 (en) 2010-10-18 2013-12-10 Sony Corporation Fast, accurate and efficient gaussian filter
US20150007243A1 (en) * 2012-02-29 2015-01-01 Dolby Laboratories Licensing Corporation Image Metadata Creation for Improved Image Processing and Content Delivery
CN112488039A (en) * 2020-12-15 2021-03-12 哈尔滨市科佳通用机电股份有限公司 Machine learning-based method for detecting falling fault of hook tail frame supporting plate of railway wagon

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1526480A1 (en) * 2000-10-17 2005-04-27 Fuji Photo Film Co., Ltd Apparatus for suppressing noise by adapting filter characteristics to input image signal based on characteristics of input image signal
DE10110275A1 (en) * 2001-03-02 2002-09-19 Daimler Chrysler Ag Method for semi-automatic recognition or classification of random sample patterns in which a pattern is first accessed by a computer system and a suggestion made, with the suggestion accepted or rejected by an operator
US7239750B2 (en) * 2001-12-12 2007-07-03 Sony Corporation System and method for effectively utilizing universal feature detectors
FR2845186B1 (en) * 2002-09-27 2004-11-05 Thomson Licensing Sa METHOD AND DEVICE FOR MEASURING SIMILARITY BETWEEN IMAGES
WO2004079637A1 (en) * 2003-03-07 2004-09-16 Consejo Superior De Investigaciones Científicas Method for the recognition of patterns in images affected by optical degradations and application thereof in the prediction of visual acuity from a patient's ocular aberrometry data
US7388987B2 (en) * 2004-05-28 2008-06-17 Hewlett-Packard Development Company, L.P. Computing dissimilarity measures
US7813552B2 (en) * 2004-09-23 2010-10-12 Mitsubishi Denki Kabushiki Kaisha Methods of representing and analysing images
GB0427737D0 (en) * 2004-12-17 2005-01-19 Univ Cambridge Tech Method of identifying features within a dataset
JP4752719B2 (en) * 2006-10-19 2011-08-17 ソニー株式会社 Image processing apparatus, image acquisition method, and program
JP5229575B2 (en) * 2009-05-08 2013-07-03 ソニー株式会社 Image processing apparatus and method, and program

Citations (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4646250A (en) * 1984-10-18 1987-02-24 International Business Machines Corp. Data entry screen
US4672683A (en) * 1984-03-20 1987-06-09 Olympus Optical Co., Ltd. Image retrieval and registration system
US4716404A (en) * 1983-04-01 1987-12-29 Hitachi, Ltd. Image retrieval method and apparatus using annotations as guidance information
US4829453A (en) * 1987-03-05 1989-05-09 Sharp Kabushiki Kaisha Apparatus for cataloging and retrieving image data
US4850025A (en) * 1985-09-27 1989-07-18 Sony Corporation Character recognition system
US4944023A (en) * 1987-05-19 1990-07-24 Ricoh Company, Ltd. Method of describing image information
US5012334A (en) * 1990-01-29 1991-04-30 Dubner Computer Systems, Inc. Video image bank for storing and retrieving video image sequences
US5093867A (en) * 1987-07-22 1992-03-03 Sony Corporation Candidate article recognition with assignation of reference points and respective relative weights
US5148522A (en) * 1987-03-17 1992-09-15 Kabushiki Kaisha Toshiba Information retrieval apparatus and interface for retrieval of mapping information utilizing hand-drawn retrieval requests
US5179652A (en) * 1989-12-13 1993-01-12 Anthony I. Rozmanith Method and apparatus for storing, transmitting and retrieving graphical and tabular data
US5202828A (en) * 1991-05-15 1993-04-13 Apple Computer, Inc. User interface system having programmable user interface elements
US5220648A (en) * 1989-05-31 1993-06-15 Kabushiki Kaisha Toshiba High-speed search system for image data storage
US5249056A (en) * 1991-07-16 1993-09-28 Sony Corporation Of America Apparatus for generating video signals from film
US5381158A (en) * 1991-07-12 1995-01-10 Kabushiki Kaisha Toshiba Information retrieval apparatus
US5421008A (en) * 1991-11-08 1995-05-30 International Business Machines Corporation System for interactive graphical construction of a data base query and storing of the query object links as an object
US5428727A (en) * 1989-01-27 1995-06-27 Kurosu; Yasuo Method and system for registering and filing image data
US5434966A (en) * 1991-05-31 1995-07-18 Kao Corporation System and method for storing and retrieving three dimensional shapes using two dimensional contrast images
US5462370A (en) * 1993-07-19 1995-10-31 Krupp Polysius Ag Tiltable supporting roller bearing
US5469512A (en) * 1992-09-08 1995-11-21 Sony Corporation Pattern recognition device
US5579471A (en) * 1992-11-09 1996-11-26 International Business Machines Corporation Image query system and method
US5586197A (en) * 1993-09-02 1996-12-17 Canon Kabushiki Kaisha Image searching method and apparatus thereof using color information of an input image
US5621821A (en) * 1992-06-04 1997-04-15 Sony Corporation Apparatus and method for detecting distortions in processed image data
US5687239A (en) * 1993-10-04 1997-11-11 Sony Corporation Audio reproduction apparatus
US5704013A (en) * 1994-09-16 1997-12-30 Sony Corporation Map determination method and apparatus
US5729471A (en) * 1995-03-31 1998-03-17 The Regents Of The University Of California Machine dynamic selection of one video camera/image of a scene from multiple video cameras/images of the scene in accordance with a particular perspective on the scene, an object in the scene, or an event in the scene
US5767893A (en) * 1995-10-11 1998-06-16 International Business Machines Corporation Method and apparatus for content based downloading of video programs
US5793888A (en) * 1994-11-14 1998-08-11 Massachusetts Institute Of Technology Machine learning apparatus and method for image searching
US20030110472A1 (en) * 2001-11-11 2003-06-12 International Business Machines Corporation Method and system for generating program source code of a computer application from an information model
US6625590B1 (en) * 1999-08-10 2003-09-23 International Business Machines Corporation Command line interface for reducing user input in a network management device
US6629313B1 (en) * 2000-06-29 2003-09-30 Microsoft Corporation In-line database access statements without a pre-compiler
US6654953B1 (en) * 1998-10-09 2003-11-25 Microsoft Corporation Extending program languages with source-program attribute tags
US6658625B1 (en) * 1999-04-14 2003-12-02 International Business Machines Corporation Apparatus and method for generic data conversion

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2089393T3 (en) * 1991-03-27 1996-10-01 Canon Kk APPARATUS FOR IMAGE PROCESSING.
US5526446A (en) * 1991-09-24 1996-06-11 Massachusetts Institute Of Technology Noise reduction system
DE69513300T2 (en) * 1994-08-29 2000-03-23 Torsana A/S, Skodsborg DETERMINATION PROCEDURE
US5956427A (en) * 1995-06-15 1999-09-21 California Institute Of Technology DFT encoding of oriented filter responses for rotation invariance and orientation estimation in digitized images
US5633511A (en) * 1995-12-22 1997-05-27 Eastman Kodak Company Automatic tone scale adjustment using image activity measures
US5974159A (en) * 1996-03-29 1999-10-26 Sarnoff Corporation Method and apparatus for assessing the visibility of differences between two image sequences
JPH1125222A (en) * 1997-07-08 1999-01-29 Sharp Corp Method and device for segmenting character
US6256409B1 (en) * 1998-10-19 2001-07-03 Sony Corporation Method for determining a correlation between images using multi-element image descriptors
US6532301B1 (en) * 1999-06-18 2003-03-11 Microsoft Corporation Object recognition with occurrence histograms

Patent Citations (36)

Publication number Priority date Publication date Assignee Title
US4716404A (en) * 1983-04-01 1987-12-29 Hitachi, Ltd. Image retrieval method and apparatus using annotations as guidance information
US4672683A (en) * 1984-03-20 1987-06-09 Olympus Optical Co., Ltd. Image retrieval and registration system
US4646250A (en) * 1984-10-18 1987-02-24 International Business Machines Corp. Data entry screen
US4850025A (en) * 1985-09-27 1989-07-18 Sony Corporation Character recognition system
US4829453A (en) * 1987-03-05 1989-05-09 Sharp Kabushiki Kaisha Apparatus for cataloging and retrieving image data
US5148522A (en) * 1987-03-17 1992-09-15 Kabushiki Kaisha Toshiba Information retrieval apparatus and interface for retrieval of mapping information utilizing hand-drawn retrieval requests
US4944023A (en) * 1987-05-19 1990-07-24 Ricoh Company, Ltd. Method of describing image information
US5093867A (en) * 1987-07-22 1992-03-03 Sony Corporation Candidate article recognition with assignation of reference points and respective relative weights
US5428727A (en) * 1989-01-27 1995-06-27 Kurosu; Yasuo Method and system for registering and filing image data
US5220648A (en) * 1989-05-31 1993-06-15 Kabushiki Kaisha Toshiba High-speed search system for image data storage
US5179652A (en) * 1989-12-13 1993-01-12 Anthony I. Rozmanith Method and apparatus for storing, transmitting and retrieving graphical and tabular data
US5012334A (en) * 1990-01-29 1991-04-30 Dubner Computer Systems, Inc. Video image bank for storing and retrieving video image sequences
US5012334B1 (en) * 1990-01-29 1997-05-13 Grass Valley Group Video image bank for storing and retrieving video image sequences
US5202828A (en) * 1991-05-15 1993-04-13 Apple Computer, Inc. User interface system having programmable user interface elements
US5434966A (en) * 1991-05-31 1995-07-18 Kao Corporation System and method for storing and retrieving three dimensional shapes using two dimensional contrast images
US5381158A (en) * 1991-07-12 1995-01-10 Kabushiki Kaisha Toshiba Information retrieval apparatus
US5469209A (en) * 1991-07-16 1995-11-21 Sony Electronics, Inc. Apparatus for generating video signals representing a photographic image previously recorded in a frame on a photographic film-type medium
US5249056A (en) * 1991-07-16 1993-09-28 Sony Corporation Of America Apparatus for generating video signals from film
US5421008A (en) * 1991-11-08 1995-05-30 International Business Machines Corporation System for interactive graphical construction of a data base query and storing of the query object links as an object
US5621821A (en) * 1992-06-04 1997-04-15 Sony Corporation Apparatus and method for detecting distortions in processed image data
US5469512A (en) * 1992-09-08 1995-11-21 Sony Corporation Pattern recognition device
US5751286A (en) * 1992-11-09 1998-05-12 International Business Machines Corporation Image query system and method
US5579471A (en) * 1992-11-09 1996-11-26 International Business Machines Corporation Image query system and method
US5462370A (en) * 1993-07-19 1995-10-31 Krupp Polysius Ag Tiltable supporting roller bearing
US5586197A (en) * 1993-09-02 1996-12-17 Canon Kabushiki Kaisha Image searching method and apparatus thereof using color information of an input image
US5687239A (en) * 1993-10-04 1997-11-11 Sony Corporation Audio reproduction apparatus
US5704013A (en) * 1994-09-16 1997-12-30 Sony Corporation Map determination method and apparatus
US5793888A (en) * 1994-11-14 1998-08-11 Massachusetts Institute Of Technology Machine learning apparatus and method for image searching
US5745126A (en) * 1995-03-31 1998-04-28 The Regents Of The University Of California Machine synthesis of a virtual video camera/image of a scene from multiple video cameras/images of the scene in accordance with a particular perspective on the scene, an object in the scene, or an event in the scene
US5729471A (en) * 1995-03-31 1998-03-17 The Regents Of The University Of California Machine dynamic selection of one video camera/image of a scene from multiple video cameras/images of the scene in accordance with a particular perspective on the scene, an object in the scene, or an event in the scene
US5767893A (en) * 1995-10-11 1998-06-16 International Business Machines Corporation Method and apparatus for content based downloading of video programs
US6654953B1 (en) * 1998-10-09 2003-11-25 Microsoft Corporation Extending program languages with source-program attribute tags
US6658625B1 (en) * 1999-04-14 2003-12-02 International Business Machines Corporation Apparatus and method for generic data conversion
US6625590B1 (en) * 1999-08-10 2003-09-23 International Business Machines Corporation Command line interface for reducing user input in a network management device
US6629313B1 (en) * 2000-06-29 2003-09-30 Microsoft Corporation In-line database access statements without a pre-compiler
US20030110472A1 (en) * 2001-11-11 2003-06-12 International Business Machines Corporation Method and system for generating program source code of a computer application from an information model

Cited By (28)

Publication number Priority date Publication date Assignee Title
US20070002067A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Magnification of indirection textures
US20070002071A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Parallel texture synthesis by upsampling pixel coordinates
US20070003152A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Multi-level image stack of filtered images
US20070002069A1 (en) * 2005-06-30 2007-01-04 Microsoft Corporation Parallel texture synthesis having controllable jitter
US8068117B2 (en) 2005-06-30 2011-11-29 Microsoft Corporation Parallel texture synthesis by upsampling pixel coordinates
US7817160B2 (en) 2005-06-30 2010-10-19 Microsoft Corporation Sub-pass correction using neighborhood matching
US7400330B2 (en) 2005-06-30 2008-07-15 Microsoft Corporation Magnification of indirection textures
US7477794B2 (en) 2005-06-30 2009-01-13 Microsoft Corporation Multi-level image stack of filtered images
US7567254B2 (en) 2005-06-30 2009-07-28 Microsoft Corporation Parallel texture synthesis having controllable jitter
US20080317370A1 (en) * 2005-11-08 2008-12-25 Koninklijke Philips Electronics, N.V. Method and System for Filtering Elongated Features
US20070296730A1 (en) * 2006-06-26 2007-12-27 Microsoft Corporation Texture synthesis using dimensionality-reduced appearance space
US7817161B2 (en) 2006-06-26 2010-10-19 Microsoft Corporation Texture synthesis using dimensionality-reduced appearance space
US7733350B2 (en) 2006-06-30 2010-06-08 Microsoft Corporation Anisometric texture synthesis
US7643034B2 (en) 2006-06-30 2010-01-05 Microsoft Corporation Synthesis of advecting texture using adaptive regeneration
US20080001962A1 (en) * 2006-06-30 2008-01-03 Microsoft Corporation Anisometric texture synthesis
US20080001963A1 (en) * 2006-06-30 2008-01-03 Microsoft Corporation Synthesis of advecting texture using adaptive regeneration
US8447114B2 (en) * 2008-08-08 2013-05-21 Kabushiki Kaisha Toshiba Method and apparatus for calculating pixel features of image data
US20100034465A1 (en) * 2008-08-08 2010-02-11 Kabushiki Kaisha Toshiba Method and apparatus for calculating pixel features of image data
US20100080469A1 (en) * 2008-10-01 2010-04-01 Fuji Xerox Co., Ltd. Novel descriptor for image corresponding point matching
US8363973B2 (en) * 2008-10-01 2013-01-29 Fuji Xerox Co., Ltd. Descriptor for image corresponding point matching
US20100246969A1 (en) * 2009-03-25 2010-09-30 Microsoft Corporation Computationally efficient local image descriptors
US8340415B2 (en) 2010-04-05 2012-12-25 Microsoft Corporation Generation of multi-resolution image pyramids
US8547389B2 (en) 2010-04-05 2013-10-01 Microsoft Corporation Capturing image structure detail from a first image and color from a second image
US8606031B2 (en) 2010-10-18 2013-12-10 Sony Corporation Fast, accurate and efficient gaussian filter
US8542942B2 (en) 2010-12-17 2013-09-24 Sony Corporation Tunable gaussian filters
US20150007243A1 (en) * 2012-02-29 2015-01-01 Dolby Laboratories Licensing Corporation Image Metadata Creation for Improved Image Processing and Content Delivery
US9819974B2 (en) * 2012-02-29 2017-11-14 Dolby Laboratories Licensing Corporation Image metadata creation for improved image processing and content delivery
CN112488039A (en) * 2020-12-15 2021-03-12 哈尔滨市科佳通用机电股份有限公司 Machine learning-based method for detecting falling fault of hook tail frame supporting plate of railway wagon

Also Published As

Publication number Publication date
US6674915B1 (en) 2004-01-06

Similar Documents

Publication Publication Date Title
US20040234159A1 (en) Descriptors adjustment when using steerable pyramid to extract features for content based search
Kang et al. PCA-based edge-preserving features for hyperspectral image classification
Krig et al. Image pre-processing
US6687416B2 (en) Method for determining a correlation between images using multi-element image descriptors
Wang et al. Integration of soft and hard classifications using extended support vector machines
US8103115B2 (en) Information processing apparatus, method, and program
CN107967482A (en) Icon-based programming method and device
Kotwal et al. A novel approach to quantitative evaluation of hyperspectral image fusion techniques
US20040052414A1 (en) Texture-based colour correction
US9275305B2 (en) Learning device and method, recognition device and method, and program
US20040042656A1 (en) Method and apparatus for determining regions of interest in images and for image transmission
US20100067799A1 (en) Globally invariant radon feature transforms for texture classification
CN111160273A (en) Hyperspectral image space spectrum combined classification method and device
CN111369605A (en) Infrared and visible light image registration method and system based on edge features
Bae et al. Real-time face detection and recognition using hybrid-information extracted from face space and facial features
Satya et al. Stripe noise removal from remote sensing images
US8417038B2 (en) Image processing apparatus, processing method therefor, and non-transitory computer-readable storage medium
Al-Wassai et al. Multisensor images fusion based on feature-level
Lee et al. A taxonomy of color constancy and invariance algorithm
Badura et al. Advanced scale-space, invariant, low detailed feature recognition from images-car brand recognition
CN108764112A (en) A kind of Remote Sensing Target object detecting method and equipment
Rafi et al. Salient object detection employing regional principal color and texture cues
Kunaver et al. Image feature extraction-an overview
Ahmed et al. Blind copy-move forgery detection using SVD and KS test
Isnanto et al. Determination of the optimal threshold value and number of keypoints in scale invariant feature transform-based copy-move forgery detection

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION