WO2007120558A2 - Image classification based on a mixture of elliptical color models - Google Patents

Image classification based on a mixture of elliptical color models

Info

Publication number
WO2007120558A2
WO2007120558A2 (PCT/US2007/008446)
Authority
WO
WIPO (PCT)
Prior art keywords
color
images
models
color models
interest
Prior art date
Application number
PCT/US2007/008446
Other languages
French (fr)
Other versions
WO2007120558A3 (en)
Inventor
Pingshan Li
Original Assignee
Sony Corporation
Sony Electronics Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Corporation, Sony Electronics Inc. filed Critical Sony Corporation
Priority to CN2007800133637A priority Critical patent/CN101421746B/en
Priority to EP07774732.7A priority patent/EP2005364B1/en
Publication of WO2007120558A2 publication Critical patent/WO2007120558A2/en
Publication of WO2007120558A3 publication Critical patent/WO2007120558A3/en


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 1/00 - Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N 1/46 - Colour picture communication systems
    • H04N 1/56 - Processing of colour picture signals
    • H04N 1/60 - Colour correction or control
    • H04N 1/62 - Retouching, i.e. modification of isolated colours only or in isolated picture areas only
    • H04N 1/628 - Memory colours, e.g. skin or sky
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F 16/5838 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using colour
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/90 - Determination of colour characteristics
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/40 - Extraction of image or video features
    • G06V 10/56 - Extraction of image or video features relating to colour

Definitions

  • the present invention relates to the field of imaging. More specifically, the present invention relates to improved image classification.
  • There are a number of ways of comparing images. Furthermore, there are many different implementations of comparing images. One implementation is to search based on the content of the image rather than a keyword.
  • a content based image retrieval system is an image retrieval system which classifies, detects and retrieves images from digital libraries, usually databases, by utilizing the content of the image rather than a text label.
  • Some content based systems retrieve images using a specified shape or object. For example, to find images of a dog, such systems would be provided with a specification of a shape of a dog. However, since dogs come in a variety of shapes and sizes, this is limited to only finding dogs that match the designated shapes.
  • a method of classifying images based on elliptical color models is utilized in a number of applications.
  • One or more color models are generated from a set of images with a region of interest.
  • sets of images are utilized for training.
  • One set of images has regions of interest, and the other set of images is without regions of interest.
  • a maximum difference between the sets is achieved, so that a color model is most representative of the object desired.
  • a collection of images are able to be searched, and images are retrieved based on the probability that the images contain the desired object.
  • a method of classifying images comprises generating one or more color models from one or more first images, selecting one or more optimum color models from the one or more color models, wherein the one or more optimum color models are representative of color in the one or more first images and comparing one or more color distributions from one or more second images with the one or more optimum color models.
  • the one or more color models are elliptical.
  • the one or more color models are generated in Hue, Saturation, Value color space.
  • the method further comprises training the one or more color models utilizing one or more third images with one or more regions of interest and one or more fourth images without regions of interest. Training further comprises maximizing the difference between the one or more third images with one or more regions of interest and the one or more fourth images without regions of interest.
  • the method further comprises retrieving the one or more second images based on similarity to the one or more optimum color models.
  • a smaller distance between the one or more optimum color models and the one or more color distributions results in a higher similarity.
  • a keyword is used to select the one or more first images.
  • the one or more first images are selected by a user.
  • the one or more optimum color models and the one or more color distributions are compared over the Internet.
  • the one or more optimum color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
  • a method of classifying images comprises generating one or more color models from a first set of images with a region of interest, training the one or more color models utilizing a second set of images with one or more regions of interest and a third set of images without regions of interest, comparing the one or more color models with one or more color distributions from a fourth set of images and retrieving one or more images from the fourth set of images based on the comparison between the one or more color models and the one or more color distributions.
  • the one or more color models are elliptical.
  • the one or more color models are generated in Hue, Saturation, Value color space. Training further comprises maximizing the difference between the second set of images with one or more regions of interest and the third set of images without regions of interest.
  • a keyword is used to select the first set of images.
  • the first set of images is selected by a user.
  • the one or more color models and the one or more color distributions are compared over the Internet.
  • the one or more color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
  • a method of optimizing color models for classifying images comprises generating a color model for each of one or more first images, searching for the color model for maximizing a statistical difference between the one or more first images and one or more second images, updating a color model set by adding the color model for maximizing the statistical difference to the color model set and repeating searching and updating until the statistical difference is maximized.
  • the one or more first images contain one or more regions of interest and the one or more second images are without one or more regions of interest.
  • the color model is elliptical.
  • the color model is generated in Hue, Saturation, Value color space. Optimizing color models is performed on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
  • a system for comparing a plurality of images comprises one or more first images, one or more color models generated from the one or more first images, one or more second images with one or more regions of interest for training the one or more color models, one or more third images without regions of interest for training the one or more color models, one or more fourth images, one or more color distributions generated from the one or more fourth images and a program to compare the one or more color models with the one or more color distributions.
  • the one or more color models are elliptical.
  • the one or more color models are generated in Hue, Saturation, Value color space.
  • the one or more fourth images are retrieved based on similarity of the one or more color models to the one or more color distributions.
  • a smaller distance between the one or more color models and the one or more color distributions results in a higher similarity.
  • a keyword is used to select the one or more first images.
  • the one or more first images are selected by a user.
  • the one or more color models and the one or more color distributions are compared over the Internet.
  • the one or more color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
  • a capture and display device comprises a receiving unit for receiving image data; a display unit coupled to the receiving unit for displaying image data and a program coupled to the receiving unit and the display unit to compare the image data by generating one or more color models from one or more selected images, selecting one or more optimum color models from the one or more color models, wherein the one or more optimum color models are representative of color in the one or more selected images and comparing one or more color distributions from the image data with the one or more optimum color models.
  • the one or more color models are elliptical.
  • the one or more color models are generated in Hue, Saturation, Value color space.
  • the capture and display device is selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
  • FIGS. 1A-C illustrate different representations of the HSV color space.
  • FIG. 1D illustrates an exemplary elliptical color model.
  • FIG. 2 illustrates a flowchart of the procedure of determining a color model set T that maximizes U(T).
  • FIG. 3 illustrates a flowchart of the procedure of comparing images.
  • FIG. 4 illustrates a block diagram of a media storage device with external controller operating according to the present invention.
  • FIG. 5 illustrates a flowchart showing the steps implemented by the controller and the media storage device during processing of a content stream to generate an index database.
  • FIG. 6 illustrates a flowchart showing the steps implemented by the controller and the media storage device during playback of a content stream.
  • FIG. 7 illustrates an exemplary system implementing the method described herein.
  • Color is often used as a characteristic of an object or a region for applications such as object detection, image segmentation and content based retrieval.
  • Many color based image classification algorithms have been developed for skin color detection applications.
  • Color based image classification generally uses color modeling such as Gaussian models and Bayes classifiers.
  • a method of statistical color modeling based on training is described herein.
  • a set of images with a region of interest is used to generate an elliptical color model for each image.
  • two training image sets are used for training the color model.
  • One image set contains images with regions of interest, and the other image set contains images without regions of interest.
  • a set of the optimal color models is chosen from the color model set by maximizing the statistical distance between the two training sets.
  • the Hue, Saturation, Value (HSV) color space is used.
  • the hue is also the color type (such as red, blue or green).
  • Saturation is the vibrancy of the color wherein the range is from 0-100%.
  • the value is also referred to as the brightness of the color which ranges from 0-100%.
  • Figures 1A-C illustrate different representations of the HSV color space.
  • Figure 1A shows the HSV color space in wheel form.
  • Figure 1B shows it as a cylinder
  • Figure 1C shows the color space as a cone.
  • the color value is projected on the HS plane in the polar system.
  • the percentage of pixels within the elliptical model in the whole image is used to estimate the total probability that the image has the desired color.
  • Figure 1D illustrates an exemplary elliptical color model.
  • Color representations 102 of pixels of an image are mapped on a color space as described above. Then using the equations (1-3), an optimal ellipse 100 is established which is used as the color model for later comparison.
  • Both t and i are represented as the percentage of the pixels in the whole image. A percentage is used instead of an absolute number to allow comparison of different sized images.
  • the color matching function between an image I and a color model set T is defined as D(I, T) = Σ_{T_i ∈ T} d(I, T_i).
  • the issue is choosing a representative color model set to classify images.
  • the model selection process begins with a set of images with regions of interest I_a and a set of images without regions of interest I_b.
  • D(I_a, T) for I_a ∈ I_a has mean μ_a and standard deviation σ_a; and D(I_b, T) for I_b ∈ I_b has mean μ_b and standard deviation σ_b.
  • the statistical distance between the sets of images I_a and I_b corresponding to the color model set T is defined as U(T) = (μ_b - μ_a) / (σ_b σ_a).
  • the optimization procedure is to find a color model set T that maximizes the statistical distance U(T).
  • FIG. 2 illustrates a flowchart of the procedure of determining a color model set T that maximizes the statistical distance U(T).
  • the value of set T is set to equal ∅ (the empty set).
  • a color model T_a for each I_a ∈ I_a is generated.
  • the set T is updated by T ← {T_a} ∪ T. Steps 204 and 206 are repeated until the statistical distance U(T) reaches a maximum.
  • a color model that is already in the set T is able to be chosen again. If this happens, a duplicated color model is added to the set T in the step 206.
  • the color matching function of equation (6) is used to evaluate the probability that the image has the object of desired color.
  • the threshold setting for image classification depends on specific applications.
  • the color model matching method works best for the situation where the color in the region of interest has Gaussian or nearly Gaussian distribution, such as skin color, blue sky and green plants.
  • if the region of interest in an image has multiple colors, such as red flowers plus green leaves, the region is segmented into multiple objects and each object is classified separately. Then the classification results are combined for the final output. For example, if an image of a rose is desired, the flower part has a distinct color such as red, the stem and leaves have a distinct color such as green and the rest of the image comprises other colors. To properly determine the color model to search for, the image is broken down into different sections. The flower part is cropped, and the stem and leaves are cropped and put into their own separate images. Each section has its own color model using the equations above. Once the two color models for the rose are established, they are able to be compared with other color models to determine similarity between the images.
  • the color models are able to be used for any application that benefits from such information such as a search engine which searches by comparing the color model or color models with images within a database.
  • images that match the color models are found and displayed in order of similarity.
  • images that have a high concentration of red and green are displayed first while images lacking in those colors are displayed last or not displayed at all.
  • since red is not the only color utilized, a red car should not appear very high on the list because most likely it will be lacking the green from the stem of the rose. Therefore, the accuracy of the search is improved by utilizing multiple color models for each distinctive aspect of an image.
  • FIG. 3 illustrates a flowchart of the method described herein.
  • one or more elliptical color models are generated from a first set of images with a region of interest. For example, a set of rose images are provided with the flower part as the region of interest. Color models focused on the flower part are generated from the set of rose images.
  • the one or more color models are trained utilizing a second set of images with one or more regions of interest and a third set of images without regions of interest.
  • the difference between the second set of images with one or more regions of interest and the third set of images without regions of interest is maximized, so that the best one or more color models is selected.
  • the one or more color models are compared with a fourth set of images in the step 306. In some embodiments, one or more images from the fourth set of images are retrieved based on the comparison with the one or more color models.
  • CBIR (Content-Based Image Retrieval)
  • QBIC (Query By Image Content)
  • CBVIR (Content-Based Visual Information Retrieval)
  • Content-based means that the search uses the contents of the images themselves, rather than relying on metadata such as titles, captions or keywords. CBIR is needed and useful because of the limitations in metadata-based systems in addition to the increased bandwidth and processing power of the Internet. Textual information about images is easily searched using current technology, but requires those descriptions to be input by someone, which is highly burdensome and impractical when dealing with extremely large amounts of data.
  • keyword searches for text have their own drawbacks, such as requiring a user to phrase the search accurately; otherwise, the search could return no results.
  • CBIR systems are implemented in a number of different ways.
  • One example permits a user to make a request, similar to a keyword search, such as "rabbit" and any images of rabbits are retrieved.
  • the search looks for matching colors of an image that has a rabbit.
  • color labels are able to be included in the text-input version of the search such as "white rabbit" to further specify which type of rabbit is desired, since rabbits come in a variety of colors.
  • Other systems search by a sample image being provided by the user. As described above, the search begins with a set of sample images provided. The search then retrieves similar images. The results are returned in a variety of ways, and in some embodiments, they are sorted in ascending order based on the closest match. Another method of returning results only returns those images whose similarity falls within a designated acceptable range.
  • CBIR implementing the method described herein is performed on a local intranet or even on a user's computing device such as a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
  • if a user wants to find all of their baby pictures on the computer, they are able to use the aforementioned technologies and retrieve all pictures that resemble a baby.
  • Another application that utilizes the method described herein is a content recognition system.
  • the content recognition system for indexing occurrences of objects within an audio/video content data stream processes the stream of data to generate a content index database corresponding to the content stream.
  • the content stream is processed by applying recognition technology utilizing the image classification technology described herein to the content within the content stream to identify and index occurrences of identified objects.
  • the content stream is processed as the content stream is stored within a media storage device.
  • the content stream is processed after the content stream is stored within the media storage device.
  • the objects that are included within the index database are identified dynamically by the recognition technology during processing. As the content stream is processed, an entry for each object is generated within the index database.
  • each entry includes an object identifier and corresponding locations of that object.
  • the locations reference where the particular content is stored within the media storage device.
  • once the content index database is generated, it is then able to be used to quickly locate and navigate to specific occurrences of content and objects within the content stream.
  • the objects that are able to be identified and indexed include any identifiable information within a content stream, including shapes, objects, events and movements within video streams.
  • the content index database is stored on the same media storage device as the content stream.
  • the media storage device 400 includes an interface circuit 402 for sending communications to and receiving communications from other devices coupled to the media storage device 400.
  • the interface circuit 402 is coupled to a buffer controller 404.
  • the buffer controller 404 is also coupled to a RAM 406 and to a read/write channel circuit 408.
  • the read/write channel circuit 408 is coupled to media 410 on which data is stored within the media storage device 400.
  • the read/write channel circuit 408 controls the storage operations on the media 410, including reading data from the media 410 and writing data to the media 410.
  • An external controller 420 is coupled to the buffer controller 404 for controlling the processing, classifying and indexing of data streams stored on the media 410.
  • the recognition engine within the controller 420 analyzes the content within the content stream to identify the appropriate objects within the content stream. As described above, the appropriate objects are dynamically identified by the recognition engine during processing. As appropriate objects within the content stream are identified, the occurrence of those identified objects within the content stream is then recorded within an index database. Once the content stream is processed and the index database is generated, the user then has the capability to jump to locations within the content stream where the desired object occurs, for viewing or editing the content stream.
  • A flowchart showing the steps implemented in some embodiments by the controller 420 and the media storage device 400 during processing of a content stream to generate an index database is illustrated in Figure 5.
  • the process starts at the step 500.
  • the objects to be indexed and included in the index database are identified. As described above, this identification is performed manually by the user or dynamically by the recognition technology during processing.
  • the recognition engine or recognition technology is then applied to the content stream to analyze the content stream and determine the occurrence of identified objects within the content stream.
  • at the step 506, it is determined whether the content within the content stream that is currently being analyzed includes an identified object. If the content currently being analyzed does include an identified object, then at the step 508, an entry is generated for the index database, including the object identifier entry within the object category and an entry identifying the corresponding location of the content within the location category. After the generation of the entry for the index database at the step 508, or if it is determined at the step 506 that the content currently being analyzed does not include an identified object, it is then determined at the step 510 whether there is more content within the content stream, or if this is the end of the content stream. If it is determined that the content stream has not yet been fully processed, then the process jumps back to the step 504 to continue processing the content stream. If it is determined at the step 510 that all of the content stream has been processed, then the process ends at the step 512.
  • a flowchart showing the steps implemented in some embodiments by the controller 420 and the media storage device 400 during playback of a content stream that has a corresponding index database is illustrated in Figure 6.
  • a user identifies an object that they would like to locate within the content stream.
  • the entry corresponding to the identified object is located within the index database and the location of the first occurrence of the object is targeted, using the entries from the object category and the location category.
  • the first occurrence of the object is located within the content stream.
  • this occurrence of the object is then played back for the user.
  • the process then jumps to the step 608 to playback this next occurrence. If it is determined at the step 610 that the user does not want the next occurrence of the object located and played back, the process then ends at the step 614.
  • a user records a video of their child's birthday on a tape within a video recorder.
  • This video includes audio and video components.
  • the video is then recorded from the tape to a media storage device 400.
  • the video is processed to generate the index database by applying recognition technology to the video components to determine each occurrence of an identified object within the content stream. As described above, this processing occurs either as the video is recorded on the media storage device 400, if the user's system has the processing capability to perform the processing online, or after the video is stored on the media storage device 400.
  • the video is analyzed to determine each occurrence of an identified object.
  • an entry corresponding to that occurrence is then added to the index database. For example, if the user identifies that they want every occurrence of a birthday cake within the video indexed, the recognition technology is then applied to the video content stream to determine every occurrence of the birthday cake within the video. These occurrences are identified and indexed within the index database, as described above. If the user then wants to view these occurrences or edit the video based on these occurrences, the system will utilize the index database to playback these occurrences of the birthday cake within the video or edit the video based on the occurrences of the birthday cake within the video.
  • a search system is implemented so that a user is able to request a search for something like a birthday cake; the system then searches through the video, and the images/video involving a birthday cake are queued to be viewed.
  • Figure 7 illustrates an exemplary system implementing the method described herein.
  • One or more first images 700 contain the image that is to be compared. In the example, a red rose with a green stem is the desired image. From the one or more first images 700, one or more elliptical color models 702 and 702' are generated. The color model 702 is from the red rose and the color model 702' is from the green stem of the image 700.
  • One or more second images 704 contain one or more regions of interest for training the one or more color models 702.
  • images with a red flower are used to train the color model 702.
  • images with green similar to the flower stem would be used to train the color model 702'.
  • One or more third images 706 do not contain regions of interest such as an image with white clouds and blue water. Such images provide contrast and help train the color models 702 and 702' to select matching images.
  • One or more optimum color models are selected after training. The one or more optimum color models provide the best representation of the one or more first images 700.
  • One or more fourth images 708 are the images to be compared with the one or more first images 700.
  • One or more color distributions 710 and 710' are generated from the one or more fourth images 708. The one or more fourth images are compared based on the similarity of the one or more color distributions 710 and 710' to the one or more color models 702 and 702'.
  • a program is able to compare the images utilizing the color models described above and retrieve similar images.
  • the method of classifying images based on elliptical color models is utilized in a number of applications.
  • One or more color models are generated from a set of images with a region of interest.
  • sets of images are utilized for training.
  • One set of images has regions of interest, and the other set of images is without regions of interest.
  • the image comparison method described herein is able to initially determine a best elliptical color model based on designated images that either have regions of interest or do not.
  • the HSV color space is utilized.
  • other images are compared, wherein the most similar images are selected, retrieved or utilized in a manner specified. For example, if the method is operating within an image search and retrieval system, then the images that most closely fit with the color model are retrieved in order based on similarity.
  • any application that benefits from an improved method of image matching based on color is able to implement the method described herein.
  • another application includes digital cameras with autofocus such that the autofocus focuses on skin color.
  • Other applications include, but are not limited to, art gallery and museum management, architectural image and design, interior design, remote sensing and management of earth resources, geographic information systems, scientific database management, weather forecasting, retailing, fabric and fashion design, trademark and copyright database management, law enforcement and criminal investigation and picture archiving, communication systems and inspection systems including circuit inspection systems.

Abstract

A method of classifying images based on elliptical color models is utilized in a number of applications. One or more color models are generated from a set of images with a region of interest. Then, sets of images are utilized for training. One set of images has regions of interest, and the other set of images is without regions of interest. By utilizing the two sets of images, a maximum difference between the sets is achieved, so that a color model is most representative of the object desired. Then using the optimal color model, a collection of images are able to be searched, and images are retrieved based on the probability that the images contain the desired object.

Description

IMAGE CLASSIFICATION BASED ON A MIXTURE OF ELLIPTICAL COLOR MODELS
Field of the Invention:
The present invention relates to the field of imaging. More specifically, the present invention relates to improved image classification.
Background of the Invention:
There are a number of ways of comparing images. Furthermore, there are many different implementations of comparing images. One implementation is to search based on the content of the image rather than a keyword.
A content based image retrieval system is an image retrieval system which classifies, detects and retrieves images from digital libraries, usually databases, by utilizing the content of the image rather than a text label.
Conventional content based image and video retrieval systems utilize images or video frames which have been supplemented with text such as titles, keywords or captions associated with the images. A user retrieves desired images from an image database, for example, by submitting textual queries to the system using these keywords. Images that match the input keywords are retrieved. However, with larger sets of image data, it becomes impractical to store all of the images with text indexes corresponding to each image. It is also highly burdensome for someone to manually attribute specific titles, keywords and captions to each one. Furthermore, text-based searches have their inherent drawbacks as well.
Some content based systems retrieve images using a specified shape or object. For example, to find images of a dog, such systems would be provided with a specification of a shape of a dog. However, since dogs come in a variety of shapes and sizes, this is limited to only finding dogs that match the designated shapes.
Summary of the Invention:
A method of classifying images based on elliptical color models is utilized in a number of applications. One or more color models are generated from a set of images with a region of interest. Then, sets of images are utilized for training. One set of images has regions of interest, and the other set of images is without regions of interest. By utilizing the two sets of images, a maximum difference between the sets is achieved, so that a color model is most representative of the object desired. Then using the optimal color model, a collection of images are able to be searched, and images are retrieved based on the probability that the images contain the desired object.
In one aspect, a method of classifying images comprises generating one or more color models from one or more first images, selecting one or more optimum color models from the one or more color models, wherein the one or more optimum color models are representative of color in the one or more first images and comparing one or more color distributions from one or more second images with the one or more optimum color models. The one or more color models are elliptical. The one or more color models are generated in Hue, Saturation, Value color space. The method further comprises training the one or more color models utilizing one or more third images with one or more regions of interest and one or more fourth images without regions of interest. Training further comprises maximizing the difference between the one or more third images with one or more regions of interest and the one or more fourth images without regions of interest. The method further comprises retrieving the one or more second images based on similarity to the one or more optimum color models. A smaller distance between the one or more optimum color models and the one or more color distributions results in a higher similarity. A keyword is used to select the one or more first images. The one or more first images are selected by a user. The one or more optimum color models and the one or more color distributions are compared over the Internet. Alternatively, the one or more optimum color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
In another aspect, a method of classifying images comprises generating one or more color models from a first set of images with a region of interest, training the one or more color models utilizing a second set of images with one or more regions of interest and a third set of images without regions of interest, comparing the one or more color models with one or more color distributions from a fourth set of images and retrieving one or more images from the fourth set of images based on the comparison between the one or more color models and the one or more color distributions. The one or more color models are elliptical. The one or more color models are generated in Hue, Saturation, Value color space. Training further comprises maximizing the difference between the second set of images with one or more regions of interest and the third set of images without regions of interest. The smaller the distance between the one or more color models and the one or more color distributions the higher the similarity. A keyword is used to select the first set of images. The first set of images is selected by a user. The one or more color models and the one or more color distributions are compared over the Internet. Alternatively, the one or more color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
In another aspect, a method of optimizing color models for classifying images comprises generating a color model for each of one or more first images, searching for the color model for maximizing a statistical difference between the one or more first images and one or more second images, updating a color model set by adding the color model for maximizing the statistical difference to the color model set and repeating searching and updating until the statistical difference is maximized. The one or more first images contain one or more regions of interest and the one or more second images are without one or more regions of interest. The color model is elliptical. The color model is generated in Hue, Saturation, Value color space. Optimizing color models is performed on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
In yet another aspect, a system for comparing a plurality of images comprises one or more first images, one or more color models generated from the one or more first images, one or more second images with one or more regions of interest for training the one or more color models, one or more third images without regions of interest for training the one or more color models, one or more fourth images, one or more color distributions generated from the one or more fourth images and a program to compare the one or more color models with the one or more color distributions. The one or more color models are elliptical. The one or more color models are generated in Hue, Saturation, Value color space. The one or more fourth images are retrieved based on similarity of the one or more color models to the one or more color distributions. A smaller distance between the one or more color models and the one or more color distributions results in a higher similarity. A keyword is used to select the one or more first images. The one or more first images are selected by a user. The one or more color models and the one or more color distributions are compared over the Internet. Alternatively, the one or more color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system. In another aspect, a capture and display device comprises a receiving unit for receiving image data; a display unit coupled to the receiving unit for displaying image data and a program coupled to the receiving unit and the display unit to compare the image data by generating one or more color models from one or more selected images, selecting one or more optimum color models from the one or more color models, wherein the one or more optimum color models are representative of color in the one or more selected images and comparing one or more color distributions from the image data with the one or more optimum color models. The one or more color models are elliptical. The one or more color models are generated in Hue, Saturation, Value color space. The capture and display device is selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
Brief Description of the Drawings:
FIGS. 1A-C illustrate different representations of the HSV color space.
FIG. 1D illustrates an exemplary elliptical color model. FIG. 2 illustrates a flowchart of the procedure of determining a color model set T that maximizes U(T).
FIG. 3 illustrates a flowchart of the procedure of comparing images.
FIG. 4 illustrates a block diagram of a media storage device with external controller operating according to the present invention. FIG. 5 illustrates a flowchart showing the steps implemented by the controller and the media storage device during processing of a content stream to generate an index database.
FIG. 6 illustrates a flowchart showing the steps implemented by the controller and the media storage device during playback of a content stream.
FIG. 7 illustrates an exemplary system implementing the method described herein.
Detailed Description of the Preferred Embodiment:
Color is often used as a characteristic of an object or a region for applications such as object detection, image segmentation and content based retrieval. Many color based image classification algorithms have been developed for skin color detection applications. Color based image classification generally uses color modeling such as Gaussian models and Bayes classifiers.
A method of statistical color modeling based on training is described herein. A set of images with a region of interest is used to generate an elliptical color model for each image. Then, two training image sets are used for training the color model. One image set contains images with regions of interest, and the other image set contains images without regions of interest. A set of the optimal color models is chosen from the color model set by maximizing the statistical distance between the two training sets.
To generate a color model from a given image, the Hue, Saturation, Value (HSV) color space is used. Within the HSV color space, the hue is also the color type (such as red, blue or green). Generally, the hue ranges from 0 to 360 or 0-100%. Saturation is the vibrancy of the color wherein the range is from 0-100%. The lower the saturation of color, the more grayness present, and the color appears more faded. The value is also referred to as the brightness of the color which ranges from 0-100%. Figures 1A-C illustrate different representations of the HSV color space. Figure 1A shows the HSV color space in wheel form. Figure 1B shows it as a cylinder, and Figure 1C shows the color space as a cone. For each pixel in the region of interest, the color value is projected on the HS plane in the polar system. To determine the elliptical model, the HS plane is converted to the Cartesian system with coordinates x = (x_1, x_2)^T to compute the mean value and covariance matrix. Assuming the selected color has a bivariate normal distribution on the plane with mean value
μ = E(x) = (μ_1, μ_2)^T    (1)
and the covariance matrix
Σ = E[(x - μ)(x - μ)^T].    (2)
The bivariate normal density is constant on the ellipses
(x - μ)^T Σ^(-1) (x - μ) = c^2    (3)
The distribution of the selected color is estimated to be within an ellipse determined by taking c = 1.5 in equation (3). The percentage of pixels within the elliptical model in the whole image is used to estimate the total probability that the image has the desired color.
Figure 1D illustrates an exemplary elliptical color model. Color representations 102 of pixels of an image are mapped on a color space as described above. Then, using equations (1)-(3), an optimal ellipse 100 is established which is used as the color model for later comparison.
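Equations (1)-(3) lend themselves to a short sketch. The following Python/NumPy fragment is a hypothetical illustration, not code from the patent: it assumes the region-of-interest pixels are already available as HSV values (hue in degrees, saturation in [0, 1]), projects them from the polar HS plane into Cartesian coordinates, estimates the mean and covariance, and tests ellipse membership with c = 1.5. The function names are illustrative assumptions.

```python
# Illustrative sketch of the elliptical color model of equations (1)-(3).
import numpy as np

def hs_to_cartesian(hue_deg, saturation):
    """Project HSV values onto the HS plane: polar (hue, saturation) to
    Cartesian x = (x_1, x_2)^T."""
    theta = np.deg2rad(np.asarray(hue_deg, dtype=float))
    s = np.asarray(saturation, dtype=float)
    return np.stack([s * np.cos(theta), s * np.sin(theta)], axis=-1)

def fit_elliptical_model(hue_deg, saturation, c=1.5):
    """Estimate the mean (1) and covariance (2) of the region-of-interest
    pixels and keep what is needed to evaluate the ellipse (3)."""
    x = hs_to_cartesian(hue_deg, saturation)
    mu = x.mean(axis=0)                      # equation (1)
    sigma = np.cov(x, rowvar=False)          # equation (2)
    return {"mu": mu, "sigma_inv": np.linalg.inv(sigma), "c2": c * c}

def in_ellipse(model, hue_deg, saturation):
    """True for pixels with (x - mu)^T Sigma^-1 (x - mu) <= c^2, i.e. inside
    the ellipse of equation (3) with c = 1.5."""
    d = hs_to_cartesian(hue_deg, saturation) - model["mu"]
    return np.einsum("...i,ij,...j->...", d, model["sigma_inv"], d) <= model["c2"]
```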
The following equations are utilized to determine how well a given image matches the color model. For a given image I under test, the distance between I and the color model T is defined by
d(I, T) = t̂ - î    (4)
where t̂ is the amount of pixels in the color elliptical model and î is the amount of pixels of I in the same color model. Both t̂ and î are represented as the percentage of the pixels in the whole image. A percentage is used instead of an absolute number to allow comparison of different sized images. The distance d(I, T) is able to be negative if there are more pixels in I than in the color elliptical model. If the image contains a large amount of desired color pixels, the distance d(I, T) tends to be small, thus it is determined that this image is a similar image. If the image does not contain many of the desired color pixels, the distance d(I, T) is large and thus it is determined that this image is not a similar image. Assuming there is a set of multiple color models
T = {T_1, T_2, ..., T_n}    (5)
the color matching function between an image I and a color model set T is defined as
D(I, T) = Σ_{T_i ∈ T} d(I, T_i).    (6)
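Continuing the same hypothetical sketch (the helper names above are assumptions, and fractions in [0, 1] stand in for the percentages), equations (4) and (6) can be read as:

```python
# Illustrative sketch of the distance (4) and matching function (6).
import numpy as np

def ellipse_fraction(model, hue_deg, saturation):
    """Fraction of an image's pixels that fall inside the elliptical model;
    plays the role of the percentages t-hat and i-hat."""
    inside = in_ellipse(model, hue_deg, saturation)
    return float(np.count_nonzero(inside)) / inside.size

def add_t_hat(model, hue_deg, saturation):
    """Record t-hat: the fraction of the model-generating image's own pixels
    that fall inside its fitted ellipse."""
    model["t_hat"] = ellipse_fraction(model, hue_deg, saturation)
    return model

def distance(image_hs, model):
    """d(I, T) = t-hat - i-hat, equation (4); image_hs = (hue_deg, saturation)."""
    hue_deg, saturation = image_hs
    return model["t_hat"] - ellipse_fraction(model, hue_deg, saturation)

def match(image_hs, model_set):
    """D(I, T): sum of d(I, T_i) over the color model set, equation (6).
    Smaller values indicate a closer match to the desired colors."""
    return sum(distance(image_hs, m) for m in model_set)
```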
The issue is choosing a representative color model set to classify images. The model selection process begins with a set of images with regions of interest I_a and a set of images without regions of interest I_b. For a color model set T, suppose D(I_a, T) for I_a ∈ I_a has mean μ_a and standard deviation σ_a, and D(I_b, T) for I_b ∈ I_b has mean μ_b and standard deviation σ_b. The statistical distance between the sets of images I_a and I_b corresponding to the color model set T is defined as
U(T) = (μ_b - μ_a) / (σ_b σ_a).    (7)
The optimization procedure is to find a color model set T that maximizes the statistical distance U(T).
Figure 2 illustrates a flowchart of the procedure of determining a color model set T that maximizes the statistical distance U(T). In the step 200, the set T is initialized to the empty set, T = ∅. In the step 202, a color model T_a for each I_a ∈ I_a is generated. In the step 204, a color model T_a is searched for that maximizes the statistical distance U(T): T_a = argmax U({T_a} ∪ T). In the step 206, the set T is updated by T ← {T_a} ∪ T. Steps 204 and 206 are repeated until the statistical distance U(T) reaches a maximum. In the step 204, a color model that is already in the set T is able to be chosen again. If this happens, a duplicated color model is added to the set T in the step 206. In the step 208, it is then determined if the statistical distance U(T) has reached a maximum. If it is determined that the statistical distance U(T) has reached a maximum, then the process ends. Otherwise, the process returns to the step 204 to search for a color model that maximizes the statistical distance U(T). If the sets of images I_a and I_b contain large amounts of images, calculating the statistical distance U(T) becomes time consuming. In such a situation, subsets of the sets of images I_a and I_b are able to be chosen for each iteration. When choosing the subsets for each iteration, the method of choosing the subsets is able to be random or ordered.
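Steps 200 through 208 amount to a greedy search. A hedged sketch building on the helpers above might look like the following; the max_models cap and the epsilon guarding against zero variance are added assumptions, not part of the patent.

```python
# Illustrative greedy selection of the color model set T (Figure 2).
import numpy as np

def statistical_distance(model_set, positives, negatives):
    """U(T) = (mu_b - mu_a) / (sigma_b * sigma_a), equation (7).
    positives/negatives are lists of (hue_deg, saturation) image arrays."""
    d_a = np.array([match(img, model_set) for img in positives])
    d_b = np.array([match(img, model_set) for img in negatives])
    return (d_b.mean() - d_a.mean()) / (d_b.std() * d_a.std() + 1e-12)

def select_model_set(candidates, positives, negatives, max_models=10):
    """candidates: one fitted model per positive image (step 202). Start from
    the empty set (step 200) and repeatedly add the candidate model that
    maximizes U(T) (steps 204-206); stop once U(T) no longer increases
    (step 208). Duplicate picks are allowed, as in the patent."""
    model_set, best_u = [], -np.inf
    for _ in range(max_models):
        scored = [(statistical_distance(model_set + [m], positives, negatives), m)
                  for m in candidates]
        u, best = max(scored, key=lambda pair: pair[0])
        if u <= best_u:
            break
        model_set.append(best)
        best_u = u
    return model_set
```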
After the optimal color model set is obtained, the color matching function of equation (6) is used to evaluate the probability that the image has the object of desired color. The threshold setting for image classification depends on specific applications. The color model matching method works best for the situation where the color in the region of interest has Gaussian or nearly Gaussian distribution, such as skin color, blue sky and green plants.
If the region of interest in an image has multiple colors, such as red flowers plus green leaves, the region is segmented into multiple objects and each object is classified separately. Then the classification results are combined for the final output. For example, if an image of a rose is desired, the flower part has a distinct color such as red, the stem and leaves have a distinct color such as green and the rest of the image comprises other colors. To properly determine the color model to search for, the image is broken down into different sections. The flower part is cropped, and the stem and leaves are cropped and put into their own separate images. Each section has its own color model using the equations above. Once the two color models for the rose are established, they are able to be compared with other color models to determine similarity between the images. The color models are able to be used for any application that benefits from such information such as a search engine which searches by comparing the color model or color models with images within a database. Preferably, images that match the color models are found and displayed in order of similarity. For example, using the rose example, images that have a high concentration of red and green are displayed first while images lacking in those colors are displayed last or not displayed at all. Furthermore, since red is not the only color utilized, a red car should not appear very high on the list because most likely it will be lacking the green from the stem of the rose. Therefore, the accuracy of the search is improved by utilizing multiple color models for each distinctive aspect of an image. If needed, more color models are able to be used to provide further accuracy, such as a red rose with a green stem with a blue sky. In that scenario, there are three very distinct colors and most likely a sufficient number of pixels of each color. Hence, three color models are able to be implemented.
Figure 3 illustrates a flowchart of the method described herein. In the step 300, one or more elliptical color models are generated from a first set of images with a region of interest. For example, a set of rose images are provided with the flower part as the region of interest. Color models focused on the flower part are generated from the set of rose images. Then, in the step 302, the one or more color models are trained utilizing a second set of images with one or more regions of interest and a third set of images without regions of interest. In the step 304, the difference between the second set of images with one or more regions of interest and the third set of images without regions of interest is maximized, so that the best one or more color models is selected. Once the best color model is established, the one or more color models are compared with a fourth set of images in the step 306. In some embodiments, one or more images from the fourth set of images are retrieved based on the comparison with the one or more color models.
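To make the rose example concrete, a small assumed usage sketch (placeholder variable names, reusing the helpers above) would build one model per segmented color region and rank a collection by the combined matching function:

```python
# Hypothetical usage: two models (red petals, green stem) scored together.
# rose_petals_hs and rose_stem_hs are (hue_deg, saturation) arrays cropped
# from the segmented regions; library_hs maps image names to such arrays.
petal_model = add_t_hat(fit_elliptical_model(*rose_petals_hs), *rose_petals_hs)
stem_model = add_t_hat(fit_elliptical_model(*rose_stem_hs), *rose_stem_hs)
rose_models = [petal_model, stem_model]

# Smaller D(I, T) means more of both colors, so sort ascending: images rich
# in red and green rank first; a red car ranks lower for lacking the green.
ranking = sorted(library_hs.items(), key=lambda item: match(item[1], rose_models))
```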
One of the applications the method described herein is able to be utilized for is Content-Based Image Retrieval (CBIR) also known as Query By Image Content (QBIC) and Content-Based Visual Information Retrieval (CBVIR). CBIR is the application of computer vision to the image retrieval problem of searching for digital images in large databases.
"Content-based" means that the search uses the contents of the images themselves, rather than relying on metadata such as titles, captions or keywords. CBIR is needed and useful because of the limitations in metadata-based systems in addition to the increased bandwidth and processing power of the Internet. Textual information about images is easily searched using current technology, but requires those descriptions to be input by someone, which is highly burdensome and impractical when dealing with extremely large amounts of data.
Furthermore, keyword searches for text have their own drawbacks, such as requiring a user to phrase the search accurately; otherwise, the search could return no results.
CBIR systems are implemented in a number of different ways. One example permits a user to make a request, similar to a keyword search, such as "rabbit" and any images of rabbits are retrieved. However, unlike a keyword search where the word "rabbit" is searched for, the search looks for matching colors of an image that has a rabbit. Additionally, color labels are able to be included in the text-input version of the search such as "white rabbit" to further specify which type of rabbit is desired, since rabbits come in a variety of colors. Other systems search by a sample image being provided by the user. As described above, the search begins with a set of sample images provided. The search then retrieves similar images. The results are returned in a variety of ways, and in some embodiments, they are sorted in ascending order based on the closest match. Another method of returning results only returns those images whose similarity falls within a designated acceptable range.
Alternatively, instead of the search being across the Internet, CBIR implementing the method described herein is performed on a local intranet or even on a user's computing device such as a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system. For example, if a user wants to find all of their baby pictures on the computer, they are able to use the aforementioned technologies and retrieve all pictures that resemble a baby.
Another application that utilizes the method described herein is a content recognition system. The content recognition system for indexing occurrences of objects within an audio/video content data stream processes the stream of data to generate a content index database corresponding to the content stream. The content stream is processed by applying recognition technology utilizing the image classification technology described herein to the content within the content stream to identify and index occurrences of identified objects. In an embodiment, the content stream is processed as the content stream is stored within a media storage device. Alternatively, the content stream is processed after the content stream is stored within the media storage device. The objects that are included within the index database are identified dynamically by the recognition technology during processing. As the content stream is processed, an entry for each object is generated within the index database.
In some embodiments, each entry includes an object identifier and corresponding locations of that object. The locations reference where the particular content is stored within the media storage device. Once the content index database is generated, it is able to then be used to quickly locate and navigate to specific occurrences of content and objects within the content stream. The objects that are able to be identified and indexed include any identifiable information within a content stream, including shapes, objects, events and movements within video streams. In some embodiments, the content index database is stored on the same media storage device as the content stream.
A media storage device with external controller is illustrated in Figure 4. The media storage device 400 includes an interface circuit 402 for sending communications to and receiving communications from other devices coupled to the media storage device 400. The interface circuit 402 is coupled to a buffer controller 404. The buffer controller 404 is also coupled to a RAM 406 and to a read/write channel circuit 408. The read/write channel circuit 408 is coupled to media 410 on which data is stored within the media storage device 400. The read/write channel circuit 408 controls the storage operations on the media 410, including reading data from the media 410 and writing data to the media 410. An external controller 420 is coupled to the buffer controller 404 for controlling the processing, classifying and indexing of data streams stored on the media 410.
As the stream is processed, the recognition engine within the controller 420 analyzes the content within the content stream to identify the appropriate objects within the content stream. As described above, the appropriate objects are dynamically identified by the recognition engine during processing. As appropriate objects within the content stream are identified, the occurrence of those identified objects within the content stream is then recorded within an index database. Once the content stream is processed and the index database is generated, the user then has the capability to jump to locations within the content stream where the desired object occurs, for viewing or editing the content stream.
A flowchart showing the steps implemented in some embodiments by the controller 420 and the media storage device 400 during processing of a content stream to generate an index database is illustrated in Figure 5. The process starts at the step 500. At the step 502, the objects to be indexed and included in the index database are identified. As described above, this identification is performed manually by the user or dynamically by the recognition technology during processing. At the step 504, the recognition engine or recognition technology is then applied to the content stream to analyze the content stream and determine the occurrence of identified objects within the content stream.
At the step 506, it is determined whether the content within the content stream that is currently being analyzed includes an identified object. If the content currently being analyzed does include an identified object, then at the step 508, an entry is generated for the index database, including the object identifier entry within the object category and an entry identifying the corresponding location of the content within the location category. After the generation of the entry for the index database at the step 508, or if it is determined at the step 506 that the content currently being analyzed does not include an identified object, it is then determined at the step 510 whether there is more content within the content stream, or whether this is the end of the content stream. If it is determined that the content stream has not yet been fully processed, then the process jumps back to the step 504 to continue processing the content stream. If it is determined at the step 510 that all of the content stream has been processed, then the process ends at the step 512.
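As a non-limiting illustration of the Figure 5 loop, the Python sketch below walks a stream, tests each piece of content for identified objects and records an entry for every hit. The `recognize_objects` callable stands in for the recognition engine, the `(location, content)` pairs stand in for the content stream, and the `ContentIndex` class is the sketch given above; none of these names come from the present description.

```python
from typing import Callable, Iterable, Set, Tuple


def build_index(
    content_stream: Iterable[Tuple[int, object]],     # (location, content piece) pairs
    identified_objects: Set[str],                      # step 502: objects to be indexed
    recognize_objects: Callable[[object], Set[str]],   # stand-in recognition engine
    index: "ContentIndex",                             # index sketched earlier
) -> "ContentIndex":
    # Step 504: apply the recognition technology to each piece of content.
    for location, content in content_stream:
        found = recognize_objects(content) & identified_objects
        # Steps 506/508: for every identified object present, generate an entry
        # pairing the object identifier with the location of the content.
        for object_id in found:
            index.record(object_id, location)
    # Steps 510/512: the loop ends once the stream is exhausted.
    return index
```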
A flowchart showing the steps implemented in some embodiments by the controller 420 and the media storage device 400 during playback of a content stream that has a corresponding index database is illustrated in Figure 6. The process starts at the step 600. At the step 602, a user identifies an object that they would like to locate within the content stream. At the step 604, the entry corresponding to the identified object is located within the index database and the location of the first occurrence of the object is targeted, using the entries from the object category and the location category. At the step 606, the first occurrence of the object is located within the content stream. At the step 608, this occurrence of the object is then played back for the user. At the step 610, it is then determined whether the user wants the next occurrence of the object located and played back. If the user does want the next occurrence of the object located and played back, then the next occurrence of the object is located at the step 612. The process then jumps to the step 608 to play back this next occurrence. If it is determined at the step 610 that the user does not want the next occurrence of the object located and played back, the process then ends at the step 614.
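A corresponding non-limiting sketch of the Figure 6 playback loop follows; the `play_segment` and `wants_next` callables are hypothetical stand-ins for the playback hardware and the user prompt, and `ContentIndex` is again the sketch given earlier.

```python
from typing import Callable


def play_back_occurrences(
    index: "ContentIndex",                   # index sketched earlier
    object_id: str,                          # step 602: object the user wants to locate
    play_segment: Callable[[int], None],     # plays the content stored at a location
    wants_next: Callable[[], bool],          # asks whether the next occurrence is wanted
) -> None:
    # Steps 604/606: look up the entry for the object and target its occurrences.
    locations = index.lookup(object_id)
    for i, location in enumerate(locations):
        play_segment(location)               # step 608: play back this occurrence
        # Step 610: stop unless the user wants the next occurrence (step 612).
        if i + 1 < len(locations) and not wants_next():
            break                            # step 614: end of the process
```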
As an example of the operation of the content recognition system and index database of the present invention, a user records a video of their child's birthday on a tape within a video recorder. This video includes audio and video components. The video is then recorded from the tape to a media storage device 400. Under the control of the controller 420 in conjunction with the media storage device 400, the video is processed to generate the index database by applying recognition technology to the video components to determine each occurrence of an identified object within the content stream. As described above, this processing occurs either as the video is recorded on the media storage device 400, if the user's system has the processing capability to perform the processing online, or after the video is stored on the media storage device 400. During processing, the video is analyzed to determine each occurrence of an identified object. As an occurrence of an identified object is found within the video, an entry corresponding to that occurrence is then added to the index database. For example, if the user identifies that they want every occurrence of a birthday cake within the video indexed, the recognition technology is then applied to the video content stream to determine every occurrence of the birthday cake within the video. These occurrences are identified and indexed within the index database, as described above. If the user then wants to view these occurrences or edit the video based on these occurrences, the system utilizes the index database to play back these occurrences of the birthday cake within the video or to edit the video based on those occurrences.
Alternatively, instead of generating an index database, a search system is implemented so that a user is able to request a search for something like a birthday cake; the system then searches through the video, and the images or video segments involving a birthday cake are queued to be viewed.
Figure 7 illustrates an exemplary system implementing the method described herein. One or more first images 700 contain the image that is to be compared. In the example, a red rose with a green stem is the desired image. From the one or more first images 700, one or more elliptical color models 702 and 702' are generated. The color model 702 is from the red rose and the color model 702' is from the green stem of the image 700. One or more second images 704 contain one or more regions of interest for training the one or more color models 702. Here, images with a red flower are used to train the color model 702. Likewise, images with green similar to the flower stem are used to train the color model 702'. One or more third images 706 do not contain regions of interest, such as an image with white clouds and blue water. Such images provide contrast and help train the color models 702 and 702' to select matching images. One or more optimum color models are selected after training. The one or more optimum color models provide the best representation of the one or more first images 700. One or more fourth images 708 are the images to be compared with the one or more first images 700. One or more color distributions 710 and 710' are generated from the one or more fourth images 708. The one or more fourth images are compared based on the similarity of the one or more color distributions 710 and 710' to the one or more color models 702 and 702'. A program is able to compare the images utilizing the color models described above and retrieve similar images.
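The equations referenced earlier in the specification are not reproduced in this portion of the description, so the following Python sketch is only a minimal, non-limiting illustration of an elliptical color model. It assumes each model is summarized by a mean and covariance of HSV pixels from a region of interest, and it uses the fraction of an image's pixels falling inside the model's ellipse (a Mahalanobis-distance test) as the similarity between that image's color distribution and the model; the function names and the fixed ellipse radius are assumptions of the sketch.

```python
import numpy as np


def fit_elliptical_model(hsv_pixels: np.ndarray):
    # Fit one elliptical color model (mean and inverse covariance) to an
    # N x 3 array of HSV pixels sampled from a region of interest, e.g. the
    # red rose for model 702.  Hue wrap-around is ignored for simplicity.
    mean = hsv_pixels.mean(axis=0)
    cov = np.cov(hsv_pixels, rowvar=False) + 1e-6 * np.eye(3)  # regularized
    return mean, np.linalg.inv(cov)


def similarity_to_model(hsv_pixels: np.ndarray, mean, inv_cov, radius: float = 3.0) -> float:
    # Fraction of an image's pixels whose Mahalanobis distance to the model
    # falls inside the ellipse of the given radius, used here as the similarity
    # between the image's color distribution (710) and the color model (702).
    d = hsv_pixels - mean
    mahal_sq = np.einsum("ij,jk,ik->i", d, inv_cov, d)
    return float(np.mean(mahal_sq <= radius ** 2))
```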
The method of classifying images based on elliptical color models is utilized in a number of applications. One or more color models are generated from a set of images with a region of interest. Then, two further sets of images are utilized for training: one set of images has regions of interest, and the other set of images is without regions of interest. By utilizing the two sets of images with the equations described above, the difference between the sets is maximized, so that the color model is most representative of the desired object. Then, using the optimal color model, a collection of images is gathered, and images are retrieved based on the probability that the images contain the desired object.
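A non-limiting sketch of this training step follows, building on the model and similarity sketches above. It stands in for the maximization described here with a simple greedy loop that keeps adding the candidate model that most widens the gap between the average score on the set with regions of interest and the average score on the set without; the `Model`, `image_score` and `select_models` names, and the stopping rule, are assumptions of the sketch rather than the equations of the specification.

```python
from typing import List, Sequence, Tuple

import numpy as np

Model = Tuple[np.ndarray, np.ndarray]  # (mean, inverse covariance), as sketched above


def image_score(hsv_pixels: np.ndarray, models: Sequence[Model], radius: float = 3.0) -> float:
    # Fraction of pixels that fall inside the ellipse of at least one model.
    if not models:
        return 0.0
    inside = np.zeros(len(hsv_pixels), dtype=bool)
    for mean, inv_cov in models:
        d = hsv_pixels - mean
        inside |= np.einsum("ij,jk,ik->i", d, inv_cov, d) <= radius ** 2
    return float(inside.mean())


def select_models(
    candidates: Sequence[Model],
    positives: Sequence[np.ndarray],   # HSV pixel arrays of images with the region of interest
    negatives: Sequence[np.ndarray],   # HSV pixel arrays of images without it
) -> List[Model]:
    # Greedily keep the candidate that most widens the gap between the mean
    # score on the positive set and the mean score on the negative set; stop
    # once no remaining candidate improves that gap.
    chosen: List[Model] = []
    remaining = list(candidates)
    best_gap = float("-inf")
    while remaining:
        gaps = [
            np.mean([image_score(p, chosen + [c]) for p in positives])
            - np.mean([image_score(n, chosen + [c]) for n in negatives])
            for c in remaining
        ]
        i = int(np.argmax(gaps))
        if gaps[i] <= best_gap:
            break
        best_gap = gaps[i]
        chosen.append(remaining.pop(i))
    return chosen
```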
In operation, the image comparison method described herein initially determines a best elliptical color model based on designated images that either contain regions of interest or do not. In some embodiments, the HSV color space is utilized. Then, using the determined color model, other images are compared, wherein the most similar images are selected, retrieved or otherwise utilized in a manner specified. For example, if the method is operating within an image search and retrieval system, then the images that most closely fit the color model are retrieved in order based on similarity.
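As a worked illustration of that retrieval step, the sketch below ranks a hypothetical collection against the models chosen by the `select_models()` sketch above. The conversion to HSV via `colorsys` and the `top_k` cutoff are assumptions of the sketch, not requirements of the method.

```python
import colorsys
from typing import List, Sequence

import numpy as np


def to_hsv_pixels(rgb_image: np.ndarray) -> np.ndarray:
    # Flatten an H x W x 3 RGB image with values in [0, 1] into N x 3 HSV pixels.
    flat = rgb_image.reshape(-1, 3)
    return np.array([colorsys.rgb_to_hsv(*px) for px in flat])


def retrieve_most_similar(
    images: Sequence[np.ndarray],       # the collection being searched (RGB images)
    models: Sequence["Model"],          # optimum models chosen by select_models()
    top_k: int = 10,
) -> List[int]:
    # Score every image against the optimum color models and return the indices
    # of the top_k images, most similar first.
    scores = [image_score(to_hsv_pixels(img), models) for img in images]
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:top_k]]
```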
Any application that benefits from an improved method of image matching based on color is able to implement the method described herein. In addition to the applications described above, another application includes digital cameras with autofocus, such that the autofocus focuses on skin color. Other applications include, but are not limited to, art gallery and museum management, architectural image and design, interior design, remote sensing and management of earth resources, geographic information systems, scientific database management, weather forecasting, retailing, fabric and fashion design, trademark and copyright database management, law enforcement and criminal investigation, picture archiving, communication systems and inspection systems including circuit inspection systems.
The present invention has been described in terms of specific embodiments incorporating details to facilitate the understanding of principles of construction and operation of the invention. Such reference herein to specific embodiments and details thereof is not intended to limit the scope of the claims appended hereto. It will be readily apparent to one skilled in the art that other various modifications may be made in the embodiment chosen for illustration without departing from the spirit and scope of the invention as defined by the claims.

Claims

What is claimed is:
1. A method of classifying images, comprising: a. generating one or more color models from one or more first images; b. selecting one or more optimum color models from the one or more color models, wherein the optimum color models are representative of color in the one or more first images; and c. comparing one or more color distributions from one or more second images with the one or more optimum color models.
2. The method as claimed in claim 1 wherein the one or more color models are elliptical.
3. The method as claimed in claim 1 wherein the one or more color models are generated in Hue, Saturation, Value color space.
4. The method as claimed in claim 1 further comprising training the one or more color models utilizing one or more third images with one or more regions of interest and one or more fourth images without regions of interest.
5. The method as claimed in claim 4 wherein training further comprises maximizing the difference between the one or more third images with one or more regions of interest and the one or more fourth images without regions of interest.
6. The method as claimed in claim 1 further comprising retrieving the one or more second images based on similarity to the one or more optimum color models.
7. The method as claimed in claim 6 wherein a smaller distance between the one or more optimum color models and the one or more color distributions results in a higher similarity.
8. The method as claimed in claim 1 wherein a keyword is used to select the one or more first images.
9. The method as claimed in claim 1 wherein the one or more first images are selected by a user.
10. The method as claimed in claim 1 wherein the one or more optimum color models and the one or more color distributions are compared over the Internet.
11. The method as claimed in claim 1 wherein the one or more optimum color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
12. A method of classifying images, comprising: a. generating one or more color models from a first set of images with a region of interest; b. training the one or more color models utilizing a second set of images with one or more regions of interest and a third set of images without regions of interest; c. comparing the one or more color models with one or more color distributions from a fourth set of images; and d. retrieving one or more images from the fourth set of images based on the comparison between the one or more color models and the one or more color distributions.
13. The method as claimed in claim 12 wherein the one or more color models are elliptical.
14. The method as claimed in claim 12 wherein the one or more color models are generated in Hue, Saturation, Value color space.
15. The method as claimed in claim 12 wherein training further comprises maximizing the difference between the second set of images with one or more regions of interest and the third set of images without regions of interest.
16. The method as claimed in claim 12 wherein the smaller the distance between the one or more color models and the one or more color distributions the higher the similarity.
17. The method as claimed in claim 12 wherein a keyword is used to select the first set of images.
18. The method as claimed in claim 12 wherein the first set of images is selected by a user.
19. The method as claimed in claim 12 wherein the one or more color models and the one or more color distributions are compared over the Internet.
20. The method as claimed in claim 12 wherein the one or more color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
21. A method of optimizing color models for classifying images, comprising: a. generating a color model for each of one or more first images; b. searching for the color model for maximizing a statistical distance between the one or more first images and one or more second images; c. updating a color model set by adding the color model for maximizing the statistical distance to the color model set; and d. repeating searching for the color model and updating a color model until the statistical distance is maximized.
22. The method as claimed in claim 21 wherein the one or more first images contain one or more regions of interest and the one or more second images are without one or more regions of interest.
23. The method as claimed in claim 21 wherein the color model is elliptical.
24. The method as claimed in claim 21 wherein the color model is generated in Hue, Saturation, Value color space.
25. The method as claimed in claim 21 wherein optimizing color models is performed on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
26. A system for comparing a plurality of images, comprising: a. one or more first images; b. one or more color models generated from the one or more first images; c. one or more second images with one or more regions of interest for training the one or more color models; d. one or more third images without regions of interest for training the one or more color models; e. one or more fourth images; f. one or more color distributions generated from the one or more fourth images; and g. a program to compare the one or more color models with the one or more color distributions.
27. The system as claimed in claim 26 wherein the one or more color models are elliptical.
28. The system as claimed in claim 26 wherein the one or more color models are generated in Hue, Saturation, Value color space.
29. The system as claimed in claim 26 wherein the one or more fourth images are retrieved based on similarity of the one or more color models to the one or more color distributions.
30. The system as claimed in claim 29 wherein a smaller distance between the one or more color models and the one or more color distributions results in a higher similarity.
31. The system as claimed in claim 26 wherein a keyword is used to select the one or more first images.
32. The system as claimed in claim 26 wherein the one or more first images are selected by a user.
33. The system as claimed in claim 26 wherein the one or more color models and the one or more color distributions are compared over the Internet.
34. The system as claimed in claim 26 wherein the one or more color models and the one or more color distributions are compared on a computing device selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
35. A capture and display device, comprising: a. a receiving unit for receiving image data; b. a display unit coupled to the receiving unit for displaying image data; and c. a program coupled to the receiving unit and the display unit to compare the image data by i. generating one or more color models from one or more selected images; ii. selecting one or more optimum color models from the one or more color models, wherein the one or more optimum color models are representative of color in the one or more selected images; and iii. comparing one or more color distributions from the image data with the one or more optimum color models.
36. The system as claimed in claim 35 wherein the one or more color models are elliptical.
37. The system as claimed in claim 35 wherein the one or more color models are generated in Hue, Saturation, Value color space.
38. The system as claimed in claim 35 wherein the capture and display device is selected from the group consisting of a personal computer, laptop, digital camera, digital camcorder, handheld, iPod® and home entertainment system.
PCT/US2007/008446 2006-04-11 2007-04-02 Image classification based on a mixture of elliptical color models WO2007120558A2 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN2007800133637A CN101421746B (en) 2006-04-11 2007-04-02 Image classification based on a mixture of elliptical color models
EP07774732.7A EP2005364B1 (en) 2006-04-11 2007-04-02 Image classification based on a mixture of elliptical color models

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US11/402,349 US7672508B2 (en) 2006-04-11 2006-04-11 Image classification based on a mixture of elliptical color models
US11/402,349 2006-04-11

Publications (2)

Publication Number Publication Date
WO2007120558A2 true WO2007120558A2 (en) 2007-10-25
WO2007120558A3 WO2007120558A3 (en) 2008-04-03

Family

ID=38574888

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2007/008446 WO2007120558A2 (en) 2006-04-11 2007-04-02 Image classification based on a mixture of elliptical color models

Country Status (4)

Country Link
US (1) US7672508B2 (en)
EP (1) EP2005364B1 (en)
CN (1) CN101421746B (en)
WO (1) WO2007120558A2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102203803A (en) * 2008-08-21 2011-09-28 4Sight成像有限公司 Visual object appearance modelling using image processing

Families Citing this family (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9530050B1 (en) 2007-07-11 2016-12-27 Ricoh Co., Ltd. Document annotation sharing
US8600989B2 (en) 2004-10-01 2013-12-03 Ricoh Co., Ltd. Method and system for image matching in a mixed media environment
US8989431B1 (en) 2007-07-11 2015-03-24 Ricoh Co., Ltd. Ad hoc paper-based networking with mixed media reality
US7812986B2 (en) 2005-08-23 2010-10-12 Ricoh Co. Ltd. System and methods for use of voice mail and email in a mixed media environment
US8949287B2 (en) 2005-08-23 2015-02-03 Ricoh Co., Ltd. Embedding hot spots in imaged documents
US8176054B2 (en) 2007-07-12 2012-05-08 Ricoh Co. Ltd Retrieving electronic documents by converting them to synthetic text
US9171202B2 (en) 2005-08-23 2015-10-27 Ricoh Co., Ltd. Data organization and access for mixed media document system
US9373029B2 (en) 2007-07-11 2016-06-21 Ricoh Co., Ltd. Invisible junction feature recognition for document security or annotation
US8825682B2 (en) 2006-07-31 2014-09-02 Ricoh Co., Ltd. Architecture for mixed media reality retrieval of locations and registration of images
US8510283B2 (en) 2006-07-31 2013-08-13 Ricoh Co., Ltd. Automatic adaption of an image recognition system to image capture devices
US8965145B2 (en) 2006-07-31 2015-02-24 Ricoh Co., Ltd. Mixed media reality recognition using multiple specialized indexes
US8156116B2 (en) 2006-07-31 2012-04-10 Ricoh Co., Ltd Dynamic presentation of targeted information in a mixed media reality recognition system
US7702673B2 (en) 2004-10-01 2010-04-20 Ricoh Co., Ltd. System and methods for creation and use of a mixed media environment
US8856108B2 (en) 2006-07-31 2014-10-07 Ricoh Co., Ltd. Combining results of image retrieval processes
US8838591B2 (en) 2005-08-23 2014-09-16 Ricoh Co., Ltd. Embedding hot spots in electronic documents
US9405751B2 (en) 2005-08-23 2016-08-02 Ricoh Co., Ltd. Database for mixed media document system
US8868555B2 (en) 2006-07-31 2014-10-21 Ricoh Co., Ltd. Computation of a recongnizability score (quality predictor) for image retrieval
US9384619B2 (en) 2006-07-31 2016-07-05 Ricoh Co., Ltd. Searching media content for objects specified using identifiers
US8521737B2 (en) 2004-10-01 2013-08-27 Ricoh Co., Ltd. Method and system for multi-tier image matching in a mixed media environment
US9063952B2 (en) 2006-07-31 2015-06-23 Ricoh Co., Ltd. Mixed media reality recognition with image tracking
US8489987B2 (en) 2006-07-31 2013-07-16 Ricoh Co., Ltd. Monitoring and analyzing creation and usage of visual content using image and hotspot interaction
US8201076B2 (en) 2006-07-31 2012-06-12 Ricoh Co., Ltd. Capturing symbolic information from documents upon printing
US8676810B2 (en) 2006-07-31 2014-03-18 Ricoh Co., Ltd. Multiple index mixed media reality recognition using unequal priority indexes
US9020966B2 (en) 2006-07-31 2015-04-28 Ricoh Co., Ltd. Client device for interacting with a mixed media reality recognition system
US9176984B2 (en) 2006-07-31 2015-11-03 Ricoh Co., Ltd Mixed media reality retrieval of differentially-weighted links
JP5115089B2 (en) * 2007-08-10 2013-01-09 富士通株式会社 Keyword extraction method
US9710491B2 (en) * 2009-11-02 2017-07-18 Microsoft Technology Licensing, Llc Content-based image search
US8433140B2 (en) * 2009-11-02 2013-04-30 Microsoft Corporation Image metadata propagation
US20110106798A1 (en) * 2009-11-02 2011-05-05 Microsoft Corporation Search Result Enhancement Through Image Duplicate Detection
US9058331B2 (en) 2011-07-27 2015-06-16 Ricoh Co., Ltd. Generating a conversation in a social network based on visual search results
US10311096B2 (en) 2012-03-08 2019-06-04 Google Llc Online image analysis
US9659033B2 (en) * 2013-08-19 2017-05-23 Nant Holdings Ip, Llc Metric based recognition, systems and methods
US20150279047A1 (en) * 2014-03-27 2015-10-01 Qualcomm Incorporated Exemplar-based color classification
CN108829815B (en) * 2018-06-12 2022-06-07 四川希氏异构医疗科技有限公司 Medical image screening method

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6400996B1 (en) 1999-02-01 2002-06-04 Steven M. Hoffberg Adaptive pattern recognition based control system and method
JP3653945B2 (en) * 1997-08-29 2005-06-02 ソニー株式会社 Color extraction apparatus and color extraction method
US6463426B1 (en) 1997-10-27 2002-10-08 Massachusetts Institute Of Technology Information search and retrieval system
US6445818B1 (en) 1998-05-28 2002-09-03 Lg Electronics Inc. Automatically determining an optimal content image search algorithm by choosing the algorithm based on color
AUPP400998A0 (en) 1998-06-10 1998-07-02 Canon Kabushiki Kaisha Face detection in digital images
US6633666B2 (en) * 1998-08-28 2003-10-14 Quark, Inc. Process and system for defining and visually depicting colors from the components of arbitrary color models
US6411953B1 (en) 1999-01-25 2002-06-25 Lucent Technologies Inc. Retrieval and matching of color patterns based on a predetermined vocabulary and grammar
US6606623B1 (en) 1999-04-09 2003-08-12 Industrial Technology Research Institute Method and apparatus for content-based image retrieval with learning function
US6763148B1 (en) 2000-11-13 2004-07-13 Visual Key, Inc. Image recognition methods
KR100785002B1 (en) * 2001-07-09 2007-12-11 삼성전자주식회사 Apparatus and method for browsing image data based on color temperature
EP1302865A1 (en) * 2001-10-10 2003-04-16 Mitsubishi Electric Information Technology Centre Europe B.V. Method and apparatus for searching for and retrieving colour images
GB2395781A (en) * 2002-11-29 2004-06-02 Sony Uk Ltd Face detection
JP2004234228A (en) * 2003-01-29 2004-08-19 Seiko Epson Corp Image search device, keyword assignment method in image search device, and program
JP2004361987A (en) * 2003-05-30 2004-12-24 Seiko Epson Corp Image retrieval system, image classification system, image retrieval program, image classification program, image retrieval method, and image classification method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ELI SABER ET AL.: "Automatic image annotation using adaptive color classification", GRAPHICAL MODELS AND IMAGE PROCESSING, vol. 58, 1 March 1996 (1996-03-01), pages 115 - 126
JOHN ZACHARY ET AL.: "Content based image retrieval and information theory: A general approach", JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 1 January 2001 (2001-01-01), pages 840 - 852, XP055161984, Retrieved from the Internet <URL:http://onlinelibrary.wiley.com/doi/10.1002/asi.1138/abstract> doi:10.1002/asi.1138
See also references of EP2005364A4
Y. WU ET AL.: "Discriminant-EM algorithm with application to image retrieval", PROCEEDINGS IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION. CVPR 2000 (CAT. NO. PR00662), vol. 1, 1 January 2000 (2000-01-01), pages 222 - 227, XP055162124, ISBN: 978-0-76-950662-3, doi:10.1109/CVPR.2000.855823

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102203803A (en) * 2008-08-21 2011-09-28 4Sight成像有限公司 Visual object appearance modelling using image processing
CN102203803B (en) * 2008-08-21 2016-11-09 4Sight成像有限公司 Visual object appearance modelling using image processing

Also Published As

Publication number Publication date
EP2005364B1 (en) 2020-04-01
EP2005364A4 (en) 2015-03-11
WO2007120558A3 (en) 2008-04-03
US7672508B2 (en) 2010-03-02
EP2005364A2 (en) 2008-12-24
CN101421746B (en) 2012-09-19
CN101421746A (en) 2009-04-29
US20070236712A1 (en) 2007-10-11

Similar Documents

Publication Publication Date Title
EP2005364B1 (en) Image classification based on a mixture of elliptical color models
US8150170B2 (en) Statistical approach to large-scale image annotation
JP5170961B2 (en) Image processing system, image processing apparatus and method, program, and recording medium
US8107689B2 (en) Apparatus, method and computer program for processing information
TWI246664B (en) Camera meta-data for content categorization
US7124149B2 (en) Method and apparatus for content representation and retrieval in concept model space
US6774917B1 (en) Methods and apparatuses for interactive similarity searching, retrieval, and browsing of video
US6404925B1 (en) Methods and apparatuses for segmenting an audio-visual recording using image similarity searching and audio speaker recognition
US20140093174A1 (en) Systems and methods for image management
US20030123737A1 (en) Perceptual method for browsing, searching, querying and visualizing collections of digital images
US20110317885A1 (en) Automatic and Semi-automatic Image Classification, Annotation and Tagging Through the Use of Image Acquisition Parameters and Metadata
US20070195344A1 (en) System, apparatus, method, program and recording medium for processing image
US20090087122A1 (en) Video abstraction
JP2007206919A (en) Display control device, method, program and storage medium
US9137574B2 (en) Method or system to predict media content preferences
CN102132318A (en) Automatic creation of a scalable relevance ordered representation of an image collection
US20080085053A1 (en) Sampling image records from a collection based on a change metric
US7577684B2 (en) Fast generalized 2-Dimensional heap for Hausdorff and earth mover's distance
Suh et al. Semi-automatic image annotation using event and torso identification
Kamde et al. A survey on web multimedia mining
Khokher et al. Content-based image retrieval: state-of-the-art and challenges
US20070196032A1 (en) Compressible earth mover's distance
Khokher et al. Image retrieval: A state of the art approach for CBIR
Liang et al. Video Retrieval Based on Language and Image Analysis
Kovács et al. Visret–a content based annotation, retrieval and visualization toolchain

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07774732

Country of ref document: EP

Kind code of ref document: A2

WWE Wipo information: entry into national phase

Ref document number: 2007774732

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 200780013363.7

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE