WO2009072128A2 - System and method for representation and comparison of digital images - Google Patents

System and method for representation and comparison of digital images

Info

Publication number
WO2009072128A2
Authority
WO
WIPO (PCT)
Prior art keywords
image
standardized
consolidated
identifiers
derivative
Prior art date
Application number
PCT/IL2008/001582
Other languages
French (fr)
Other versions
WO2009072128A3 (en)
Inventor
Ohad Gilboa
Original Assignee
Vayar Vision Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vayar Vision Ltd.
Publication of WO2009072128A2
Publication of WO2009072128A3

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/40: Analysis of texture
    • G06T 7/41: Analysis of texture based on statistical description of texture
    • G06T 7/44: Analysis of texture based on statistical description of texture using image operators, e.g. filters, edge density metrics or local histograms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50: Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/583: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content

Definitions

  • This invention relates to the field of digital image representation, comparison, search and retrieval.
  • CBIR: Content Based Image Retrieval
  • a method of generating image representation in a standardized format comprising: (a) obtaining data indicative of at least one digital image; (b) performing one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image;
  • processing each of said at least one derivative image including: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value to at least one standardized format representation;
  • a system for generating image representation in a standardized format comprising: a transformation module, being responsive to information indicative of at least one digital image and configured to perform one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; an identifier generating module coupled to said transformation module and configured to process each of said at least one derivative image, the processing comprising: applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value and converting said at least one statistical value to at least one standardized format representation; a consolidating module coupled to said identifier generating module and configured to join standardized format representations corresponding to at least two derivative images into a standardized consolidated expression, wherein said standardized expression is a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
  • a method of generating image representation in a standardized format comprising: (a) providing data indicative of at least one query image;
  • processing each of said at least one derivative image of said at least one query image comprising: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value, to at least one standardized format representation, thereby generating a first group of identifiers representing each of said at least one derivative image of said at least one query image;
  • processing each of said at least one derivative image of said reference image comprising: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value, to at least one standardized format representation, thereby generating a second group of identifiers representing each of said at least one derivative image of said reference image; (f) comparing said first and second groups of identifiers and selecting identifiers according to a predefined similarity degree;
  • a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps of generating image representation in a standardized format, the method comprising: (h) obtaining data indicative of at least one digital image; (i) performing one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; (j) processing each of said at least one derivative image, including: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value to at least one standardized format representation; and (k) joining standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression is a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
  • a computer program product comprising a computer useable medium having computer readable program code embodied therein of generating image representation in a standardized format
  • the computer program product comprising: computer readable program code for causing the computer to obtain data indicative of at least one digital image; computer readable program code for causing the computer to perform one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; computer readable program code for causing the computer to process each of said at least one derivative image, including: computer readable program code for causing the computer to apply one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; computer readable program code for causing the computer to convert said at least one statistical value to at least one standardized format representation; and computer readable program code for causing the computer to join standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression is a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
  • Fig. 1 is a schematic illustration of the system architecture, in accordance with an embodiment of the invention.
  • Fig. 2 is a flowchart showing the operations carried out for creating consolidated expressions, in accordance with an embodiment of the invention
  • Fig. 3 is an example of the processing of a digital image, in accordance with an embodiment of the invention
  • Fig. 4 is a flowchart showing the operations carried out in association with the comparison between digital images, in accordance with an embodiment of the invention
  • Fig. 5 is a flowchart showing the operations carried out in association with directing the comparison process to specific visual features of a digital image, in accordance with an embodiment of the invention
  • Fig. 6 shows an example, demonstrating the process described with reference to Fig. 5, in accordance with an embodiment of the invention.
  • the phrases “for example”, “such as” and variants thereof describing exemplary implementations of the present invention are exemplary in nature and not limiting.
  • Reference in the specification to "one embodiment”, “an embodiment”, “some embodiments”, “another embodiment”, “other embodiments”, “certain embodiment” or variations thereof means that a particular feature, structure or characteristic described in connection with the embodiment(s) is included in at least one embodiment of the invention.
  • the appearance of the phrases “one embodiment”, “an embodiment”, “some embodiments”, “another embodiment”, “other embodiments” or variations thereof does not necessarily refer to the same embodiment(s). It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment.
  • Fig. 1 illustrates a general system architecture 110 in accordance with an embodiment of the invention.
  • Each module in Fig. 1 can be made up of any combination of software, hardware and/or firmware that performs the functions as defined and explained herein.
  • the modules in Figure 1 may be centralized in one location or dispersed over more than one location. In other embodiments of the invention, the system may comprise fewer, more, and/or different modules than those shown in Fig. 1.
  • Some embodiments of the present invention are primarily disclosed as a method and it will be understood by a person of ordinary skill in the art that an apparatus such as a conventional data processor incorporated with a database, software and other appropriate components may be programmed or otherwise designed to facilitate the practice of the method of the invention.
  • Some embodiments of the present invention may use terms such as service, module, tool, technique, system, processor, device, computer, apparatus, element, sub-system, server, engine, etc. (in single or plural form) for performing the operations herein. These terms, as appropriate, refer to any combination of software, hardware and/or firmware configured to perform the operations as defined and explained herein.
  • the module(s) (or counterpart terms specified above) may be specially constructed for the desired purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a program stored in the computer.
  • Such a program may be stored in a readable storage medium, such as, but not limited to, any type of disk including optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), electrically programmable read-only memories (EPROMs), electrically erasable and programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions that are capable of being conveyed, for example via a computer system bus.
  • digital images are processed, wherein the processing comprises performing a plurality of transformations and where each transformation results in a derivative image, each derivative image being an altered version of the original digital image.
  • a transformation may consist of one image processing function but more often consists of a sequence of two or more image processing functions.
  • An image processing function may comprise various visual manipulations done on the original digital image, in which the color pixel values of all or part of the pixels comprising the image is changed, in effect changing the visual properties of the image.
  • the terms "color pixel value" or "color value of pixels" as used herein refer to the values of the pixels with respect to the type of color representation of the image, for example grey level, RGB, etc.
  • the terms "visual feature", "visual aspect" and "visual property" refer to the visual characteristics of a digital image such as color, texture, outline, brightness, etc. It should be noted that, in accordance with certain embodiments, these terms may be used interchangeably.
  • the terms "original image" or "original digital image" as used herein refer to a digital image before it is processed by the system and method of the present invention.
  • derivative image refers to an image which has been derived from the original image by the system and method of the present invention (i.e. a transformed digital image).
  • digital image refers to an original digital image.
  • each derivative image is further processed using statistical functions which are implemented on the color pixel values of the resulting derivative images.
  • the resulting values of the statistical calculations are utilized in order to create one or more identifiers representing the transformations and the statistical values of each derivative image, in a standardized format.
  • the system and method are capable of calculating hundreds of identifiers in a very short time (for example, in some cases more than 200 identifiers in one tenth of a second or less).
  • the various identifiers provide an extremely diverse and heterogeneous representation for each digital image, giving rise to a high resolution fingerprint representing each original digital image.
  • the number of calculated identifiers may vary from a single identifier and up to hundreds of identifiers representing an original digital image.
  • the identifiers are not explicit, semantic descriptors of the content of the digital image, (for example describing an image as being a picture of an elephant or a tree), but rather provide an alternative and extended "vocabulary" for representing visual properties of the digital image.
  • Each identifier can be seen as a "word" representing a specific characteristic of certain digital image with respect to a specific visual property. For example, consider a transformation analyzing the red color in a digital image. The resulting values constructing the identifiers represent the red color characteristics of a transformed image (the red color being the visual property, and the calculated values in the identifiers the specific characteristics of each transformed digital image in respect of the red color property).
  • the use of a plurality of statistical functions on the same derivative image (thereby obtaining more than one identifier per derivative image) provides additional diversity in the representation of each original digital image.
  • identifiers are represented as strings of textual characters; thus, in addition to providing an extended alternative to the original pixel representation of the original digital image, they also enable the use of textual comparison means, including any device, method or software which handles textual strings (for example, Boolean search and textual search engines), for the purpose of image comparison and retrieval.
  • the method and system of the present invention allow comparison between digital images which is based neither on comparison between actual pixels of digital images, nor on textual tags and descriptors of the content of digital image, but is rather based on the comparison of identifiers, which represent visual features of digital images (similar to pixels) but are compared using textual comparison methods and search engines (similar to textual tags and description).
  • the collections of identifiers which represent the visual features of the image are conceptually similar to a collection of words semantically describing the visual aspects of the image.
  • the invention disclosed herein provides a system and method for implementing an image based search engine, which does not require any preliminary knowledge of the processed image. More specifically, according to certain embodiments, all images may be processed uniformly, regardless of the semantic description of their content. All digital images are processed and transformed into a textual representation which is independent of the semantic context of the image. Thus, the system is relieved from extensive pre-processing, including, for example, computational learning, segmentation, classifying images into subgroups according to content, constructing complex graphs representing similarity between images, etc., all of which require extensive computational resources and extensive storage.
  • Reference is now made to Fig. 1, showing a schematic illustration of the system architecture 110, according to an embodiment of the present invention.
  • the system 110 may be configured for processing single images or alternatively it may be configured for processing groups of images.
  • System 110 includes an executor 120 and optionally an associated system database 140.
  • database 140 stores digital images and associated consolidated standardized expressions, as will be explained in further detail below.
  • executor 120 comprises, in accordance with certain embodiments of the invention, a transformation module 122, an identifier generating module 124, a consolidating module 126 and a data management module 128.
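  • For illustration only, a minimal Python sketch (not from the patent; all names and the module wiring are assumptions) of how executor 120 might chain the transformation module 122, the identifier generating module 124 and the consolidating module 126; the "vg"/"dv"/"md" tags anticipate the example identifiers given later in the description.

        import numpy as np

        class Executor:
            # Minimal sketch of executor 120: the transformation module is modeled
            # as a dict of callables, the identifier generating module as a dict of
            # statistical functions, and the consolidating module as a string join.
            def __init__(self, transformations, stat_functions):
                self.transformations = transformations
                self.stat_functions = stat_functions

            def process(self, image):
                identifiers = []
                for t_tag, transform in self.transformations.items():
                    derivative = transform(image)                       # derivative image
                    for s_tag, stat in self.stat_functions.items():
                        value = int(round(float(stat(derivative))))     # statistical value
                        identifiers.append(f"{t_tag}{s_tag}{value}")    # identifier ("codon")
                return " ".join(identifiers)                            # consolidated expression

        # Toy usage; the transformation names and functions are illustrative only.
        executor = Executor(
            transformations={"inv1": lambda img: 255 - img,
                             "thr1": lambda img: (img > 128).astype(np.uint8) * 255},
            stat_functions={"vg": np.mean, "dv": np.std, "md": np.median},
        )
        expression = executor.process(np.random.randint(0, 256, (64, 64), dtype=np.uint8))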
  • the executor comprises a comparing and retrieval module 130 in addition to or instead of data management module 128. It should be noted that in some embodiments, the division of system 110 into the specific modules and the further breakdown of executor 120 into modules as shown in Fig. 1 is exemplary only.
  • system 110 includes fewer, more and/or different modules than shown in Fig. 1.
  • executor 120 includes fewer, more and/or different modules than shown in Fig. 1. In other embodiments of the invention any module of system 110 (or executor 120) may provide less functionality, more functionality and/or different functionality than the functionality provided by the modules illustrated in Fig. 1.
  • each of the modules of system 110 (or of executor 120) may be made up of different combinations of software, hardware and/or firmware capable of performing the functions described and defined herein.
  • System 110 is illustrated in Fig. 1 in the context of a network 100.
  • Network 100 may be any appropriate network.
  • client(s) 112, image database(s) 116, image capturing device(s) 114, web crawler(s) 118 are coupled via network 100 to system 110.
  • image input into system 110 and/or retrieval from system 110 can be performed over a network 100, for example: the Internet, a local area network (LAN), wide area network (WAN), metropolitan area network (MAN) or a combination thereof.
  • the connection to the network may be realized through any suitable connection or communication utility.
  • the connection may be implemented by hardwire or wireless communication means.
  • system 110 may be fully or partially accessed outside of a context of a network, for example with any of client(s) 112, image database(s) 116, image capturing device(s) 114, web crawler(s) 118 directly coupled to system 110.
  • image input into or retrieval from system 110 may be performed via user interface or via a direct connection to system 110, for example via a universal serial bus (USB) connection.
  • a robot having a built in video camera may capture images and send them to the system for processing, comparison and retrieval.
  • the system 110 may be embedded within the robot's mechanical and electronic architecture or alternatively the robot may communicate with a system 110 located in a remote location (e.g. another computer) via network communication means such as wireless Internet connection.
  • the system 110 may comprise a user interface allowing the direct interaction of a user with the system 110.
  • Digital images may be provided to system 110 via a variety of image input devices. Additionally or alternatively, digital images may be retrieved from system 110 to a variety of destinations using a variety of retrieval techniques. For example, as illustrated in Fig. 1 images may be provided to system 110 and/or retrieved from system 110 by one or more users which may interact with the system 110 through one or more clients 112. Clients 112 may be, but are not limited to, personal computers, portable computers, PDAs, cellular phones or the like. Each client 112 may include a user interface and possibly an application for sending and receiving web pages, such as a browser application which may be utilized, inter alia, for inserting digital images to system 110. As further exemplified in Fig.
  • images may be provided to system 110 from other image databases 116 and/or directly from an image capturing device 114 (e.g. camera or cell phone).
  • the system may operate a web crawler 118 for methodically browsing the World Wide Web and retrieving digital images for processing and storing in the system database 140.
  • web crawlers may be facilitated for locating digital images and processing digital images at their remote location (e.g. a remote computer) and associating the digital images with consolidated expressions without retrieving the digital image.
  • although system 110 is illustrated in Fig. 1 as if comprised in a single unit, this is not necessarily always the case; depending on the embodiment, modules in system 110 may be comprised in the same unit, may be connected over a network and/or may be connected through a direct connection.
  • system database 140 may be connected via a network to executor 120 and according to other embodiments it may be connected directly or may be comprised in the same unit. More details on specific modules of executor 120 in various embodiments will now be provided.
  • the transformation module 122 facilitates performing a plurality of transformations for transforming the original digital image into a plurality of derivative images. Each derivative image is therefore the resulting image obtained by implementing a transformation which includes at least one image processing function.
  • Each transformation may include a single image processing function or alternatively a number of image processing functions performed in sequence. According to certain embodiments, the transformation is performed on the entire original digital image, while according to other embodiments the transformation may be performed on a specific area or on a specific object, within the original digital image.
  • the types, the number and the order of implementation of the image processing functions determine the type of the transformation.
  • the resulting visual features of the derivative images depend on the transformation which was performed, together with the visual properties of the original digital image.
  • the sequence of image processing functions comprising a transformation may include different types of image processing functions. Additionally or alternatively, the same type of image processing function may be repeated more than once in a single transformation.
  • Each transformation involves the alteration of the color pixel values of at least part of the pixels of the original digital image.
  • the system and method of the present invention allows the construction, for each original digital image, of a potentially enormous number of different searching queries.
  • different queries may be directed to searching for specific visual features within a digital image.
  • a derivative image highlighting the white color in an image may be used for constructing a query which is directed for finding other images which have similar white color layout.
  • the transformations may be constructed and implemented according to specific visual features which are of interest to the user and a collection of transformations may be assembled in order to direct the system to compare between images according to a predefined set of visual features within a digital image. It should be noted, however, that according to other embodiments, transformations may be constructed, while disregarding the visual features which may be enhanced or reduced in the original digital image.
  • system 110 may allow different levels of user interaction from a completely manual to completely automatic construction of sequences of image processing functions (i.e. transformations).
  • system 110 may provide a predetermined collection of transformations all of which are performed on every digital image.
  • system 110 of the present invention may provide a working environment, possibly including a user interface, which allows a user, via a client 112 or directly, to manually select one or more image processing functions from a plurality of available functions, according to a specific application or to the user's specific needs and construct custom made transformations.
  • system 110 may present to the user or describe the possible effect of an image processing function or combination of functions, on the original image, and in some embodiments the user may select from among combinations of image processing functions, those combinations which are adequate for the user's needs. According to other embodiments, system 110 may provide information connecting between specific transformations to specific applications for which these transformations should be used. Additionally or alternatively, system 110 may provide a user with a list of transformations which are used most often for a specific application or in general.
  • the identifier generating module 124 is responsible for acquiring the color values of the pixels of each derivative image and performing statistical calculations on these values in order to obtain corresponding statistical measurements.
  • the specific visual property (embodied by the color pixel values) of each derivative image reflects the specific number, sequence, and type of image processing function(s) which were implemented in order to obtain the derivative image.
  • the differences in the statistical measurements which are calculated for different derivative images are a mathematical representation of the differences between derivative images resulting from different transformations.
  • the identifier generating module 124 calculates and assigns one or more identifiers (otherwise known as "codons"), which are character-string denotations for each derivative image.
  • the calculated values are converted into identifiers which are a standardized format representation of the calculated values. For example, an identifier of a specific derivative image identifies the type of transformation, a statistical function (e.g. average of the color pixel values of the derivative image) and the corresponding statistical value calculated with that statistical function for the specific derivative image.
  • the calculated values, corresponding to a certain digital image are normalized according to a predefined scale.
  • identifiers may include the values of only one statistical function, wherein a plurality of statistical functions, corresponding to the same derivative image, would be represented by different identifiers (i.e. one derivative image may be represented by a number of identifiers).
  • identifiers may include a plurality of values pertaining to a plurality of statistical functions implemented corresponding to a specific derivative image.
  • the identifier may provide less, more and/or different information. For instance in one example, the identifier may not identify the type of transformation or the type of statistical functions, whereas in another example the identifier may additionally or alternatively identify additional information such as an identification of the original digital image.
  • An identifier can comprise for example, alphanumeric values, binary values or any other type of values known in the art.
  • the format which is used for representing derivative images should be predefined and constant (i.e. standardized).
  • the system may convert identifiers in different formats into a uniform format to enable comparison between digital images represented by identifiers in different formats.
  • identifiers are represented as text wherein, in some embodiments, the representation of identifiers is in alphanumeric textual characters.
  • the identifier generating module 124 may generate identifiers comprising textual annotation for representing the type of the transformation, textual annotation indicating the type of the performed statistical function, where each indication of a statistical function is followed by the resulting value.
  • an identifier may comprise only the actual statistical value(s) where the corresponding statistical function(s), which were implemented to obtain the statistical values, are determined based on the location of the value within the identifier (e.g. average is the first value in the identifier) or where a plurality of identifiers are joined together (e.g. a consolidated standardized expression) based on the location of the value within the joined expression.
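  • For illustration, a minimal sketch of the two identifier layouts described here, together with a normalization of statistical values to a predefined scale; the tags, value ranges and separators are assumptions.

        def normalize(value, lo=0.0, hi=255.0, scale=100):
            # Map a raw statistical value onto a predefined scale (here 0..scale);
            # the source range and target scale are illustrative assumptions.
            return int(round((value - lo) / (hi - lo) * scale))

        def annotated_identifier(transform_tag, stat_tag, scaled_value):
            # Layout 1: transformation tag + statistical-function tag + value,
            # e.g. ("vyr1", "vg", 18) -> "vyr1vg18".
            return f"{transform_tag}{stat_tag}{scaled_value}"

        def positional_identifier(transform_tag, scaled_values):
            # Layout 2: only the values appear; the statistical function each value
            # belongs to is implied by its position (e.g. average first, then
            # standard deviation, then median).
            return transform_tag + "-".join(str(v) for v in scaled_values)

        print(annotated_identifier("vyr1", "vg", 18))        # vyr1vg18
        print(positional_identifier("vyr1", [18, 5, 12]))    # vyr118-5-12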
  • the consolidating module 126 joins all or part of the identifiers, pertaining to a collection of derivative images derived from the same original digital image, into a "consolidated standardized expression" (or a "consolidated expression”).
  • a "consolidated standardized expression” or a "consolidated expression”
  • the term “join” should not be construed as limiting the possibilities of how the identifiers are arranged in the consolidated expression. For example, identifiers may be arranged adjacent to one another, separated by empty spaces or one or more predetermined characters, interleaved with one another, etc.
  • the term “join” also includes any logical or physical arrangement of identifiers corresponding to a digital image, which allows accessing and processing the identifiers.
  • identifiers may be stored on different computers or different databases and be logically associated to a single consolidated expression.
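  • For illustration, a sketch of one possible arrangement: identifiers joined with spaces into a consolidated expression, plus a parser that recovers a mapping from "transformation tag + statistic tag" to value, which is what later makes equivalent identifiers from different images directly comparable; the separator and identifier format are assumptions.

        import re

        def parse_expression(expression):
            # Split a space-separated consolidated expression into a mapping from
            # (transformation tag + statistic tag) to its numeric value, assuming
            # annotated identifiers, e.g. "vyr1vg18 vyr1dv5" -> {"vyr1vg": 18, "vyr1dv": 5}.
            parsed = {}
            for identifier in expression.split():
                match = re.fullmatch(r"([a-z]+\d*[a-z]+)(\d+)", identifier)
                if match:
                    parsed[match.group(1)] = int(match.group(2))
            return parsed

        print(parse_expression("vyr1vg18 vyr1dv5"))   # {'vyr1vg': 18, 'vyr1dv': 5}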
  • a consolidated standardized expression is an expression which represents one or more derivative images derived from a single original image.
  • each consolidated standardized expression is in fact a standardized representation of a single processed, original digital image, in accordance with an embodiment of the invention.
  • a consolidated standardized expression can in some cases represent only a portion (e.g. a specific area) of an original digital image.
  • the number of identifiers constituting a consolidated expression may vary from one application to another.
  • a consolidated expression may be constructed from any number of identifiers pertaining to a certain original image or from any subgroup of a calculated collection of identifiers pertaining to a certain original image.
  • the consolidated expressions originating from different original digital images are utilized for comparing between digital images.
  • the identifiers in a consolidated expression associated with one original image are compared to the identifiers comprised in consolidated expressions associated with other original digital images, and a similarity degree between the images is determined according to the similarity between the identifiers in accordance with predefined criteria.
  • the term “similarity degree” may be used in some embodiments to represent the similarity between identifiers and images, while in other embodiments it may be used to represent the dissimilarity between identifiers and images.
  • a consolidated expression comprising at least one identifier, resulting from a specific transformation, is compared with consolidated expressions comprising equivalent identifiers resulting from the transformation of one or more other digital images.
  • Equivalent identifiers are identifiers resulting from the same transformations performed on different original digital images, comprising values which were calculated by the same one or more statistical functions.
  • the comparison is performed by comparing between equivalent identifiers and calculating the overall similarity between the consolidated expressions.
  • the comparison is performed by a comparison and retrieval module 130.
  • a consolidated expression in a text format is in fact a representation of visual features of an original digital image in a textual format and thus, in accordance with certain embodiments, the comparison and retrieval module 130 may utilize textual search methods and search engines for comparing between visual features of different original digital images based on the similarity between identifiers comprising consolidated expression corresponding to the different original digital images.
  • the consolidated expression pertaining to each original digital image is stored as a text-file.
  • the present invention is not bound by textual format or alphanumeric values, and identifiers and consolidated expressions may be represented by other types of values (e.g. binary) and may be stored in other types of files.
  • comparison between consolidated expressions is done with a predefined level of tolerance.
  • the similarity between equivalent identifiers may be determined according to the following scale: 0-10 percent similarity is scored with 1, 11-20 percent similarity is scored with 2, 21-30 percent similarity is scored with 3, and so on.
  • the digital images are not required to have identical values in order to be determined as similar.
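  • For illustration, a sketch of comparing two consolidated expressions already parsed into such key-to-value mappings (see the parsing sketch above), scoring each pair of equivalent identifiers on the 1-10 scale just described; the percent-similarity measure (absolute difference relative to the value range) is an assumption, not the patent's formula.

        import math

        def bucket_score(percent_similarity):
            # 0-10% -> 1, 11-20% -> 2, ..., 91-100% -> 10, per the scale above.
            return max(1, math.ceil(percent_similarity / 10))

        def percent_similarity(a, b, value_range=100):
            # Assumed measure: 100% minus the absolute difference between two
            # equivalent identifier values, relative to the value range.
            return 100.0 * (1.0 - abs(a - b) / value_range)

        def compare_expressions(query_ids, other_ids):
            # Only equivalent identifiers (same transformation and statistical
            # function, i.e. same key) are compared; the overall similarity is
            # taken here as the average bucket score.
            common = set(query_ids) & set(other_ids)
            if not common:
                return 0.0
            scores = [bucket_score(percent_similarity(query_ids[k], other_ids[k]))
                      for k in common]
            return sum(scores) / len(scores)

        print(compare_expressions({"vyr1vg": 18, "vyr1dv": 5},
                                  {"vyr1vg": 20, "vyr1dv": 40}))   # 8.5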
  • the type, number and order of the image processing functions which comprise each transformation, as well as the number of implemented transformations and the number of statistical functions, are selected so as to provide a wide diversity of resulting derivative images and to ensure that each transformation provides additional information to the information embodied within derivative images produced by other transformations.
  • the number of performed transformations may be limited by the computational power of the system, thus in accordance with certain embodiments, the number of transformations is determined while taking into consideration the computational resources and the requirements of the user (e.g. the required accuracy).
  • a query with a small number (e.g. around 20) of identifiers may suffice for comparison between images, for example a comparison which is based on the general aspects of the original digital image, which can be represented by a small collection of identifiers.
  • even a smaller number of identifiers (e.g. around 2 to 5 identifiers) is adequate for comparison between digital images, as a small number of identifiers can reflect, detect and recognize specific visual aspects which are common to a group of images.
  • a single identifier can be used for comparing between digital images.
  • a single identifier may reflect a specific visual feature in a digital image. Some comparisons between digital images may be based on a single visual feature expressed by a single identifier.
  • the consolidating module 126 constructs a consolidated expression by selecting a certain number of identifiers or specific identifiers from the entire collection of identifiers calculated by the identifier generating module 124. For example, in some cases there is a limit on the number of strings and characters that may be contained in a query (e.g. a 32 string limit in some search engines). In such cases, in order to search for images which are similar to a given original digital image (i.e. query image) system 110 may recommend to the user which identifiers to select from a collection of identifiers corresponding to a certain query image, according to a particular application or prescribed need.
  • system 110 may perform a limited number of transformations in order to construct a consolidated expression matching the query size. According to other embodiments, many transformations are performed and a subset of identifiers pertaining to certain transformations (and calculated using certain statistical functions) is selected in order to construct a consolidated expression matching the query size.
  • the recommended identifiers may comprise identifiers representing general features of the image, for example, the intensity of blue, green and red colors or the intensity of light in the image. Additionally or alternatively, recommended identifiers may be selected using a reference image as described below with reference to Fig. 5. According to another embodiment, the system 110 may recommend the identifiers according to usage of the identifiers in previous queries.
  • a Boolean search query can be constructed by using any type of Boolean operator in order to define the relationship between the identifiers in a consolidated expression.
  • system 110 may allow assigning different identifiers pertaining to certain derivative images of the query image with different weights and thus allow building biased searching profiles which will be defined by the user and stems from the needs of the requested application.
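  • For illustration, a sketch of turning user-weighted identifiers into a Boolean OR query that respects a query-size limit such as the 32-string limit mentioned above; the "term^weight" syntax is borrowed from common text search engines and is an assumption, not part of the disclosure.

        def build_query(weighted_identifiers, max_terms=32):
            # weighted_identifiers maps identifier strings to user-chosen weights
            # (higher weight = visual feature of greater interest).  Keep the
            # highest-weighted identifiers up to the query-size limit and join
            # them with OR; omitted identifiers simply do not appear in the query.
            ranked = sorted(weighted_identifiers.items(), key=lambda kv: kv[1], reverse=True)
            return " OR ".join(f"{ident}^{weight}" for ident, weight in ranked[:max_terms])

        print(build_query({"vyr1vg18": 3.0, "vyr1dv5": 1.0, "vyr2vg77": 0.5}))
        # vyr1vg18^3.0 OR vyr1dv5^1.0 OR vyr2vg77^0.5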
  • different derivative images resulting from different transformations may in some cases represent or emphasize different visual features of an image.
  • the user may select, from the collection of derivative images, those images which depict visual features which are of interest to the user and give them higher weight.
  • other images showing visual features of less interest to the user may be either omitted or may be given lower weights. The search can be thus customized to the interests of the user.
  • system 110 may provide a means to refine query results. For example the system may assign weights connecting between images reflecting their similarity, clustering together images with high similarity. By continuously fine tuning the clustering, the system may improve the speed and efficiency of the search and retrieval. The fine tuning may be done both manually and automatically.
  • the system may implement a suggestion module (not shown) which provides users with recommended queries based on the results and feedback of previous users.
  • the resulting consolidated expression may be stored as a file where each file is associated with the original digital image.
  • both the original digital image and its associated file are stored in a database 140 by the data management module 128.
  • the resulting consolidated expression is used to search database 140 and retrieve images by the image retrieval module 130.
  • the system 110 of the present invention can be implemented as an image based search engine. Accordingly the system may be used for search and retrieval of images which resemble a given query image.
  • a user may input into the system 110, via a client's 112 user interface or via another user interface integrated within or otherwise associated with the system 110, an original digital image to which the user would like to find similar images (i.e. query image).
  • a user may select a digital image from a database associated with the system 110. If the query image is not associated with a consolidated expression (i.e. has not been previously processed by the system) the query image must first be processed by the system in order to produce its own identifiers and consolidated expression.
  • a comparing and retrieval module 130 searches for other similar digital images, based on the similarity between equivalent identifiers comprising the consolidated expression.
  • Fig. 2 illustrates the operations carried out for creating consolidated expressions for digital images, in accordance with an embodiment of the invention.
  • method 200 is performed by executor 120.
  • there may be more, less and/or different steps than illustrated in Fig. 2, and/or steps illustrated as being sequential may be performed in parallel.
  • the process described in accordance with Fig. 2 may be performed with a single digital image and/or a plurality of images.
  • In the first step 210, one or more digital images are provided to the system.
  • Provision of digital images can be made from one or more sources, for example: image databases 116, web crawlers 118, clients 112, image capturing devices 114, video players, video cameras and computers, directly or indirectly connected to the system, as previously specified with reference to Fig. 1 supra.
  • In step 220, at least one transformation is performed on each digital image which is introduced into system 110, resulting in at least one derivative image pertaining to each original digital image.
  • this step is facilitated by an image transformation module 122.
  • Transformation may include different image processing functions, including but not limited to one or more of the following visual manipulations:
    1. Decreasing the number of colors, threshold, separation into various channel systems (e.g. RGB), inversing colors, equalization, changes of brightness, contrast, gamma values, luminance, etc.
    2. Digital signal processing such as FFT; morphological filters (erode, dilate, outline, etc.); tracing of the various transformations.
    3. Merging, logical and mathematical operations on the images.
  • the filters include filters which are defined by a matrix of values that is applied to each pixel and the pixels around it.
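  • For illustration, a numpy-only sketch of a few of the image processing functions listed above (channel separation, color inversion, threshold, and a filter defined by a 3x3 matrix of values applied to each pixel and its neighbours) composed into one transformation; the particular functions, kernel and the "vyr1" tag are illustrative assumptions.

        import numpy as np

        def red_channel(img):
            # Channel separation: keep only the red channel of an RGB image.
            return img[..., 0]

        def invert(img):
            # Inverse colors of an 8-bit image.
            return 255 - img

        def threshold(img, level=128):
            # Binarize: pixels above the level become white, the rest black.
            return np.where(img > level, 255, 0).astype(np.uint8)

        def box_filter(img, kernel=np.full((3, 3), 1.0 / 9.0)):
            # A filter defined by a matrix of values applied to each pixel and its
            # neighbours (here a simple 3x3 averaging kernel), computed by summing
            # shifted, weighted copies of the image.
            padded = np.pad(img.astype(np.float64), 1, mode="edge")
            out = np.zeros(img.shape, dtype=np.float64)
            h, w = img.shape
            for dy in range(3):
                for dx in range(3):
                    out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
            return out.astype(np.uint8)

        def transformation_vyr1(img):
            # One illustrative transformation: a sequence of image processing
            # functions producing a single derivative image.
            return threshold(invert(box_filter(red_channel(img))))

        derivative = transformation_vyr1(
            np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8))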
  • Fig. 3 illustrates an example of the processing of a digital image, in accordance with an embodiment of the invention.
  • the process begins with a given original digital image 310.
  • the original digital image is subjected to a plurality of transformations, each transformation resulting in a different derivative image 320.
  • the transformations often, although not always, involve a degradation of the original image, rendering the result less coherent, clear or understandable to the human eye than the original digital image.
  • step 230 in Fig. 2 is performed by an identifier generating module 124.
  • the processing in step 230 includes the applying of different statistical functions for calculating statistical values.
  • while the statistical functions are not limited to any specific number, type or combination of statistical functions, in the following description the average color value, the standard deviation of the color value, and the median color value of the pixels of each derivative image are used as an example. According to certain embodiments, these three statistical functions are sufficient for obtaining a high resolution comparison between digital images.
  • the entire collection of pixels of each derivative image is used for calculating a set of statistical values.
  • subgroups of pixels from each derivative image are used for calculating statistical values.
  • in some of the derivative images the entire collection of pixels is used for calculating statistical values, and in other derivative images only subgroups of pixels are used for calculating statistical values.
  • each derivative image is assigned an identifier, which is a standardized format for representing the calculated statistical values of a specific derivative image.
  • identifier representing a derivative image corresponding to transformation Vyr1 from Table 1.
  • vg identifies the average color pixel values
  • dv identifies standard deviation of color pixel values
  • md identifies median color pixel values, where all color pixels values correspond to the color pixel values of a specific derivative image.
  • the resulting identifiers representing the transformation which led to that derivative image and the statistical values of the derivative image would be: a first identifier showing the average value: vyr1vg18, a second identifier showing the standard deviation value: vyr1dv5, and a third identifier showing the median value: vyr1md1⁇.
  • step 230 will result in a collection of identifiers, wherein each derivative image is represented by three identifiers, each identifier representing a transformation listed in Table 1 (which resulted in the derivative image) and the statistical values of the color pixel values of the derivative image (in this example the average value, standard deviation value, and median value).
  • identifiers may be used, for example, all three statistical values may be represented in a single identifier.
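  • For illustration, a sketch reproducing the identifier format of this example: the transformation tag (here "vyr1") followed by "vg", "dv" or "md" and the rounded statistic of the derivative image's color pixel values; the pixel data below is synthetic, so the printed values are illustrative rather than those of Table 1.

        import numpy as np

        def identifiers_for(derivative, transform_tag="vyr1"):
            # Build the three identifiers for one derivative image: average ("vg"),
            # standard deviation ("dv") and median ("md") of its color pixel values.
            stats = {"vg": np.mean, "dv": np.std, "md": np.median}
            return [f"{transform_tag}{tag}{int(round(float(fn(derivative))))}"
                    for tag, fn in stats.items()]

        derivative = np.random.randint(0, 40, (64, 64), dtype=np.uint8)   # synthetic
        print(identifiers_for(derivative))   # e.g. ['vyr1vg19', 'vyr1dv11', 'vyr1md19']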
  • each derivative image is obtained by a transformation comprised of a predetermined number, types and order of image processing functions. Transformations may comprise one or more image processing functions and the order of the implementation of the image processing functions may vary from one transformation to another. According to certain embodiments, once the transformations are defined, part or all of the images which are processed by the system undergo the same transformations comprising the same type and sequence of image processing functions. In order to compare between original digital images, identical transformations are typically performed on all the original digital images and identical statistical functions must be implemented on all the resulting derivative images. Some original digital images may undergo more or fewer transformations than other digital images; however, in order to compare between digital images, at least a subset of the transformations (and statistical functions) performed on the compared images is typically identical.
  • In step 240, once each derivative image has been assigned one or more identifiers, the identifiers associated with the same original digital image are arranged into a standardized consolidated expression.
  • step 240 is performed by a consolidating module 126.
  • each identifier is an independent entity representing an independent process; usually, the order of performing the different transformations during step 230 is not important.
  • each identifier is explicitly recognized and associated with its corresponding transformation and statistical function in order to allow the selection and comparison of specific identifiers.
  • the corresponding transformation and statistical functions are not explicitly noted and may be inferred, if required, by other means, for example, according to their position in the consolidated expression.
  • Box 330 in the example shown in Fig. 3 illustrates one embodiment of a consolidated expression where the identifiers have been arranged into a standardized consolidated expression in textual format.
  • the particular arrangement shown in box 330 should not be construed as limiting.
  • the consolidated expression is stored, for example by a data management module 128, in system database 140, for future reference.
  • the consolidated expression can be compared with a consolidated expression corresponding to other digital images, thereby facilitating a comparison between different digital images.
  • the consolidated expression may be used to query a database of consolidated expressions associated with digital images, compare between the consolidated expressions and retrieve similar images.
  • an image retrieval module 130 may facilitate an image based search engine operated by comparing consolidated expressions corresponding to different images.
  • two images can be inputted and processed by system 110 and the resulting consolidated expressions of both digital images can be compared in order to determine a similarity degree between the images.
  • each original digital image is associated with a text-file containing a collection of identifiers (which constitutes the standardized consolidated expression).
  • digital images located at remote locations may be processed and associated with consolidated expressions.
  • the consolidated expression may be stored at the location of the digital image; alternatively or additionally the consolidated expression may be stored at a different location than the digital image and only be associated (e.g. linked) with it logically.
  • both the identifiers and the corresponding consolidated expression are independent entities and can be utilized, processed and manipulated without being associated with the original digital image.
  • a user may hold only the consolidated expression that was produced from a certain digital image, without having the actual original digital image. That user can utilize the consolidated expression for searching and retrieval of images which are similar to the original digital image without having the original digital image.
  • Fig. 4 is a flowchart showing the operations of the system carried out in association with the comparison between digital images, in accordance with an embodiment of the invention.
  • In the first step 410, one or more query images are provided. Images may be obtained from a variety of sources as was previously described with reference to Fig. 1 and Fig. 2.
  • In step 412, the user may decide whether such specific processing is needed. An affirmative answer would allow the specific processing to take place in step 414; otherwise step 420 is followed. A more detailed description of the operation performed in step 414 is specified below with reference to Fig. 5. In some embodiments step 412 is omitted and step 414 is performed automatically without intervention of a user.
  • the system checks, in the system's database, whether the query image is associated with a consolidated expression, for example if the query image is selected from within the system database 140 and thus the digital image is already associated with a consolidated expression. If the answer is "yes", the system retrieves the consolidated expression associated with the query image and turns to searching for similar digital images (step 460), based on the comparison of the consolidated expression of the query image and the consolidated expressions of other images stored in the system database or elsewhere. Otherwise, if the answer is "no", the query image must be processed in order to obtain the consolidated expression of the query image.
  • Steps 220-240 were described above with reference to Fig. 2.
  • a search for similar consolidated expressions associated with digital images is performed 460.
  • any digital image that is processed by system 110 is subjected to a large number and a large variety of transformations in order to create a diverse representation for the original digital images.
  • while the number of identifiers initially created may be around 200, in some cases 20 identifiers or fewer may be selected for the comparison stage in step 460.
  • the initial number of transformations and corresponding identifiers calculated by system 110 may be limited to a smaller number of transformations and/or a smaller number of identifiers.
  • the comparison between query images is performed by comparing between the subsets of equivalent identifiers.
  • Search results, consisting of the retrieved digital images, are presented to the user.
  • the retrieved images can be presented in ascending or descending order of similarity to the query image.
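  • For illustration, a small sketch of presenting retrieved images in descending order of similarity; the file names and scores are hypothetical.

        results = {"boat_042.jpg": 8.5, "plane_017.jpg": 6.0, "boat_108.jpg": 9.2}
        for name, score in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
            print(f"{name}: similarity {score}")   # boat_108.jpg first, plane_017.jpg last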
  • the user may decide whether the obtained results are satisfying or whether improvement of the results is required. If the user is satisfied with the results, the process can be terminated in step 490 or alternatively a new search can be initiated (e.g. by re-executing method 400 with a different query image).
  • additional comparisons can be made against the same image database that was used in the first search, or against any other collection of images selected by the user.
  • Fig. 5 is a flowchart showing the operations carried out in association with directing the comparison process to specific visual features of a digital image, in accordance with an embodiment of the invention.
  • Fig. 5 is an example of an exploded illustration of step 414 of Fig. 4.
  • image processing and comparison is not based on the semantic description of the content of digital images but rather on the visual properties of the digital images.
  • a user may wish to emphasize certain visual features of a given digital image and direct the search to these features while obfuscating or ignoring other features in the image.
  • the images retrieved by the system and method of the present invention may include unwanted images due to visual similarity between the query images and the images stored in the system database. These images may be conceptually different and may not reflect the results desired by the user. For example, consider two images, the first showing a white boat in the middle of the sea and the other showing a white airplane in the middle of the sky.
  • since both images are comprised of a white object on a blue background, they are likely to have similar visual properties and therefore the processing of both images may produce identifiers having similar values.
  • although the two images are indeed similar, some users may prefer to focus the search only on images depicting boats and not airplanes.
  • the system and method of the present invention provides a solution, illustrated with reference to Fig. 5, for improving and optimizing the search results by emphasizing certain visual features of interest within the original digital image and isolating identifiers which represent these visual features, thereby allowing focusing the comparison and search on specific similarity to the selected features.
  • a reference image is created, the reference image being a digital image that represents the visual features desired by the user.
  • Identifiers are created for the reference image and compared with equivalent identifiers of the query image. Identifiers having similar values in the query image and the reference image are selected.
  • the user is allowed to select specific image processing functions which may reflect or emphasize the desired visual features of the original image, for example, edge detection algorithms which emphasize the outline of the objects shown in the image, or other image processing functions which enhance specific colors in the image.
  • the system and method of the present invention allows users to utilize such image processing functions in order to create one or more reference images (i.e. a first type of reference image) tailored to a specific need or a specific application.
  • a reference image is a query image that has been processed and altered using one or more image processing functions specifically selected for achieving a specific visual effect. This facilitates the emphasis of specific visual features within the original digital image and allows obtaining specific identifiers which represent these specific visual features. These identifiers can be later used for directing the search and comparison to these specific visual features, as explained further below.
  • an auxiliary image (i.e. a second type of reference image) can be any digital image showing a desired object (e.g. a white boat) or desired visual feature (e.g. blue background).
  • the auxiliary image is selected by the user.
  • a reference image may be created by marking the boundaries of the desired object (e.g. a white boat) or area within the query image and defining a new image within these boundaries (i.e. a third type of reference image).
  • step 510 is facilitated for selecting a specific area or a specific object within the original digital image.
  • In the next steps 530 and 540, a collection of predefined transformations is performed on both the query image and the reference image which was obtained during the previous step 510.
  • steps 550 and 560 the same statistical calculations are performed on the color pixel values of both images and equivalent identifiers are generated from each transformation, as described above with reference to Fig. 2.
  • the identifiers of both the query image and/or the reference image may already be available (e.g. stored in the system database 140); in this case steps 530-560 are not performed, but rather the relevant information is retrieved.
  • the next step 570 the identifiers of the query image and the reference image are compared and a new collection of identifiers is created.
  • the identifiers in the new collection of identifiers are selected according to a predefined similarity degree between the equivalent identifiers of the two images.
  • the new collection of identifiers represents the visual features which are common to both the query image and the reference image.
  • the desired visual features which are emphasized in the reference image and represented by the identifiers of the reference image are isolated from the identifiers of the query image.
  • additional identifiers are added in step 580 to the new collection of identifiers, created in step 570.
  • the additional identifiers are selected from the identifiers which were calculated for the query image. More specifically, the additional identifiers are a subset of identifiers which reflect the general visual aspects of an image, for example, the degree of blue in an image. Different applications are characterized by different general visual features and these features are represented by specific identifiers. Thus, the additional identifiers are selected according to the specific application. This provides that the search is focused on images with both general similarity to the query image and also specific similarity to the visual features emphasized by the reference image.
  • a consolidated expression corresponding to the reference image is constructed with the new collection of identifiers and the new consolidated expression is used for search and comparison of the digital image as described above.
  • The process described with reference to Fig. 5 is performed by the transformation module 122.
  • Preprocessing as described with reference to Fig. 5 can be conducted on all original digital images before they are stored in the system database 140. This may be done, for example, in order to underline specific visual features in digital images which are important for a specific application and to create a database of consolidated expressions which are adapted for that application.
  • Fig. 6a and Fig. 6b show an example demonstrating the process described with reference to Fig. 5, in accordance with an embodiment of the invention.
  • Fig. 6a is a query image showing two people, a woman and a man, standing next to each other; a user is interested in searching for other images showing similar faces.
  • In such a case, the values of the identifiers of the original image would be influenced by additional elements in the image, such as the color of the background or the flag, and may bring forth search results which differ from the desired results.
  • Image processing functions can therefore be utilized in order to enhance the faces in the image and obfuscate the rest of the image.
  • Steps 530-560 can be performed on both the image in Fig. 6a and the image in Fig. 6b, and a comparison between the resulting identifiers of the two images can be made.
  • Identifiers of the two images which are similar may be selected and used for constructing a new collection of identifiers representing the original image in this specific search. Adding to the new collection other identifiers from the original collection of identifiers of the original image, which represent general features of the original digital image, would direct the search to focus on images with similar faces as well as with similar general visual features.
  • The system of the present invention supports a variety of applications which involve decision making based on visual information and content, e.g. computer vision, robotic vision, medical analysis, etc.
  • Prior art methods and systems for image comparison and retrieval often require intensive computational steps before the actual comparison is performed.
  • Such systems, which are used, inter alia, for character recognition, face recognition, fingerprint recognition and the like, often include a learning step in which the system learns what to look for from prepared examples of similar images and additional information.
  • Other systems are configured for searching for predefined and limited visual aspects in a digital image, for example, capturing motion in alert surveillance systems.
  • The system and method of the present invention, by contrast, can be utilized for a diversity of different applications, while performing similar processing regardless of the intended application.
  • Each field of interest may have specific needs and relevant visual aspects. Therefore, certain transformations which are directed to the requirements of a specific application may be utilized and help to improve the search results for a specific type of image.
  • Alternatively, the same pool of transformations can be executed for all applications, resulting in a collection of identifiers. As explained above, a user can manually select from the pool those identifiers which are most appropriate for a specific application of interest, or the selection can be performed automatically by system 110.
  • Examples of applications include, inter alia, medical imaging applications such as X-ray or MRI, black and white photographs, sketches, graphs, satellite generated images, microscope generated images, engineering research and development in every field that uses visual data, weather forecasting, financial data visualization, surveillance, face recognition, etc.
  • The system and method of the present invention may also be utilized for analysis of captured video of any kind.
  • For example, the footage of a surveillance camera can be continuously analyzed for detecting movement.
  • Since the time required for generating a collection of about 200 identifiers is, in some embodiments, about one tenth of a second, the system can sample a captured image from the video footage every short period of time (e.g. every half second) and compare it to a predefined reference image.
  • Selecting a specific transformation would allow discerning between real movement and random changes in light.
  • The system and method of the present invention may combine identifiers together with conventional semantic tagging. As both types of image identification can be represented in a textual form, they can be utilized together in textual search engines, providing improved textual queries. According to this embodiment, the method and system of the present invention may be used together with conventional searching methods, such as textual comparison of tags or other textual semantic descriptors, thereby utilizing information from multiple sources and providing a powerful image searching and comparison tool. According to certain embodiments, the system and method of the present invention can be utilized for identifying and analyzing the content of a given image. Tags or other more detailed or less detailed descriptions of image content may be associated with different images, possibly stored in a database.
  • The system and methods of the present invention can be used to link visual content from any digital input device to meaningful information pertaining to that visual content. For instance, a user can capture an image of a certain geographical location and send it to system 110 for processing. According to some embodiments, the image comparing and retrieval module 130 may retrieve similar images together with information on the specific geographical location. In another example, involving the analysis of medical images such as MRI or X-ray, system database 140 may include a large number of images representing a variety of medical states, each image associated with a detailed description of the medical status inferred from the MRI image.
  • An MRI image may be inputted into system 110 and similar MRI images may be retrieved together with the medical diagnosis associated with the stored images.
  • Thus, system 110 may be utilized to link the visual aspects of a given medical image (such as an MRI or X-ray image) with the relevant medical knowledge and diagnosis.
  • The system according to the invention may be a suitably programmed computer.
  • The invention contemplates a computer program being readable by a computer for executing the method of the invention.
  • The invention further contemplates a machine-readable memory tangibly embodying a program of instructions executable by the machine for executing the method of the invention. While various embodiments have been shown and described, it will be understood that there is no intent to limit the invention by such disclosure, but rather, it is intended to cover all modifications and alternate constructions falling within the scope of the invention, as defined in the appended claims.
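The following is a minimal sketch of the identifier-selection step of Fig. 5 described above (steps 570-580). It assumes, purely for illustration, that identifiers are held as a mapping from a (transformation, statistic) key to a normalized value; the function names and the tolerance value are hypothetical and are not taken from the patent.

    # Sketch of the Fig. 5 selection step: keep identifiers whose values are close
    # in the query image and the reference image, then add back a few "general"
    # identifiers taken from the query image (step 580). Names are illustrative.

    def select_common_identifiers(query_ids, reference_ids, tolerance=0.1):
        """query_ids / reference_ids map (transformation, statistic) keys to
        normalized values in [0, 1]; keep keys whose values agree within tolerance."""
        selected = {}
        for key, q_value in query_ids.items():
            r_value = reference_ids.get(key)
            if r_value is not None and abs(q_value - r_value) <= tolerance:
                selected[key] = q_value
        return selected

    def add_general_identifiers(selected, query_ids, general_keys):
        """Add application-specific 'general' identifiers of the query image."""
        for key in general_keys:
            if key in query_ids:
                selected.setdefault(key, query_ids[key])
        return selected

    query = {("edge", "mean"): 0.42, ("blue", "mean"): 0.81, ("red", "std"): 0.10}
    reference = {("edge", "mean"): 0.45, ("blue", "mean"): 0.30, ("red", "std"): 0.55}
    common = select_common_identifiers(query, reference, tolerance=0.1)
    expression = add_general_identifiers(common, query, [("blue", "mean")])
    # expression -> {('edge', 'mean'): 0.42, ('blue', 'mean'): 0.81}

The resulting dictionary plays the role of the new collection of identifiers from which the consolidated expression corresponding to the reference image is built.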

Abstract

A system and method for the representation, comparison and retrieval of digital images is disclosed herein. According to certain embodiments, digital images undergo one or more transformations, where each transformation results in a derivative image. Each derivative image is an altered version of the original digital image. In one embodiment, each derivative image is further processed using statistical functions which are implemented on the color pixel values of the resulting derivative images. The resulting values of the statistical calculations are utilized in order to create one or more identifiers representing the transformations and the statistical values of each derivative image, in a standardized format. The resulting identifiers enable the use of textual comparison and search methods and systems for the comparison, search and retrieval of digital images.

Description

System and Method for Representation and Comparison of Digital
Images
CROSS REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of US Provisional Application Number 60/992,912 filed on December 6, 2007 which is hereby incorporated by reference herein.
FIELD OF THE INVENTION
This invention relates to the field of digital image representation, comparison, search and retrieval.
BACKGROUND OF THE INVENTION
With the ever-increasing popularity of personal photography and search engines, and with the plethora of digital image related applications, the research and development of Content Based Image Retrieval (CBIR) constantly provides new challenges. Many CBIR systems and methods perform feature extraction from a digital image as a pre-processing step. A feature is defined to capture a certain visual property of an image, either globally for the entire image, or locally for a small group of pixels. Most commonly used features include those reflecting color, texture, shape and salient points in an image. Once obtained, visual features act as inputs to subsequent image analysis tasks such as image comparison, concept detection, or annotation.
The various methods for visual feature extraction come with their share of advantages and limitations. One of the main challenges in this field is bridging the semantic gap between the content of an image, as perceived by a human beholder (face, people, flower, sea, sky, etc.) and the visual content of an image, which is extracted and represented by the visual features (color, texture, shapes, brightness, etc.).
A comprehensive survey on this subject can be found in:
"Image Retrieval: Ideas, Influences, and Trends of the New Age", ACM Computing Surveys, Vol. 40, No. 2, 2008, Ritendra Datta, Dhiraj Joshi, Jia Li, and James Z. Wang, The Pennsylvania State University. A relevant background for this application is found on pages 24-26 of this survey.
An extension of this background, and an example of the significance of CBIR to image search, is found in the article by Jing and Baluja, from the Google research group:
"VisualRank: Applying PageRank to Large-Scale Image Search",
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 30, No. 11, November 2008.
SUMMARY OF THE INVENTION
According to a first aspect of the present invention there is provided a method of generating image representation in a standardized format, the method comprising: (a) obtaining data indicative of at least one digital image; (b) performing one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image;
(c) processing each of said at least one derivative image, including: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value to at least one standardized format representation;
(d) joining standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image. According to another aspect of the invention there is provided a system for generating images representation in a standardized format, comprising: a transformation module, being responsive to information indicative of at least one digital image and configured to perform one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; an identifier generating module coupled to said transformation module and configured to process each of said at least one derivative images, the processing comprising: applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value and converting said at least one statistical value to at least one standardized format representation; a consolidating module coupled to said identifier generated module and configured to join standardized format representations corresponding to at least two derivative images into a standardized consolidated expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
According to a further aspect of the invention there is provided a method of generating image representation in a standardized format, the method comprising: (a) providing data indicative of at least one query image;
(b) providing data indicative of a reference digital image;
(c) performing one or more transformations on said at least one query image and on said reference image, resulting in at least one derivative image of said at least one query image and at least one derivative image of said reference image;
(d) processing each of said at least one derivative image of said at least one query image, the processing comprising: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value, to at least one standardized format representation, thereby generating a first group of identifiers representing each of said at least one derivative image of said at least one query image;
(e) processing each of said at least one derivative image of said reference image, the processing comprising: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value, to at least one standardized format representation, thereby generating a second group of identifiers representing each of said at least one derivative image of said reference image; (f) comparing said first and second groups of identifiers and selecting identifiers according to a predefined similarity degree;
(g) creating a new group of identifiers comprising the selected identifiers, said new group of identifiers being a consolidated standardized expression representing common visual features to said at least one query image and said reference image, and wherein said consolidated standardized expression is accessible by a computer application for processing said at least one query image.
According to a further aspect of the invention there is provided a program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps of generating image representation in a standardized format, the method comprising: (h) obtaining data indicative of at least one digital image; (i) performing one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; (j) processing each of said at least one derivative image, including: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value to at least one standardized format representation; and (k) joining standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
According to yet a further aspect of the invention there is provided a computer program product comprising a computer useable medium having computer readable program code embodied therein of generating image representation in a standardized format, the computer program product comprising: computer readable program code for causing the computer to obtain data indicative of at least one digital image; computer readable program code for causing the computer to perform one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; computer readable program code for causing the computer to process each of said at least one derivative image, including: computer readable program code for causing the computer to apply one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; computer readable program code for causing the computer to convert said at least one statistical value to at least one standardized format representation; and computer readable program code for causing the computer to join standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to understand the invention and to see how it may be carried out in practice, a preferred embodiment will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:
Fig. 1 is a schematic illustration of the system architecture, in accordance with an embodiment of the invention;
Fig. 2 is a flowchart showing the operations carried out for creating consolidated expressions, in accordance with an embodiment of the invention; Fig. 3 is an example of the processing of a digital image, in accordance with an embodiment of the invention; Fig. 4 is a flowchart showing the operations carried out in association with the comparison between digital images, in accordance with an embodiment of the invention;
Fig. 5 is a flowchart showing the operations carried out in association with directing the comparison process to specific visual features of a digital image, in accordance with an embodiment of the invention;
Fig. 6 shows an example, demonstrating the process described with reference to Fig. 5, in accordance with an embodiment of the invention.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
As used herein, the phrase "for example," "such as" and variants thereof describing exemplary implementations of the present invention are exemplary in nature and not limiting. Reference in the specification to "one embodiment", "an embodiment", "some embodiments", "another embodiment", "other embodiments", "certain embodiment" or variations thereof means that a particular feature, structure or characteristic described in connection with the embodiment(s) is included in at least one embodiment of the invention. Thus the appearance of the phrase "one embodiment", "an embodiment", "some embodiments", "another embodiment", "other embodiments" or variations thereof do not necessarily refer to the same embodiment(s). It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination. While the invention has been shown and described with respect to particular embodiments, it is not thus limited. Numerous modifications, changes and improvements within the scope of the invention will now occur to the reader. In embodiments of the invention, fewer, more and/or different stages than those shown in Fig. 2, Fig. 4 and Fig. 5 may be executed. In embodiments of the invention one or more stages illustrated in Fig. 2, Fig. 4 and Fig. 5 may be executed in a different order and/or one or more groups of stages may be executed simultaneously. Fig. 1 illustrates a general system architecture 110 in accordance with an embodiment of the invention. Each module in Fig. 1 can be made up of any combination of software, hardware and/or firmware that performs the functions as defined and explained herein. The modules in Figure 1 may be centralized in one location or dispersed over more than one location. In other embodiments of the invention, the system may comprise fewer, more, and/or different modules than those shown in Fig. 1.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Generally (although not necessarily), the nomenclature used herein and described below is well known and commonly employed in the art. Unless described otherwise, conventional methods are used, such as those provided in the art and various general references.
Some embodiments of the present invention are primarily disclosed as a method and it will be understood by a person of ordinary skill in the art that an apparatus such as a conventional data processor incorporated with a database, software and other appropriate components may be programmed or otherwise designed to facilitate the practice of the method of the invention.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions, utilizing terms such as, "providing", "applying", "generating", "processing", "matching", "taking", "selecting", "receiving", "adjusting", "analyzing", "evaluating", "reevaluating", "joining", "enhancing", "performing", "executing" or the like, refer to the action and/or processes of any combination of software, hardware and/or firmware.
Some embodiments of the present invention may use terms such as service, module, tool, technique, system, processor, device, computer, apparatus, element, sub-system, server, engine, etc. (in singular or plural form) for performing the operations herein. These terms, as appropriate, refer to any combination of software, hardware and/or firmware configured to perform the operations as defined and explained herein. The module(s) (or counterpart terms specified above) may be specially constructed for the desired purposes, or may comprise a general purpose computer selectively activated or reconfigured by a program stored in the computer. Such a program may be stored in a readable storage medium, such as, but not limited to, any type of disk including optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs), random access memories (RAMs), electrically programmable read-only memories (EPROMs), electrically erasable and programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions and capable of being conveyed, for example via a computer system bus.
Bearing the above in mind certain embodiments of the invention will now be described. According to certain embodiments of the present invention, there is provided a system and method for using identifiers in a standardized format for the representation and comparison of digital images. According to one embodiment, digital images are processed, wherein the processing comprises performing a plurality of transformations and where each transformation results in a derivative image, each derivative image being an altered version of the original digital image. A transformation may consist of one image processing function but more often consists of a sequence of two or more image processing functions. An image processing function may comprise various visual manipulations done on the original digital image, in which the color pixel values of all or part of the pixels comprising the image is changed, in effect changing the visual properties of the image. Depending on the image processing functions comprising the transformation, different visual properties of the original image may be emphasized while others are blurred or rendered indistinct. The term "color pixel value" or "color value of pixels" as used herein refers to the values of the pixels in respect to the type of the color representation of the image for example: grey level, RGB etc. The terms visual feature, visual aspect and visual property, as used herein, refer to the visual characteristics of a digital image such as, color, texture, outline, brightness etc. It should be noted that, in accordance with certain embodiments, the terms
"original image" or "original digital image" as used herein, refer to a digital image before being processed by the system and method of the present invention, whereas the term "derivative image" refers to an image which has been derived from the original image by the system and method of the present invention (i.e. a transformed digital image). Also it should be noted that unless stated otherwise the term "digital image" refers to an original digital image.
In one embodiment, each derivative image is further processed using statistical functions which are implemented on the color pixel values of the resulting derivative images. The resulting values of the statistical calculations are utilized in order to create one or more identifiers representing the transformations and the statistical values of each derivative image, in a standardized format.
According to certain embodiments, the system and method are capable of calculating in a very short time hundreds of identifiers (for example, in some cases more than 200 identifiers in 1 tenth of a second or less). The various identifiers provide an extremely diverse and heterogeneous representation for each digital image, giving rise to a high resolution fingerprint representing each original digital image. Depending on the requirements of a specific application, and depending on the user's needs, the number of calculated identifiers may vary from a single identifier and up to hundreds of identifiers representing an original digital image. Unlike conventional tagging of digital images, the identifiers are not explicit, semantic descriptors of the content of the digital image, (for example describing an image as being a picture of an elephant or a tree), but rather provide an alternative and extended "vocabulary" for representing visual properties of the digital image. Each identifier can be seen as a "word" representing a specific characteristic of certain digital image with respect to a specific visual property. For example, consider a transformation analyzing the red color in a digital image. The resulting values constructing the identifiers represent the red color characteristics of a transformed image (the red color being the visual property, and the calculated values in the identifiers the specific characteristics of each transformed digital image in respect of the red color property). The use of a plurality of statistical functions on the same derivative image (thereby obtaining more that one identifier per each derivative image) provides additional diversity in the representation of each original digital image.
According to certain embodiments, as explained in detail below, identifiers are represented as a string of textual characters and thus in addition to providing an extended alternative to the original pixel representation of the original digital image they also enable facilitating textual comparison means, including any device, method and software which handles textual strings for example, Boolean search and textual search engines for the purpose of image comparison and retrieval. According to certain embodiments, the method and system of the present invention allow comparison between digital images which is based neither on comparison between actual pixels of digital images, nor on textual tags and descriptors of the content of digital image, but is rather based on the comparison of identifiers, which represent visual features of digital images (similar to pixels) but are compared using textual comparison methods and search engines (similar to textual tags and description). The collections of identifiers which represent the visual features of the image are conceptually similar to a collection of words semantically describing the visual aspects of the image.
According to certain embodiments, the invention disclosed herein provides a system and method for implementing an image based search engine, which does not require any preliminary knowledge of the processed image. More specifically, according to certain embodiments, all images may be processed uniformly, regardless of the semantic description of their content. All digital images are processed and transformed into a textual representation which is independent of the semantic context of the image. Thus, the system is relieved from extensive pre-processing including, for example, computational learning, segmentation, classifying images into subgroups according to content, constructing complex graphs representing similarity between images, etc., all of which require extensive computational resources and extensive storage.
Attention is now drawn to Fig. 1, showing a schematic illustration of the system architecture 110, according to an embodiment of the present invention. Depending on the embodiment, the system 110 may be configured for processing single images or alternatively it may be configured for processing groups of images.
System 110 includes an executor 120 and optionally an associated system database 140. In some cases, database 140 stores digital images and associated consolidated standardized expressions, as will be explained in further detail below. As illustrated in Fig. 1, executor 120 comprises, in accordance with certain embodiments of the invention, a transformation module 122, an identifier generating module 124, a consolidating module 126 and a data management module 128. In accordance with certain embodiments of the invention the executor comprises a comparing and retrieval module 130 in addition to or instead of data management module 128. It should be noted that in some embodiments, the division of system 110 into the specific modules and the further breakdown of executor 120 into modules as shown in Fig. 1, may be different, and any of the modules may be separated into a plurality of modules or alternatively combined with any other module(s). In some embodiments, system 110 includes less, more and/or different modules than shown in Fig. 1. In some embodiments, executor 120 includes less or more and/or different modules than shown in Fig. 1. In other embodiments of the invention any module of system 110 (or executor 120) may provide less functionality, more functionality and/or different functionality than the functionality provided by the modules illustrated in Fig. 1. According to different embodiments, each of the modules of system 110 (or of executor 120) may be made up of different combinations of software, hardware and/or firmware capable of performing the functions described and defined herein.
System 110 is illustrated in Fig. 1 in the context of a network 100. Network 100 may be any appropriate network. As illustrated in Fig. 1, and according to certain embodiments, client(s) 112, image database(s) 116, image capturing device(s) 114, web crawler(s) 118 are coupled via network 100 to system 110. In some embodiments, image input into system 110 and/or retrieval from system 110 can be performed over a network 100, for example: the Internet, a local area network (LAN), wide area network (WAN), metropolitan area network (MAN) or a combination thereof. The connection to the network may be realized through any suitable connection or communication utility. The connection may be implemented by hardwire or wireless communication means.
In other embodiments, system 110 may be fully or partially accessed outside of a context of a network, for example with any of client(s) 112, image database(s) 116, image capturing device(s) 114, web crawler(s) 118 directly coupled to system 110. In these embodiments, alternatively or additionally, image input into or retrieval from system 110 may be performed via a user interface or via a direct connection to system 110, for example via a universal serial bus (USB) connection. For example, a robot having a built-in video camera may capture images and send them to the system for processing, comparison and retrieval. According to one embodiment, the system 110 may be embedded within the robot's mechanical and electronic architecture or alternatively the robot may communicate with a system 110 located in a remote location (e.g. another computer) via network communication means such as a wireless Internet connection. In some embodiments, the system 110 may comprise a user interface allowing the direct interaction of a user with the system 110.
Digital images may be provided to system 110 via a variety of image input devices. Additionally or alternatively, digital images may be retrieved from system 110 to a variety of destinations using a variety of retrieval techniques. For example, as illustrated in Fig. 1 images may be provided to system 110 and/or retrieved from system 110 by one or more users which may interact with the system 110 through one or more clients 112. Clients 112 may be, but are not limited to, personal computers, portable computers, PDAs, cellular phones or the like. Each client 112 may include a user interface and possibly an application for sending and receiving web pages, such as a browser application which may be utilized, inter alia, for inserting digital images to system 110. As further exemplified in Fig. 1, images may be provided to system 110 from other image databases 116 and/or directly from an image capturing device 114 (e.g. camera or cell phone). According to certain embodiments of the invention the system may operate a web crawler 118 for methodically browsing the World Wide Web and retrieving digital images for processing and storing in the system database 140. According to certain embodiments, web crawlers may be facilitated for locating digital images and processing digital images at their remote location (e.g. a remote computer) and associating the digital images with consolidated expressions without retrieving the digital image.
Although system 110 is illustrated in Fig. 1 as if comprised in a single unit, this is not necessarily always the case and, depending on the embodiment, modules in system 110 may be comprised in the same unit, may be connected over a network and/or may be connected through a direct connection. For example, according to certain embodiments, system database 140 may be connected via a network to executor 120 and according to other embodiments it may be connected directly or may be comprised in the same unit. More details on specific modules of executor 120 in various embodiments will now be provided. According to certain embodiments, the transformation module 122 facilitates performing a plurality of transformations for transforming the original digital image into a plurality of derivative images. Each derivative image is therefore the resulting image obtained by implementing a transformation which includes at least one image processing function. Each transformation may include a single image processing function or alternatively a number of image processing functions performed in sequence. According to certain embodiments, the transformation is performed on the entire original digital image, while according to other embodiments the transformation may be performed on a specific area or on a specific object within the original digital image. The types, the number and the order of implementation of the image processing functions determine the type of the transformation. The resulting visual features of the derivative images depend on the transformation which was performed, together with the visual properties of the original digital image. According to certain embodiments, the sequence of image processing functions comprising a transformation may include different types of image processing functions. Additionally or alternatively, the same type of image processing function may be repeated more than once in a single transformation. Each transformation involves the alteration of the color pixel values of at least part of the pixels of the original digital image.
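As an illustration of the transformation mechanism described above, the following sketch models a transformation as a sequence of simple pixel-level image processing functions applied to a numpy array. The particular functions (grey-level conversion, red-channel isolation, thresholding) and the transformation names are hypothetical examples, not a set prescribed by the invention.

    import numpy as np

    # Each image processing function takes and returns an H x W x 3 uint8 array.
    def to_grey(img):
        return img.mean(axis=2, keepdims=True).repeat(3, axis=2).astype(np.uint8)

    def keep_red(img):
        out = np.zeros_like(img)
        out[..., 0] = img[..., 0]  # keep only the red channel
        return out

    def threshold(img, level=128):
        return np.where(img >= level, 255, 0).astype(np.uint8)

    def apply_transformation(image, functions):
        """Apply a sequence of image processing functions, yielding a derivative image."""
        derivative = image
        for fn in functions:
            derivative = fn(derivative)
        return derivative

    # A pool of transformations: each entry yields one derivative image.
    TRANSFORMATIONS = {
        "T01_grey": [to_grey],
        "T02_red_mask": [keep_red, threshold],
    }

    image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
    derivatives = {name: apply_transformation(image, fns)
                   for name, fns in TRANSFORMATIONS.items()}

Sequencing the functions, rather than applying a single one, is what lets a transformation emphasize one visual property (here, bright red regions) while rendering others indistinct.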
According to certain embodiments, as will be explained in more detail below, the system and method of the present invention allows the construction, for each original digital image, of a potentially enormous number of different searching queries. In accordance with some embodiments, different queries may be directed to searching for specific visual features within a digital image. For example, a derivative image highlighting the white color in an image may be used for constructing a query which is directed for finding other images which have similar white color layout. Thus, according to this embodiment there is a connection between the performed transformation and the search queries conducted with the resulting identifiers. Accordingly, the transformations may be constructed and implemented according to specific visual features which are of interest to the user and a collection of transformations may be assembled in order to direct the system to compare between images according to a predefined set of visual features within a digital image. It should be noted, however, that according to other embodiments, transformations may be constructed, while disregarding the visual features which may be enhanced or reduced in the original digital image.
According to certain embodiments, system 110, optionally via transformation module 122, may allow different levels of user interaction from a completely manual to completely automatic construction of sequences of image processing functions (i.e. transformations). According to one embodiment system 110 may provide a predetermined collection of transformations all of which are performed on every digital image. According to another embodiment, system 110 of the present invention may provide a working environment, possibly including a user interface, which allows a user, via a client 112 or directly, to manually select one or more image processing functions from a plurality of available functions, according to a specific application or to the user's specific needs and construct custom made transformations. According to this embodiment system 110 may present to the user or describe the possible effect of an image processing function or combination of functions, on the original image, and in some embodiments the user may select from among combinations of image processing functions, those combinations which are adequate for the user's needs. According to other embodiments, system 110 may provide information connecting between specific transformations to specific applications for which these transformations should be used. Additionally or alternatively, system 110 may provide a user with a list of transformations which are used most often for a specific application or in general.
According to certain embodiments, the identifier generating module 124 is responsible for acquiring the color values of the pixels of each derivative image and performing statistical calculations on these values in order to obtain corresponding statistical measurements. As each specific derivative image results from a specific transformation, the specific visual property (embodied by the color pixel values) of each derivative image reflects the specific number, sequence, and type of image processing function(s) which were implemented in order to obtain the derivative image. There follows that the differences in the statistical measurements which are calculated for different derivative images are a mathematical representation of the differences between derivative images resulting from different transformations.
According to certain embodiments, the identifier generating module 124 calculates and assigns one or more identifiers (otherwise known as "codons"), which are strings of characters denotation for each derivative image. According to certain embodiments, the calculated values are converted into identifiers which are a standardized format representation of the calculated values. For example, an identifier of a specific derivative image identifies the type of transformation, a statistical function (e.g. average of the color pixel values of the derivative image) and the corresponding statistical value calculated with that statistical function for the specific derivative image. According to certain embodiments, the calculated values, corresponding to a certain digital image are normalized according to a predefined scale. According to certain embodiments, identifiers may include the values of only one statistical function, wherein a plurality of statistical functions, corresponding to the same derivative image, would be represented by different identifiers (i.e. one derivative image may be represented by a number of identifiers). According to other embodiments identifiers may include a plurality of values pertaining to a plurality of statistical functions implemented corresponding to a specific derivative image. In other embodiments, the identifier may provide less, more and/or different information. For instance in one example, the identifier may not identify the type of transformation or the type of statistical functions, whereas in another example the identifier may additionally or alternatively identify additional information such as an identification of the original digital image. An identifier can comprise for example, alphanumeric values, binary values or any other type of values known in the art. However, in order to facilitate the comparison between identifiers originating from different original digital images, as explained in detail below, the format which is used for representing derivative images should be predefined and constant (i.e. standardized). According to one embodiment, the system may convert identifiers in different formats into a uniform format to enable comparison between digital images represented by identifiers in different formats.
According to one embodiment identifiers are represented as text wherein, in some embodiments, the representation of identifiers is in alphanumeric textual characters. Depending on the embodiment, the identifier generating module 124 may generate identifiers comprising textual annotation for representing the type of the transformation, textual annotation indicating the type of the performed statistical function, where each indication of a statistical function is followed by the resulting value. According to another embodiment, an identifier may comprise only the actual statistical value(s) where the corresponding statistical function(s), which were implemented to obtain the statistical values, are determined based on the location of the value within the identifier (e.g. average is the first value in the identifier) or where a plurality of identifiers are joined together (e.g. a consolidated standardized expression) based on the location of the value within the joined expression.
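The following sketch shows one possible way of converting statistical values into textual identifiers of the kind described above. The "transformation-statistic-value" layout and the 0-99 normalization scale are assumptions made for illustration; the invention only requires that the format be predefined and constant.

    import numpy as np

    # Statistical functions applied to the color pixel values of a derivative image.
    STATISTICS = {
        "AVG": lambda px: px.mean(),
        "STD": lambda px: px.std(),
        "MED": lambda px: np.median(px),
    }

    def make_identifiers(transformation_name, derivative_image):
        """Return textual identifiers for one derivative image (pixel values 0-255)."""
        pixels = derivative_image.astype(np.float64)
        identifiers = []
        for stat_name, stat_fn in STATISTICS.items():
            normalized = int(round(stat_fn(pixels) / 255.0 * 99))  # predefined 0-99 scale
            identifiers.append(f"{transformation_name}-{stat_name}-{normalized:02d}")
        return identifiers

    # e.g. make_identifiers("T01", derivative) might return
    # ['T01-AVG-37', 'T01-STD-21', 'T01-MED-35']  (values depend on the image)

Because each identifier names both the transformation and the statistical function, identifiers computed for different original images remain directly comparable as text.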
According to certain embodiments, the consolidating module 126 joins all or part of the identifiers, pertaining to a collection of derivative images derived from the same original digital image, into a "consolidated standardized expression" (or a "consolidated expression"). The term "join" should not be construed as limiting the possibilities of how the identifiers are arranged in the consolidated expression. For example, identifiers may be arranged adjacent to one another, separated by empty spaces or one or more predetermined characters, interleaved with one another, etc. Furthermore, the term "join" also includes any logical or physical arrangement of identifiers corresponding to a digital image, which allows accessing and processing the identifiers. For example, identifiers, corresponding to the same digital image, may be stored on different computers or different databases and be logically associated to a single consolidated expression. A consolidated standardized expression is an expression which represents one or more derivative images derived from a single original image. Thus each consolidated standardized expression is in fact a standardized representation of a single processed, original digital image, in accordance with an embodiment of the invention. Depending on the embodiment, a consolidated standardized expression can in some cases represent only a portion (e.g. a specific area) of an original digital image. The number of identifiers constituting a consolidated expression may vary from one application to another. A consolidated expression may be constructed from any number of identifiers pertaining to a certain original image or from any subgroup of a calculated collection of identifiers pertaining to a certain original image. The consolidated expressions originating from different original digital images are utilized for comparing between digital images. In one embodiment, the identifiers in a consolidated expression associated with one original image are compared to the identifiers comprised in consolidated expressions associated with other original digital images, and a similarity degree between the images is determined according to the similarity between the identifiers in accordance with predefined criteria. According to certain embodiments, the term "similarity degree" may be used to represent the similarity between identifies and images while in other embodiments the term "similarity degree" may be used to represent the dissimilarity between identifies and images. According to certain embodiments, a consolidated expression, comprising at least one identifier, resulting from a specific transformation, is compared with consolidated expressions comprising equivalent identifiers resulting from the transformation of one or more other digital images. Equivalent identifiers are identifiers resulting from the same transformations performed on different original digital images comprising values which were calculated by the same one or more statistical function. According to certain embodiments, the comparison is performed by comparing between equivalent identifiers and calculating the overall similarity between the consolidated expressions. According to certain embodiments the comparison is performed by a comparison and retrieval module 130.
A consolidated expression in a text format is in fact a representation of visual features of an original digital image in a textual format and thus, in accordance with certain embodiments, the comparison and retrieval module 130 may utilize textual search methods and search engines for comparing between visual features of different original digital images based on the similarity between identifiers comprising consolidated expression corresponding to the different original digital images. According to certain embodiments, wherein identifiers are represented in a textual format, the consolidated expression pertaining to each original digital image is stored as a text-file. However, it should be noted that the present invention is not bound by textual format or alphanumeric values, and identifiers and consolidated expressions may be represented by other types of values (e.g. binary) and may be stored in other types of files.
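A minimal sketch of the consolidation step follows, assuming the textual identifier format from the previous sketch. Joining with spaces and storing one expression per line are illustrative choices; as noted above, any arrangement that keeps the identifiers accessible would do.

    def consolidate(identifiers):
        """Join the identifiers of one original image into a consolidated expression."""
        return " ".join(sorted(identifiers))

    def save_expression(image_id, identifiers, path):
        """Append the consolidated expression of an image to a plain text file."""
        with open(path, "a", encoding="utf-8") as f:
            f.write(f"{image_id}\t{consolidate(identifiers)}\n")

    # A stored line might look like:
    # img_0042    T01-AVG-37 T01-MED-35 T01-STD-21 T02-AVG-12 ...

Stored in this way, the expressions can be indexed and queried by ordinary textual search tools without any image-specific infrastructure.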
According to certain embodiments, comparison between consolidated expressions is done with a predefined level of tolerance. For example, the similarity between equivalent identifiers may be determined according to the following scale: 0-10 percent similarity is scored with 1, 11-20 percent similarity is scored with 2, 21- 30 percent similarity is scored with 3 and so on. Thus, the digital images are not required to have identical values in order to be determined as similar. Methods for text comparison and comparison of near duplicate documents, which are well known in the art, may be facilitated for determining the similarity between different consolidated expressions. In addition, using a large number of transformations and corresponding identifiers, and alternatively or additionally using a diverse collection of transformations where each transformation underlines different visual properties of the digital images, and corresponding identifiers calculated by utilizing a plurality of statistical functions, provides a balancing effect to the tolerance and ensures that the system determines only truly similar images as similar. There may be numerous possible transformations giving rise to numerous possible derivative images for each original digital image. The statistical significance of the comparison between original digital images is related, inter alia, to the number of derivative images (i.e. the number of transformations) that is performed on each digital image and on the diversity between the transformations (i.e. the diversity of the image processing functions comprising the transformations). Therefore, in general, a large number of substantially different transformations provides a higher resolution representation of the images and thus renders the comparison between the images more accurate and stable (The term stable in this context refers to the fact that repeating the comparison process with the same or similar images would provide the same or similar results). Thus, in accordance with certain embodiments, the type, number and order of the image processing functions which comprise each transformation, as well as the number of implemented transformations and the number of statistical functions, are selected so as to provide a wide diversity of resulting derivative images and to ensure that each transformation provides additional information to the information embodied within derivative images produced by other transformations. Alternatively or additionally, the greater the number of transformations (and thus the greater the number of derivative images and the greater the number of identifiers) the smaller the relative weight each identifier has on the overall comparison and thus the smaller the probability for a mistake in the image comparison process due to, for example, detected similarity between identifiers, of different digital images, wherein the similarity is restricted to only limited visual properties represented by these specific identifiers. In some embodiments, the number of performed transformations may be limited by the computational power of the system, thus in accordance with certain embodiments, the number of transformations is determined while taking into consideration the computational resources and the requirements of the user (e.g. the required accuracy). It should be noted that although, in some embodiments, using a large number of identifiers may be beneficial, in other embodiments, a query with a small number (e.g. 
around 20) of identifiers may suffice for comparison between images, for example, a comparison which is based on the general aspects of the original digital image, which can be represented by a small collection of identifiers. According to another embodiment, even a smaller number of identifiers (e.g. around 2 to 5 identifiers) is adequate for comparison between digital images, as a small number of identifiers can reflect, detect and recognize specific visual aspects which are common to a group of images. For example, in order to obtain relevant results while using a small number of identifiers, the AND operator could be used in a Boolean query between the identifiers. According to yet another embodiment, even a single identifier can be used for comparing between digital images. A single identifier may reflect a specific visual feature in a digital image, and some comparisons between digital images may be based on a single visual feature expressed by a single identifier.
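To make the tolerance-based comparison concrete, here is a minimal sketch that maps the percent similarity between equivalent identifier values onto the coarse scale described above (0-10 percent scored 1, 11-20 percent scored 2, and so on) and averages the per-identifier scores into an overall similarity between two consolidated expressions. The 0-99 value scale and the plain average are assumptions for illustration.

    import math

    def percent_similarity(value_a, value_b, scale=99):
        """Percent similarity between two normalized identifier values (0..scale)."""
        return 100.0 * (1.0 - abs(value_a - value_b) / scale)

    def bucket_score(similarity_percent):
        """0-10% -> 1, 11-20% -> 2, ..., 91-100% -> 10."""
        return max(1, min(10, math.ceil(similarity_percent / 10)))

    def expression_similarity(ids_a, ids_b):
        """ids_a / ids_b map identifier keys (transformation + statistic) to values;
        only equivalent identifiers (same key) are compared."""
        shared = set(ids_a) & set(ids_b)
        if not shared:
            return 0.0
        scores = [bucket_score(percent_similarity(ids_a[k], ids_b[k])) for k in shared]
        return sum(scores) / len(scores)

Because the per-identifier scores are coarse, two images need not have identical values to be judged similar, while averaging over many diverse identifiers keeps the overall comparison stable.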
As mentioned above, according to certain embodiments, the consolidating module 126 constructs a consolidated expression by selecting a certain number of identifiers or specific identifiers from the entire collection of identifiers calculated by the identifier generating module 124. For example, in some cases there is a limit on the number of strings and characters that may be contained in a query (e.g. a 32 string limit in some search engines). In such cases, in order to search for images which are similar to a given original digital image (i.e. query image) system 110 may recommend to the user which identifiers to select from a collection of identifiers corresponding to a certain query image, according to a particular application or prescribed need. According to one embodiment system 110 may perform a limited number of transformations in order to construct a consolidated expression matching the query size. According to another embodiments many transformations are performed and a subset of identifiers pertaining to certain transformations (and calculated using certain statistical functions) are selected in order to construct a consolidated expression matching the query size. According to one embodiment, the recommended identifiers may comprise identifiers representing general features of the image, for example, the intensity of blue, green and red colors or the intensity of light in the image. Additionally or alternatively, recommended identifiers may be selected using a reference image as described below with reference to Fig. 5. According to another embodiment, the system 110 may recommend the identifiers according to usage of the identifiers in previous queries. According to yet another embodiment the user may manually select the identifiers from the entire collection of identifiers initially provided by the system. According to certain embodiments, where a Boolean search is facilitated for search and comparison between digital images, a Boolean search query can be constructed by using any type of Boolean operator in order to define the relationship between the identifiers in a consolidated expression.
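The query construction described above can be sketched as follows; the 32-term cap mirrors the query-length limits mentioned in the text, and the choice of Boolean operator is left to the caller. The function name and defaults are illustrative only.

    def build_boolean_query(identifiers, operator="AND", max_terms=32):
        """Turn a selected subset of textual identifiers into a Boolean query string."""
        terms = sorted(identifiers)[:max_terms]
        return f" {operator} ".join(terms)

    # build_boolean_query(["T01-AVG-37", "T02-AVG-12"])
    # -> 'T01-AVG-37 AND T02-AVG-12'

Using AND narrows the search to images sharing all of the selected visual characteristics, whereas OR broadens it; either way the resulting query can be handed to an ordinary textual search engine.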
According to certain embodiments, system 110 may allow assigning different identifiers pertaining to certain derivative images of the query image with different weights and thus allow building biased searching profiles which will be defined by the user and stems from the needs of the requested application. As explained above, different derivative images resulting from different transformations may in some cases represent or emphasize different visual features of an image. According to certain embodiments, the user may select, from the collection of derivative images, those images which depict visual features which are of interest to the user and give them higher weight. On the other hand other images showing visual features of less interest to the user may be either omitted or may be given lower weights. The search can be thus customized to the interests of the user.
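The biased searching profiles described above can be sketched as a weighted aggregation, in which each identifier key carries a user-chosen weight and a weight of zero effectively omits the identifier. The scoring formula below is an illustrative assumption, not the invention's prescribed measure.

    def weighted_similarity(ids_a, ids_b, weights, scale=99):
        """ids_* map identifier keys to 0..scale values; weights map keys to floats."""
        shared = set(ids_a) & set(ids_b) & set(weights)
        total_weight = sum(weights[k] for k in shared)
        if total_weight == 0:
            return 0.0
        score = sum(weights[k] * (1.0 - abs(ids_a[k] - ids_b[k]) / scale)
                    for k in shared)
        return score / total_weight

Raising the weights of identifiers tied to the features of interest (for example, those derived from an edge-detection transformation) biases the ranking toward those features without discarding the rest of the representation.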
According to certain embodiments, system 110 may provide a means to refine query results. For example, the system may assign weights connecting images which reflect their similarity, clustering together images with high similarity. By continuously fine-tuning the clustering, the system may improve the speed and efficiency of the search and retrieval. The fine-tuning may be done both manually and automatically. In some embodiments, the system may implement a suggestion module (not shown) which provides users with recommended queries based on the results and feedback of previous users.
According to certain embodiments, the resulting consolidated expression may be stored as a file where each file is associated with the original digital image. In one of these embodiments both the original digital image and its associated file are stored in a database 140 by the database management module 128. Alternatively or additionally, the resulting consolidated expression is used to search database 140 and retrieve images by the image retrieval module 130.
According to certain embodiments, the system 110 of the present invention can be implemented as an image based search engine. Accordingly, the system may be used for search and retrieval of images which resemble a given query image. According to certain embodiments, a user may input into the system 110, via a client's 112 user interface or via another user interface integrated within or otherwise associated with the system 110, an original digital image to which the user would like to find similar images (i.e. a query image). Alternatively or additionally, a user may select a digital image from a database associated with the system 110. If the query image is not associated with a consolidated expression (i.e. has not been previously processed by the system), the query image must first be processed by the system in order to produce its own identifiers and consolidated expression. According to certain embodiments, once identifiers are obtained for the query image and a consolidated expression is constructed, a comparing and retrieval module 130 searches for other similar digital images, based on the similarity between equivalent identifiers comprising the consolidated expression.
The following examples and drawings are described in accordance with the embodiment wherein identifiers are represented in textual format. However, this is merely a non-limiting example and other embodiments of the invention may be implemented as well.
Attention is now drawn to Fig. 2 which illustrates the operations carried out for creating consolidated expressions for digital images, in accordance with an embodiment of the invention. In one embodiment, method 200 is performed by executor 120. In other embodiments, there may be more, fewer and/or different steps than illustrated in Fig. 2, and/or steps illustrated as being sequential may be performed in parallel. The process described in accordance with Fig. 2 may be performed with a single digital image and/or a plurality of images. In the first step 210 one or more digital images are provided to the system. Provision of digital images can be made from one or more sources, for example: image databases 116, web crawlers 118, clients 112, image capturing devices 114, video players, video cameras and computers, directly or indirectly connected to the system, as previously specified with reference to Fig. 1 supra. In the next step 220 at least one transformation is performed on each digital image which is introduced into system 110, resulting in at least one derivative image pertaining to each original digital image. However, more often a plurality of transformations is performed, resulting in a plurality of derivative images corresponding to each original digital image. According to certain embodiments this step is facilitated by an image transformation module 122.
Transformation may include different image processing functions, including but not limited to one or more of the following visual manipulations:
1. Decreasing the number of colors, thresholding, separation into various channel systems (e.g. RGB), inversing colors, equalization, changes of brightness, contrast, gamma values, luminance, etc.
2. Digital signal processing such as FFT; morphological filters such as erode, dilate, outline, etc.; tracing of the various transformations.
3. Merging, and logical and mathematical operations on the images.
4. Filtering of any kind, such as smoothing, blurring, edge detection, etc. Such filters include filters which are defined by a matrix of values applied to each pixel and the pixels around it (a minimal kernel sketch is given below).
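The following is a minimal sketch of a matrix-defined filter of the kind mentioned in item 4 above; the use of a grayscale NumPy array and a box-blur kernel are illustrative assumptions, not the specific filters employed by the described system.

```python
# A minimal sketch of a matrix-defined filter (a 3x3 convolution kernel
# applied to each pixel and its neighbours), assuming a grayscale image held
# in a NumPy array; an illustrative assumption, not the system's own filters.
import numpy as np
from scipy.ndimage import convolve

def apply_kernel(image, kernel):
    """Convolve a 2-D grayscale image with a small kernel."""
    return convolve(image.astype(float), kernel, mode="nearest")

# Hypothetical usage: a simple smoothing (box blur) kernel.
image = np.random.randint(0, 256, size=(64, 64))
box_blur = np.full((3, 3), 1.0 / 9.0)
smoothed = apply_kernel(image, box_blur)
```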
Fig. 3 illustrates an example of the processing of a digital image, in accordance with an embodiment of the invention. As shown in Fig. 3 the process begins with a given original digital image 310. The original digital image is subjected to a plurality of transformations, each transformation resulting in a different derivative image 320. As exemplified in Fig. 3, the transformations often, although not always, involve degradation of the original image, rendering the derivative image less coherent, clear or understandable to the human eye compared to the original digital image.
Reverting to Fig. 2, in the next step 230 color values of a plurality of pixels from each derivative image are processed. According to certain embodiments, step 230 in Fig. 2 is performed by an identifier generating module 124. According to certain embodiments, the processing in step 230 includes applying different statistical functions for calculating statistical values. Although the statistical functions are not limited to any specific number, type or combination of statistical functions, in the following description the average color value, the standard deviation of the color value, and the median color value of the pixels of each derivative image are used as an example (a sketch of such a calculation is given below). According to certain embodiments, these three statistical functions are sufficient for obtaining a high resolution comparison between digital images. According to certain embodiments of the invention, the entire collection of pixels of each derivative image is used for calculating a set of statistical values. According to other embodiments, subgroups of pixels from each derivative image are used for calculating statistical values. According to further embodiments, for some of the derivative images the entire collection of pixels is used for calculating statistical values, while for others only subgroups of pixels are used. As was previously mentioned, each derivative image is assigned an identifier, which is a standardized format for representing the calculated statistical values of a specific derivative image.
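The following is a minimal sketch of this calculation, assuming a derivative image is available as a NumPy array of color values; the average, standard deviation and median are used because the description names them as an example set of statistical functions.

```python
# A minimal sketch of step 230: compute example statistical values over the
# pixel values of a derivative image held in a NumPy array.
import numpy as np

def pixel_statistics(derivative_image):
    """Return the average, standard deviation and median of the pixel values."""
    values = np.asarray(derivative_image, dtype=float).ravel()
    return {
        "average": float(np.mean(values)),
        "std_dev": float(np.std(values)),
        "median": float(np.median(values)),
    }

# Hypothetical usage on a random 8-bit image.
stats = pixel_statistics(np.random.randint(0, 256, size=(32, 32)))
```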
For the sake of example, assume the following transformations listed in Table 1 along with the corresponding sequences of image processing functions comprised in each transformation.
[Table 1: transformations (e.g. Vyr1) and the corresponding sequences of image processing functions; presented as an image in the original publication and not reproduced here.]
The following is a non-limiting example of an identifier representing a derivative image corresponding to transformation Vyr1 from Table 1. According to certain embodiments, vg identifies the average color pixel value, dv identifies the standard deviation of color pixel values and md identifies the median color pixel value, where all color pixel values correspond to the color pixel values of a specific derivative image. For example, assuming the following three statistical values were calculated for a certain derivative image, which resulted from performing transformation Vyr1 on a given digital image: average = 18, standard deviation = 5 and median = 16, then, in accordance with certain embodiments, the resulting identifiers representing the transformation which led to that derivative image and the statistical values of the derivative image would be: a first identifier showing the average value: vyr1vg18, a second identifier showing the standard deviation value: vyr1dv5, and a third identifier showing the median value: vyr1md16 (a sketch of this identifier construction is given below). Accordingly, step 230 will result in a collection of identifiers, wherein each derivative image is represented by three identifiers, each identifier representing the transformation listed in Table 1 which resulted in the derivative image and one of the statistical values calculated from the color pixel values of the derivative image (in this example the average value, the standard deviation value, and the median value). In other embodiments, other forms of identifiers may be used; for example, all three statistical values may be represented in a single identifier.
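The following is a minimal sketch of the identifier format exemplified above: a transformation tag (e.g. "vyr1"), a statistic tag ("vg", "dv", "md") and the value concatenated into one textual token. Rounding the value to an integer is an assumption made for illustration.

```python
# A minimal sketch of building textual identifiers such as 'vyr1vg18' from
# calculated statistics; rounding to an integer is an illustrative assumption.

def make_identifiers(transformation_tag, stats):
    """Build textual identifiers from a transformation tag and its statistics."""
    tags = {"average": "vg", "std_dev": "dv", "median": "md"}
    return [f"{transformation_tag}{tags[name]}{round(value)}"
            for name, value in stats.items()]

identifiers = make_identifiers("vyr1", {"average": 18, "std_dev": 5, "median": 16})
print(identifiers)  # ['vyr1vg18', 'vyr1dv5', 'vyr1md16']
```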
As explained above, each derivative image is obtained by a transformation comprised of a predetermined number, type and order of image processing functions. Transformations may comprise one or more image processing functions and the order of implementation of the image processing functions may vary from one transformation to another. According to certain embodiments, once the transformations are defined, part or all of the images which are processed by the system undergo the same transformations, comprising the same types and sequences of image processing functions. In order to compare between original digital images, identical transformations are typically performed on all the original digital images and identical statistical functions must be implemented on all the resulting derivative images. Some original digital images may undergo more or fewer transformations than other digital images; however, in order to compare between digital images, at least a subset of the transformations (and statistical functions) performed on the compared images is typically identical.
In the embodiment illustrated in Fig. 2, in step 240, once each derivative image has been assigned one or more identifiers, the identifiers associated with the same original digital image are arranged into a standardized consolidated expression. According to certain embodiments, step 240 is performed by a consolidating module 126. As each identifier is an independent entity representing an independent process, the order in which the different transformations are performed during step 230 is usually not important. According to certain embodiments, each identifier is explicitly recognized and associated with its corresponding transformation and statistical function in order to allow the selection and comparison of specific identifiers. According to other embodiments, the corresponding transformation and statistical functions are not explicitly noted and may be inferred, if required, by other means, for example, according to their position in the consolidated expression. Box 330 in the example shown in Fig. 3 illustrates one embodiment of a consolidated expression where the identifiers have been arranged into a standardized consolidated expression in textual format. The particular arrangement shown in box 330 should not be construed as limiting. According to one embodiment, in step 250, the consolidated expression is stored, for example by a data management module 128, in system database 140, for future reference. Alternatively or additionally, as shown in step 260, the consolidated expression can be compared with consolidated expressions corresponding to other digital images, thereby facilitating a comparison between different digital images. The consolidated expression may be used to query a database of consolidated expressions associated with digital images, compare between the consolidated expressions and retrieve similar images. For example, an image retrieval module 130 may facilitate an image based search engine operated by comparing consolidated expressions corresponding to different images (a sketch of constructing and comparing consolidated expressions is given below). In accordance with another example, two images can be input and processed by system 110 and the resulting consolidated expressions of both digital images can be compared in order to determine a similarity degree between the images.
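The following is a minimal sketch of constructing and comparing consolidated expressions; representing the expression as a space-separated join of identifiers and measuring similarity as the fraction of shared identifiers are illustrative assumptions rather than the patented formulas.

```python
# A minimal sketch: a consolidated expression as the textual join of an
# image's identifiers, compared by the fraction of shared identifiers.

def consolidate(identifiers):
    """Join identifiers into a single textual consolidated expression."""
    return " ".join(sorted(identifiers))

def similarity(expression_a, expression_b):
    """Fraction of identifiers shared by two consolidated expressions."""
    a, b = set(expression_a.split()), set(expression_b.split())
    return len(a & b) / max(len(a | b), 1)

expr1 = consolidate(["vyr1vg18", "vyr1dv5", "vyr1md16"])
expr2 = consolidate(["vyr1vg18", "vyr1dv6", "vyr1md16"])
print(similarity(expr1, expr2))  # 0.5
```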
According to one embodiment, each original digital image is associated with a text-file containing a collection of identifiers (which constitutes the standardized consolidated expression). According to certain embodiments, digital images located at remote locations (e.g. a remote computer) may be processed and associated with consolidated expressions. According to certain embodiments, the consolidated expression may be stored at the location of the digital image; alternatively or additionally the consolidated expression may be stored at a different location than the digital image and only be associated (e.g. linked) with it logically.
It should be noted that after the identifiers of a certain original digital image are created both the identifiers and the corresponding consolidated expression (e.g. in a text file format) are independent entities and can be utilized, processed and manipulated without being associated with the original digital image. For example a user may hold only the consolidated expression that was produced from a certain digital image, without having the actual original digital image. That user can utilize the consolidated expression for searching and retrieval of images which are similar to the original digital image without having the original digital image.
Fig. 4 is a flowchart showing the operations of the system carried out in association with the comparison between digital images, in accordance with an embodiment of the invention. In the first step 410 one or more query images are provided. Images may be obtained from a variety of sources as was previously described with reference to Fig. 1 and Fig. 2.
Different derivative images and their corresponding identifiers represent different visual features of the different digital images. Accordingly, in some embodiments, it may be desirable to select and isolate specific identifiers according to the visual features which they represent, thereby directing the search and comparison to specifically focus on these visual features. According to certain embodiments, in step 412 the user may decide whether such specific processing is needed. An affirmative answer would allow the specific processing to take place in step 414. Otherwise step 420 is followed. A more detailed description of the operation performed in step 414 is specified below with reference to Fig. 5. In some embodiments step 412 is omitted and step 414 is performed automatically without intervention of a user.
According to certain embodiments, in the following step 420, in order to decide whether the query image should be processed for generating a consolidated expression representing the query image, the system checks, in the system's database, whether the query image is already associated with a consolidated expression, for example where the query image is selected from within the system's database 140 and thus the digital image is already associated with a consolidated expression. If the answer is "yes", the system retrieves the consolidated expression associated with the query image and turns to searching for similar digital images 460, based on the comparison of the consolidated expression of the query image and the consolidated expressions of other images stored in the system database or elsewhere. Otherwise, if the answer is "no", the query image must be processed in order to obtain the consolidated expression of the query image, using steps 220-240 described above with reference to Fig. 2. After the consolidated expression (e.g. in a textual representation) of the query image is obtained, a search for similar consolidated expressions associated with digital images is performed 460. As explained above, in general, any digital image that is processed by system 110 is subjected to a large number and a large variety of transformations in order to create a diverse representation for the original digital images. However, during the comparison of a digital image, depending on the specific application and preferences of the user, it may occur that only a subset of the available identifiers is used for a specific comparison. For example, while the number of identifiers initially created (in steps 220-240) may be around 200, in some cases 20 identifiers or less may be selected for the comparison stage in step 460. Alternatively or additionally, in some embodiments, the initial number of transformations and corresponding identifiers calculated by system 110 may be limited to a smaller number of transformations and/or a smaller number of identifiers. According to certain embodiments, where different digital images are represented by a different number of transformations or identifiers, the comparison between query images is performed by comparing between the subsets of equivalent identifiers (a sketch of such a subset comparison is given below).
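The following is a minimal sketch of comparing two images through a selected subset of equivalent identifiers only; identifying a transformation by a textual prefix and counting exact matches are illustrative assumptions.

```python
# A minimal sketch: restrict the comparison to identifiers belonging to a
# selected subset of transformations, identified here by textual prefixes.

def compare_on_subset(query_ids, candidate_ids, selected_prefixes):
    """Count matching identifiers, restricted to the selected transformations."""
    def select(ids):
        return {i for i in ids if i.startswith(tuple(selected_prefixes))}
    return len(select(query_ids) & select(candidate_ids))

matches = compare_on_subset(
    ["vyr1vg18", "vyr2dv7", "vyr3md40"],
    ["vyr1vg18", "vyr2dv9", "vyr3md40"],
    selected_prefixes=["vyr1", "vyr3"])
print(matches)  # 2
```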
Search results, consisting of the retrieved digital images, are presented to the user. Depending on the embodiment, the retrieved images can be presented in ascending or descending order of similarity to the query image. According to certain embodiments, in step 480, after viewing the results, the user may decide whether the obtained results are satisfactory or whether improvement of the results is required. If the user is satisfied with the results, the process can be terminated in step 490, or alternatively a new search can be initiated (e.g. by re-executing method 400 with a different query image). According to certain embodiments, additional comparisons can be made against the same image database that was used in the first search, or against any other collection of images selected by the user.
If the user is not satisfied with the results of the search and wishes to improve the results, he may choose to return to step 414 and perform specific processing in order to more accurately focus on visual features of interest within the original digital image. According to certain embodiments, specific processing may be performed on a query image in order to emphasize specific visual features within the image or in order to select a specific object or area within the image. Fig. 5 is a flowchart showing the operations carried out in association with directing the comparison process to specific visual features of a digital image, in accordance with an embodiment of the invention. Fig. 5 is an example of an exploded illustration of step 414 of Fig. 4. As mentioned above, image processing and comparison, in accordance with certain embodiments of the present invention, is not based on the semantic description of the content of digital images but rather on the visual properties of the digital images. In some cases, a user may wish to emphasize certain visual features of a given digital image and direct the search to these features while obfuscating or ignoring other features in the image. In addition, the images retrieved by the system and method of the present invention may include unwanted images due to visual similarity between the query images and the images stored in the system database. These images may be conceptually different and may not reflect the results desired by the user. For example, consider two images, the first showing a white boat in the middle of the sea and the other showing a white airplane in the middle of the sky. As both images are comprised of a white object on a blue background, they are likely to have similar visual properties and therefore the processing of both images may produce identifiers having similar values. Although the two images are indeed visually similar, some users may prefer to focus the search only on images depicting boats and not airplanes. The system and method of the present invention provide a solution, illustrated with reference to Fig. 5, for improving and optimizing the search results by emphasizing certain visual features of interest within the original digital image and isolating identifiers which represent these visual features, thereby allowing the comparison and search to focus on specific similarity to the selected features. According to certain embodiments, in order to achieve this goal a reference image is created, the reference image being a digital image that represents the visual features desired by the user. Identifiers are created for the reference image and compared with equivalent identifiers of the query image, and identifiers having similar values in the query image and the reference image are selected. According to one embodiment, in step 510, the user is allowed to select specific image processing functions which may reflect or emphasize the desired visual features of the original image, for example, edge detection algorithms which emphasize the outline of the objects shown in the image, or other image processing functions which enhance specific colors in the image. According to certain embodiments, the system and method of the present invention allow users to utilize such image processing functions in order to create one or more reference images (i.e. a first type of reference image) tailored to a specific need or a specific application.
According to this embodiment, a reference image is a query image that has been processed and altered using one or more image processing functions specifically selected for achieving a specific visual effect. This facilitates the emphasis of specific visual features within the original digital image and allows obtaining specific identifiers which represent these specific visual features. These identifiers can be later used for directing the search and comparison to these specific visual features, as explained further below.
According to another embodiment, in order to search for specific objects (e.g. images showing white boats) or specific visual features (e.g. images having a blue background) a user may use an auxiliary image (i.e. a second type of reference image). According to certain embodiments, an auxiliary image can be any digital image showing a desired object (e.g. a white boat) or desired visual feature (e.g. blue background). According to one embodiment, in step 510 the auxiliary image is selected by the user.
According to yet another embodiment, a reference image may be created by marking the boundaries of the desired object (e.g. a white boat) or area within the query image and defining a new image within these boundaries (i.e. a third type of reference image). According to this embodiment, step 510 is used for selecting a specific area or a specific object within the original digital image. It should be noted that the three types of reference images specified above are merely examples; the present invention is not bound by these examples, and other ways of producing different kinds of reference images may be used as well.
According to certain embodiments, in the next steps 530 and 540, a collection of predefined transformations is performed on both the query image and the reference image which was obtained during the previous step 510. In steps 550 and 560 the same statistical calculations are performed on the color pixel values of both images and equivalent identifiers are generated from each transformation, as described above with reference to Fig. 2. According to other embodiments, the identifiers of the query image and/or the reference image may already be available (e.g. stored in the system database 140); in this case steps 530-560 are not performed, and the relevant information is instead retrieved. According to certain embodiments, in the next step 570 the identifiers of the query image and the reference image are compared and a new collection of identifiers is created. The identifiers in the new collection are selected according to a predefined similarity degree between the equivalent identifiers of the two images (a sketch of such a selection is given below). The new collection of identifiers represents the visual features which are common to both the query image and the reference image. Thus, the desired visual features which are emphasized in the reference image and represented by the identifiers of the reference image are isolated from the identifiers of the query image.
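The following is a minimal sketch of the selection performed in step 570: keep only those query identifiers whose values are close to the equivalent reference identifiers. Splitting an identifier into a textual key and a trailing numeric value, and the tolerance used, are illustrative assumptions.

```python
# A minimal sketch of selecting identifiers by a predefined similarity degree
# between equivalent identifiers of the query and reference images.
import re

def split_identifier(identifier):
    """Split e.g. 'vyr1vg18' into the key 'vyr1vg' and the value 18."""
    match = re.match(r"(.*?)(\d+)$", identifier)
    return match.group(1), int(match.group(2))

def common_feature_identifiers(query_ids, reference_ids, tolerance=2):
    """Select query identifiers whose equivalent reference identifier
    (same key) has a value within the given tolerance."""
    reference = dict(split_identifier(i) for i in reference_ids)
    selected = []
    for identifier in query_ids:
        key, value = split_identifier(identifier)
        if key in reference and abs(reference[key] - value) <= tolerance:
            selected.append(identifier)
    return selected

print(common_feature_identifiers(
    ["vyr1vg18", "vyr2dv7"], ["vyr1vg17", "vyr2dv30"]))  # ['vyr1vg18']
```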
According to certain embodiments, in order to direct the search to images which are both characterized by the visual features emphasized by the reference image and also similar to the general visual features of the query image, additional identifiers are added in step 580 to the new collection of identifiers created in step 570. According to certain embodiments, the additional identifiers are selected from the identifiers which were calculated for the query image. More specifically, the additional identifiers are a subset of identifiers which reflect the general visual aspects of an image, for example, the degree of blue in an image. Different applications are characterized by different general visual features and these features are represented by specific identifiers. Thus, the additional identifiers are selected according to the specific application. This ensures that the search is focused on images with both general similarity to the query image and also specific similarity to the visual features emphasized by the reference image.
According to certain embodiments, a consolidated expression corresponding to the reference image is constructed with the new collection of identifiers and the new consolidated expression is used for search and comparison of the digital image as described above. According to certain embodiments the process described in accordance with Fig. 5 is performed by the transformation module 122.
Although the description set forth with reference to Fig. 5 is described in connection to preprocessing of a query image, according to certain embodiments the steps described in Fig. 5 can be used as an alternative to steps 220-240 which are described in Fig. 2. For example, preprocessing as described with reference to Fig. 5 can be conducted on all original digital images before they are stored in a system database 140. This may be done for example, in order to underline specific visual features in digital images which are important for a specific application and to create a database of consolidated expressions which are adapted for that application.
Fig. 6a and Fig. 6b show an example demonstrating the process described with reference to Fig. 5, in accordance with an embodiment of the invention. Assume Fig. 6a is a query image showing two people, a woman and a man, standing next to each other, and a user is interested in searching for other images showing similar faces. The values of the identifiers of the original image would be influenced by the additional elements in the image, such as the color of the background or the flag, and may bring forth search results which are different from the desired results. According to certain embodiments, and as illustrated by the altered image shown in Fig. 6b, image processing functions can be utilized in order to enhance the faces in the image and obfuscate the rest of the image. According to one embodiment, steps 530-560 can be performed on both the image in Fig. 6a and the image in Fig. 6b, and a comparison between the resulting identifiers of the two images can be made. Identifiers of the two images which are similar may be selected and used for constructing a new collection of identifiers representing the original image in this specific search. Adding to the new collection other identifiers from the original collection of identifiers of the original image, which represent general features of the original digital image, would direct the search to focus on images with similar faces but also with similar general visual features.
By processing the original images and producing the consolidated expression as described above, the system of the present invention supports a variety of applications which include decision making based on visual information and content, e.g. computer vision, robotic vision, medical analysis, etc. Prior art methods and systems for image comparison and retrieval often require intensive computational steps before the actual comparison is performed. Such systems, which are used, inter alia, for character recognition, face recognition, fingerprint recognition and the like, often include a learning step in which the system learns what to look for from prepared examples of similar images and additional information. Other systems are configured for searching for predefined and limited visual aspects in a digital image, for example, capturing motion in alert surveillance systems. According to certain embodiments, the system and method of the present invention can be utilized for a diversity of different applications, while performing similar processing regardless of the intended application. Each field of interest may have specific needs and relevant visual aspects. Therefore, certain transformations which are directed to the requirements of a specific application may be utilized and may help to improve the search results for a specific type of image. According to certain embodiments, the same pool of transformations can be executed for all applications, resulting in a collection of identifiers. As explained above, a user can manually select, or system 110 can automatically select, from the pool those identifiers which are most appropriate for a specific application of interest. Examples of applications include, inter alia, medical imaging applications such as X-ray or MRI, black and white photographs, sketches, graphs, satellite generated images, microscope generated images, engineering research and development in every field that uses visual data, weather forecasting, financial data visualization, surveillance, face recognition, etc.
For example, the system and method of the present invention may be utilized for analysis of captured video of any kind. According to this example, the footage of a surveillance camera can be continuously analyzed for detecting movement. As the time required for generating a collection of about 200 identifiers is, in some embodiments, about one tenth of a second, the system can sample a captured image from the video footage every short period of time (e.g. every half second) and compare it to a predefined reference image (a sketch of such periodic sampling is given below). In addition, according to some embodiments, for the purpose of analyzing surveillance camera footage, selecting a specific transformation would allow discerning between real movements and random changes in light.
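The following is a minimal sketch of the surveillance example: periodically sample a frame, build its consolidated expression and compare it against a reference expression. The `camera.capture()` call, the `process_image` and `similarity` helpers and the threshold are hypothetical stand-ins for the processing described earlier, not an actual API of the described system.

```python
# A minimal sketch: sample a frame every half second and flag frames whose
# consolidated expression differs enough from a reference expression.
import time

def monitor(camera, reference_expression, process_image, similarity,
            interval_seconds=0.5, threshold=0.8):
    """Yield frames that differ sufficiently from the reference expression."""
    while True:
        frame = camera.capture()               # assumed camera API
        expression = process_image(frame)      # identifiers -> consolidated expression
        if similarity(expression, reference_expression) < threshold:
            yield frame                        # possible movement detected
        time.sleep(interval_seconds)
```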
According to certain embodiments, the system and method of the present invention may combine identifiers together with conventional semantic tagging. As both types of image identification can be represented in a textual form, they can be utilized together in textual search engines, providing improved textual queries (a sketch of such a combined query is given below). According to this embodiment, the method and system of the present invention may be used together with conventional searching methods such as textual comparison of tags or other textual semantic descriptors, thereby utilizing information from multiple sources and providing a powerful image searching and comparison tool. According to certain embodiments, the system and method of the present invention can be utilized for identifying and analyzing the content of a given image. Tags or other types of more or less detailed descriptions of image content may be associated with different images, possibly stored in a database. These tags or other types of description can be accessed for obtaining information on a given digital image. Thus the system and methods of the present invention can be used to link visual content from any digital input device to meaningful information pertaining to the visual content. For instance, a user can capture an image of a certain geographical location and send it to system 110 for processing. According to some embodiments, the image comparing and retrieval module 130 may retrieve similar images together with information on the specific geographical location. In another example, involving the analysis of medical images such as MRI or X-ray, system database 140 may include a large number of images representing a variety of medical states, each image associated with a detailed description of the medical status inferred from the MRI image. An MRI image may be input into system 110 and similar MRI images may be retrieved together with the medical diagnosis associated with the stored images. According to certain embodiments, based on previous knowledge, different visual properties of the image can be mapped according to different medical states, and system 110 may be utilized to link visual aspects of a given medical image (such as an MRI image or X-ray) with the relevant medical knowledge and diagnosis.
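The following is a minimal sketch of combining the textual identifiers with conventional semantic tags in a single text query; the plain AND-joined syntax and the example tokens are illustrative assumptions rather than the syntax of any particular search engine.

```python
# A minimal sketch: merge identifier tokens and semantic tags into one
# textual query suitable for a text search engine.

def combined_query(identifiers, semantic_tags):
    """Merge identifier tokens and semantic tags into one textual query."""
    return " AND ".join(list(identifiers) + list(semantic_tags))

print(combined_query(["vyr1vg18", "vyr4md102"], ["boat", "sea"]))
# vyr1vg18 AND vyr4md102 AND boat AND sea
```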
It will be understood that, in accordance with a certain embodiment, the system according to the invention may be a suitably programmed computer.
Likewise, the invention contemplates a computer program being readable by a computer for executing the method of the invention. The invention further contemplates a machine-readable memory tangibly embodying a program of instructions executable by the machine for executing the method of the invention. While various embodiments have been shown and described, it will be understood that there is no intent to limit the invention by such disclosure, but rather, it is intended to cover all modifications and alternate constructions falling within the scope of the invention, as defined in the appended claims.

Claims

CLAIMS:
1. A method of generating image representation in a standardized format, the method comprising:
(a) obtaining data indicative of at least one digital image; (b) performing one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image;
(c) processing each of said at least one derivative image, including: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value to at least one standardized format representation; and
(d) joining standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
2. The method of claim 1 further comprising: comparing said consolidated standardized expression to one or more other consolidated standardized expressions, thereby enabling to determine a similarity degree between said at least one digital image corresponding to said consolidated expression and said one or more other digital images, corresponding, respectively, to said one or more other consolidated expressions.
3. The method of claim 2 wherein said one or more transformations include at least one predefined transformation, and wherein said consolidated standardized expression and said one or more other consolidated standardized expressions are generated by performing said at least one predefined transformation.
4. The method of claim 2 wherein said at least one digital image being a query image and wherein said one or more other digital images being a reference image, said reference image and said query image having common visual features, and wherein said comparing includes identifying said common visual features.
5. The method of claim 4 wherein said reference image is another digital image selected by a user, which is different from said query image.
6. The method of claim 4 wherein said reference image is obtained by processing said query image.
7. The method of claim 6 wherein said processing comprises selecting an area of said query image, giving rise to a reference image.
8. The method of claim 6 wherein said processing comprises selecting an object within said query image, giving rise to a reference image.
9. The method of claim 4 wherein said reference image is obtained by processing another digital image.
10. The method of claim 1 wherein said standardized format representation is a textual format.
11. The method of claim 10 wherein said textual format is comprised of alphanumeric characters.
12. The method of claim 1 wherein said consolidated standardized expression is stored as a text file.
13. The method of claim 2 wherein said consolidated standardized expressions are in a textual format, and wherein said comparison is performed by utilizing text comparison.
14. The method of claim 2 wherein said comparison is performed by utilizing a text search engine.
15. The method of claim 1 wherein said statistical functions are selected from a group consisting of at least a median, an average and standard deviation.
16. The method of claim 1 wherein said statistical functions include a median, an average and standard deviation.
17. The method of claim 1 wherein said statistical values are calculated from a portion of pixels of said at least one derivative image.
18. The method of claim 1 wherein said statistical values are calculated from all pixels of said at least one derivative image.
19. The method of claim 1 wherein said at least one derivative image is substantially visibly degraded in comparison with said digital image.
20. The method of claim 1 further comprising storing said consolidated standardized expression in a database.
21. The method of claim 1 wherein said standardized format representation are identifiers.
22. The method of claim 13 wherein said consolidated expression includes at least one other textual descriptor corresponding to said at least one digital image.
23. The method of claim 1 wherein each of said one or more transformations comprises a sequence of one or more statistical functions.
24. A system for generating images representation in a standardized format, comprising: a transformation module, being responsive to information indicative of at least one digital image and configured to perform one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; an identifier generating module coupled to said transformation module and configured to process each of said at least one derivative images, the processing comprising: applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value and converting said at least one statistical value to at least one standardized format representation; and a consolidating module coupled to said identifier generating module and configured to join standardized format representations corresponding to at least two derivative images into a standardized consolidated expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
25. The system of claim 24 further comprising a comparing and retrieval module configured to compare said consolidated standardized expression to one or more other consolidated standardized expressions, thereby enabling to determine a similarity degree between said at least one digital image corresponding to said consolidated expression and said one or more other digital images, corresponding, respectively, to said one or more other consolidated expressions.
26. The system of claim 25 wherein said transformation module is configured to perform at least one predefined transformation, and wherein said consolidated standardized expression and said one or more other consolidated expressions are generated by performing said at least one predefined transformation.
27. The system of claim 24 wherein said standardized format is a textual format.
28. The system of claim 27 wherein said textual format is comprised of alphanumeric characters.
29. The system of claim 25 wherein said consolidated standardized expressions are in a textual format, and wherein said comparing is performed by utilizing text comparison methods.
30. The system of claim 24 wherein said at least one digital image is obtainable from at least one of the following:
(a) a client;
(b) an image database;
(c) an image acquisition device; and
(d) a web crawler.
31. The system of claim 24 wherein said one or more transformations comprise a sequence of one or more statistical functions.
32. The system of claim 29 wherein said consolidating module joins into said consolidated expression at least one other textual descriptor corresponding to said at least one digital image.
33. A method of generating image representation in a standardized format, the method comprising:
(a) providing data indicative of at least one query image;
(b) providing data indicative of a reference digital image; (c) performing one or more transformations on said at least one query image and on said reference image, resulting in at least one derivative image of said at least one query image and at least one derivative image of said reference image;
(d) processing each of said at least one derivative image of said at least one query image, the processing comprising: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value, to at least one standardized format representation, thereby generating a first group of identifiers representing each of said at least one derivative image of said at least one query image; (e) processing each of said at least one derivative image of said reference image, the processing comprising: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value, to at least one standardized format representation, thereby generating a second group of identifiers representing each of said at least one derivative image of said reference image;
(f) comparing said first and second groups of identifiers and selecting identifiers according to a predefined similarity degree; and (g) creating a new group of identifiers comprising the selected identifiers, said new group of identifiers being a consolidated standardized expression representing common visual features to said at least one query image and said reference image, and wherein said consolidated standardized expression is accessible by a computer application for processing said at least one query image.
34. The method of claim 33 further comprising the steps of:
(h) selecting additional identifiers from said first group of identifiers, wherein said additional identifiers were not selected during step (f); and
(i) adding said additional identifiers to said new group of identifiers.
35. The method of claim 34 wherein said additional identifiers represent general visual features of said at least one query image.
36. The method of claim 33 wherein said reference image is another digital image selected by a user, which is different from said query image.
37. The method of claim 33 wherein said reference image is obtained by processing said at least one query image.
38. The method of claim 37 wherein said processing comprises selecting a portion of said query image, giving rise to a reference image.
39. The method of claim 37 wherein said processing comprises selecting an object within said query image, giving rise to a reference image.
40. The method of claim 33 wherein said reference image is obtained by processing another digital image.
41. The method of claim 33 wherein said standardized format is a textual format and is stored as a text file.
42. The method of claim 41 wherein said textual format is comprised of alphanumeric characters.
43. The method of claim 33 wherein said consolidated standardized expressions are in a textual format, and wherein said comparison is performed by utilizing text comparison.
44. Textual representation of digital images obtained by the method of claim 1.
45. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps of generating image representation in a standardized format, the method comprising:
(a) obtaining data indicative of at least one digital image;
(b) performing one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image;
(c) processing each of said at least one derivative image, including: i) applying one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; ii) converting said at least one statistical value to at least one standardized format representation; and
(d) joining standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
46. A computer program product comprising a computer useable medium having computer readable program code embodied therein of generating image representation in a standardized format, the computer program product comprising: computer readable program code for causing the computer to obtain data indicative of at least one digital image; computer readable program code for causing the computer to perform one or more transformations on said at least one digital image, resulting in at least one derivative image of said at least one digital image; computer readable program code for causing the computer to process each of said at least one derivative image, including: computer readable program code for causing the computer to apply one or more statistical functions on a plurality of pixels, thereby obtaining at least one statistical value; computer readable program code for causing the computer to convert said at least one statistical value to at least one standardized format representation; and computer readable program code for causing the computer to join standardized format representations that correspond to at least two derivative images into a single consolidated standardized expression, wherein said standardized expression being a representation of said at least one digital image and wherein said standardized expression is accessible by a computer application for processing said at least one digital image.
PCT/IL2008/001582 2007-12-06 2008-12-04 System and method for representation and comparison of digital images WO2009072128A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US99291207P 2007-12-06 2007-12-06
US60/992,912 2007-12-06

Publications (2)

Publication Number Publication Date
WO2009072128A2 true WO2009072128A2 (en) 2009-06-11
WO2009072128A3 WO2009072128A3 (en) 2010-03-11

Family

ID=40718300

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IL2008/001582 WO2009072128A2 (en) 2007-12-06 2008-12-04 System and method for representation and comparison of digital images

Country Status (1)

Country Link
WO (1) WO2009072128A2 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060020597A1 (en) * 2003-11-26 2006-01-26 Yesvideo, Inc. Use of image similarity in summarizing a collection of visual images
US7010745B1 (en) * 1999-07-01 2006-03-07 Sharp Kabushiki Kaisha Border eliminating device, border eliminating method, and authoring device
US20060143176A1 (en) * 2002-04-15 2006-06-29 International Business Machines Corporation System and method for measuring image similarity based on semantic meaning

Also Published As

Publication number Publication date
WO2009072128A3 (en) 2010-03-11

Similar Documents

Publication Publication Date Title
US10621755B1 (en) Image file compression using dummy data for non-salient portions of images
Lux et al. Visual information retrieval using java and lire
US7043474B2 (en) System and method for measuring image similarity based on semantic meaning
CN102576372B (en) Content-based image search
US8718383B2 (en) Image and website filter using image comparison
US20170024384A1 (en) System and method for analyzing and searching imagery
Boato et al. Exploiting visual saliency for increasing diversity of image retrieval results
Kalaiarasi et al. Clustering of near duplicate images using bundled features
Adjetey et al. Content-based image retrieval using Tesseract OCR engine and levenshtein algorithm
Bhoir et al. A review on recent advances in content-based image retrieval used in image search engine
Koskela Content-based image retrieval with self-organizing maps
Seth et al. A review on content based image retrieval
Ragatha et al. Image query based search engine using image content retrieval
Gupta et al. A Framework for Semantic based Image Retrieval from Cyberspace by mapping low level features with high level semantics
Morsillo et al. Mining the web for visual concepts
Mumar Image retrieval using SURF features
Singh et al. Semantics Based Image Retrieval from Cyberspace-A Review Study.
Waykar et al. Multimodal features and probability extended nearest neighbor classification for content-based lecture video retrieval
WO2009072128A2 (en) System and method for representation and comparison of digital images
Koyuncu et al. An analysis of content-based image retrieval
Roullet et al. Transfer learning methods for extracting, classifying and searching large collections of historical images and their captions
Badghaiya et al. Image classification using tag and segmentation based retrieval
Farooque Image indexing and retrieval
Philip et al. Development of an image retrieval model for biomedical image databases
Dixit An analysis of content-based image retrieval

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08857719

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase in:

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC, EPO FORM 1205A, 19/10/2010

122 Ep: pct application non-entry in european phase

Ref document number: 08857719

Country of ref document: EP

Kind code of ref document: A2