JP2013246544A - Image search device and image search method - Google Patents

Image search device and image search method

Info

Publication number
JP2013246544A
Authority
JP
Japan
Prior art keywords
image
search
document
unit
character string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2012118320A
Other languages
Japanese (ja)
Other versions
JP5868262B2 (en)
Inventor
Naoto Akira
直人 秋良
Atsushi Hiroike
敦 廣池
Original Assignee
Hitachi Ltd
株式会社日立製作所
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to JP2012118320A
Publication of JP2013246544A
Application granted
Publication of JP5868262B2
Application status: Active
Anticipated expiration


Abstract

To provide a search technique with high accuracy and few search omissions when a keyword is specified to search for an image in a document.
An image search apparatus according to the present invention searches for images similar to an image in document data that includes a search character string, and determines whether the image represents the content of the search character string based on the information amount of the search character string in the document data containing the similar images and the information amount of the search character string in the entire document data.
[Selected Figure] FIG. 5

Description

  The present invention relates to a technique for searching for an image.

  The proportion of documents containing images is increasing due to more capable document creation software and growing storage capacity. In particular, presentation materials and corporate documents often devote more area to images than to text in order to support understanding in a short time. When searching for these files, means such as directly browsing the folder in which the files are stored, or performing a keyword search on the text in the documents, are used.

  However, in searches that rely only on text information, it is difficult to exploit the relationship between images and keywords when searching for documents that contain images related to a search string, or when searching for images that depict topics related to a specified search string.

  In view of the above problems, Patent Document 1 below proposes a method that identifies the paragraph on which an image depends based on the caption number and caption character string of the image in a document, and extracts words with a high degree of distinctiveness from that paragraph as keywords for the image.

JP 2010-205060 A

  In the technique described in Patent Document 1, when an unexpected caption is attached to the image, or when the paragraph on which the image depends contains no word indicating the content of the image, words not intended by the user end up being associated with the image as keywords.

  In addition, since there are countless combinations of words that can describe the content of an image, automatically acquiring search keywords from within a document may produce a word set that does not match the user's viewpoint. For example, an incorrect keyword, or a keyword composed of words that are correct but rarely entered by users at search time, may be associated with an image. Such keywords are of little use for searching images and documents.

  The present invention has been made in view of the above problems, and its object is to provide a search technique with high accuracy and few search omissions when a keyword is specified to search for an image in a document.

  An image search apparatus according to the present invention searches for images similar to an image in document data that includes a search character string, and determines whether the image represents the content of the search character string based on the information amount of the search character string in the document data containing the similar images and the information amount of the search character string in the entire document data.

  According to the image search device of the present invention, search omissions can be reduced by searching for an image similar to an image in document data including a search character string. Moreover, since it is determined whether the image represents the content of the search character string based on the information amount of the search character string, the search accuracy can be improved.

  Problems, configurations, and effects other than those described above will become apparent from the following description of embodiments.

FIG. 1 is a configuration diagram of an image search apparatus 100 according to Embodiment 1.
FIG. 2 shows the structure and a data example of the image DB 113.
FIG. 3 shows how the feature vector stored in the image feature field 1132 is calculated.
FIG. 4 shows a configuration example of the search character string input screen that the screen display unit 119 displays on the display unit 104.
FIG. 5 is a flowchart explaining the operation of the image search apparatus 100.
FIG. 6 shows an example of a setting screen for the user to specify the coefficient α in step S506.
FIG. 7 is a flowchart illustrating the operation of the image search apparatus 100 according to Embodiment 2.
FIG. 8 is a configuration diagram of the image search apparatus 100 according to Embodiment 3.
FIG. 9 shows the structure and a data example of the image metadata DB 820.
FIG. 10 is a flowchart illustrating processing in which the image search apparatus 100 according to Embodiment 3 registers keywords in the image metadata DB 820.
FIG. 11 is a flowchart explaining the operation of the image search apparatus 100 according to Embodiment 4.
FIG. 12 is a flowchart explaining the operation of the image search apparatus 100 according to Embodiment 5.
FIG. 13 is a flowchart explaining the operation of the image search apparatus 100 according to Embodiment 6.

<Embodiment 1>
FIG. 1 is a configuration diagram of the image search apparatus 100 according to Embodiment 1 of the present invention. The image search apparatus 100 is a device that searches for images representing the content of a search character string specified by the user, and comprises a CPU (Central Processing Unit) 101, a main memory 102, an input unit 103, a display unit 104, a communication unit 105, and a storage unit 110.

  The CPU 101 provides the function of the image search device 100 by executing a program stored in the storage unit 110. The main memory 102 temporarily stores data used by the CPU 101. The input unit 103 receives an operation input from the user and outputs it to the CPU 101. The display unit 104 is a screen display device such as a display. The communication unit 105 is an interface for communicating with other devices.

  The storage unit 110 stores an OS (Operating System) 111, a document DB (DataBase) 112, an image DB 113, a document analysis unit 114, an image estimation unit 115, a document search unit 116, an image search unit 117, a morphological analysis unit 118, and a screen display unit 119 of the image search apparatus 100.

  The document DB 112 is a database that stores document data. The document data itself need not necessarily be stored as long as the document search unit 116 to be described later can search for a document or acquire the word frequency of the document. The image DB 113 is a database that stores image data. Details of the image DB 113 will be described later with reference to FIG.

  The document analysis unit 114 analyzes document data and extracts the text and images it contains. The document analysis unit 114 can be built using libraries provided by document creation software vendors, open-source libraries, and the like. For example, Microsoft Office (registered trademark) document data can be processed with the SDK (Software Development Kit) provided by Microsoft, and PDF files with the open-source iText. If the document search unit 116 described later has a document analysis function, that function may be used instead.

  The image estimation unit 115 determines whether each image included in the search result document is an image indicating the content of the search character string designated by the user. A specific determination method will be described later with reference to FIG.

  When document data is newly registered in the document DB 112, the document search unit 116 performs morphological analysis, via the morphological analysis unit 118, on the text extracted from the document data by the document analysis unit 114, and registers the resulting words in the document DB 112 as index information for searching. It also searches the document DB 112 for document data containing the search character string designated by the user. The document search unit 116 can be built using, for example, Lucene (http://lucene.apache.org/), full-text search software released as open source, but is not limited to it.

  The image search unit 117 searches for images similar to a designated image. Specifically, it calculates the distance between the feature vectors of the images and judges images at a small distance to have a high degree of similarity. The image search unit 117 outputs image search results, for example, in descending order of similarity. At this time, the search results are extracted and sorted as appropriate for the purpose, for example the top N results or images whose similarity is X or more.
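As a sketch of the behavior described above (not the patent's actual implementation), a nearest-neighbor search over feature vectors might look like the following; the toy vectors, image IDs, and the choice of Euclidean distance are illustrative assumptions:

```python
import math

def euclidean(a, b):
    # Squared distance would preserve the ranking; the root is taken for clarity.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similar_images(query_vec, db, top_n=2):
    """Return the IDs of the top-N images closest to query_vec.

    db maps image IDs to feature vectors (hypothetical data layout).
    """
    ranked = sorted(db, key=lambda img_id: euclidean(query_vec, db[img_id]))
    return ranked[:top_n]

# Toy feature database (illustrative values only).
db = {
    "10000001": [0.9, 0.1, 0.0],
    "10000002": [0.8, 0.2, 0.1],
    "10000003": [0.0, 0.9, 0.9],
}
print(similar_images([0.88, 0.12, 0.02], db))
```

Sorting all candidates and slicing the top N mirrors the "top N search results" extraction mentioned above; a similarity threshold X could be applied by filtering on the distance instead.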

  The morphological analysis unit 118 performs morphological analysis on a character string and divides it into words. For example, the open-source Japanese morphological analysis system Sen (http://www.mlab.im.dendai.ac.jp/~yamada/ir/MorphologicalAnalyzer/Sen.html) can be used. If the document search unit 116 has a morphological analysis function, that function may be used instead. For languages such as English that already delimit words, morphological analysis is unnecessary; however, since words with the same meaning may appear as different character strings due to inflection, stemming processing, which is generally well known in this field, may be performed to unify word stems.

  The screen display unit 119 outputs the processing result of each functional unit on the display unit 104. Instead of the screen output, data indicating the processing result may be output, and a device or the like that has received the data may perform the screen output. Other output formats may be employed.

  The document analysis unit 114, the image estimation unit 115, the document search unit 116, the image search unit 117, the morphological analysis unit 118, and the screen display unit 119 may be configured using hardware such as circuit devices that realize these functions, or by having the CPU 101 execute programs that implement them. In the following, the latter implementation is assumed.

  The document DB 112 and the image DB 113 can be configured by a data file that records records stored therein and a storage area on the storage unit 110. These DBs may be configured together with a function unit that controls data reading and writing with respect to the database.

  FIG. 2 is a diagram illustrating a configuration and data example of the image DB 113. The image DB 113 includes an image ID field 1131, an image feature amount field 1132, an image field 1133, a document ID field 1134, a page field 1135, and a coordinate field 1136.

  The image ID field 1131 holds an ID for identifying an image included in the document. The image feature amount field 1132 holds a feature amount vector obtained by quantifying the apparent feature of the image. The image field 1133 holds binary image data. The document ID field 1134 holds the ID of the document that includes the image identified by the image ID field 1131. The page field 1135 holds the page number where the image identified by the image ID field 1131 is arranged in the document identified by the document ID field 1134. The coordinate field 1136 holds coordinates indicating the arrangement position of the image in the document identified by the document ID field 1134.

  For example, the image with image ID = 10000001 is the image in the rectangular area bounded by the upper-left coordinates (35, 10) and the lower-right coordinates (60, 35) on the first page of the document with document ID = 000001. Here, the coordinate field 1136 uses coordinates in which the X and Y directions are each normalized to a maximum value of 100, but other units, such as pixel counts, may be used.
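The record layout above can be sketched as a simple structure; the field names follow the description, the values mirror the data example, and the binary image payload (image field 1133) is omitted for brevity:

```python
from dataclasses import dataclass

@dataclass
class ImageRecord:
    image_id: str      # image ID field 1131
    feature: list      # image feature amount field 1132 (feature vector)
    document_id: str   # document ID field 1134
    page: int          # page field 1135
    coords: tuple      # coordinate field 1136: (x1, y1, x2, y2), normalized to 0-100

# Record corresponding to the example in the text (feature values are made up).
rec = ImageRecord("10000001", [0.12, 0.55, 0.08], "000001", 1, (35, 10, 60, 35))
print(rec.coords)
```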

  When document data is registered in the document DB 112, the document analysis unit 114 extracts the images from the document data and registers them in the image DB 113. Registration may be omitted for images with no substantial content, for example images at or below a predetermined size. The image feature field 1132 may be calculated by an appropriate functional unit such as the document analysis unit 114 or the image search unit 117.

  FIG. 3 shows how the feature vector stored in the image feature field 1132 is calculated. The feature vector can be configured as a multidimensional vector representing the visual features of the image. For example, the pixel values of the image can be used to generate a multidimensional vector indicating the distribution of edge patterns in the image, which is then dimensionally compressed using principal component analysis or the like to obtain the feature vector.

  For the distribution of edge patterns, a plurality of characteristic edge patterns are defined in advance as in the example of FIG. 3; the image is divided into regions, and the number of occurrences of each edge pattern in each region is counted to produce a multidimensional vector, which is then dimensionally compressed using principal component analysis.
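A minimal sketch of the edge-pattern counting step described above; the edge detection itself and the PCA compression are abstracted away, and the 2×2 grid with two edge labels ("h" for horizontal, "v" for vertical) is an illustrative assumption:

```python
def edge_histogram(edge_map, grid=2, patterns=("h", "v")):
    """Count occurrences of each edge pattern per grid cell.

    edge_map is a 2-D list of edge-pattern labels ("h", "v", or None),
    standing in for the result of edge detection on the image.
    Returns a flat multidimensional vector: one count per (cell, pattern).
    """
    rows, cols = len(edge_map), len(edge_map[0])
    vec = []
    for gy in range(grid):
        for gx in range(grid):
            for p in patterns:
                count = sum(
                    1
                    for y in range(gy * rows // grid, (gy + 1) * rows // grid)
                    for x in range(gx * cols // grid, (gx + 1) * cols // grid)
                    if edge_map[y][x] == p
                )
                vec.append(count)
    return vec

# 4x4 toy edge map divided into four 2x2 regions.
edge_map = [
    ["h", "h", None, "v"],
    [None, "h", "v", "v"],
    ["v", None, "h", None],
    ["v", "v", None, "h"],
]
print(edge_histogram(edge_map))
```

In the patent's scheme this raw count vector would then be dimensionally compressed (e.g. by principal component analysis) before being stored in the image feature field 1132.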

  Here, a feature vector has been given as an example of a feature quantity representing the visual features of an image, but the feature quantity is not limited to this; any information that can express an equivalent feature may be used. For example, other generally known feature quantities, such as the edge histogram feature defined in MPEG-7 or SIFT (Scale-Invariant Feature Transform) features, may be used. Likewise, the distance between vectors may be calculated by any method that can measure the similarity between vectors, such as the squared distance.

  FIG. 4 is a diagram illustrating a configuration example of a search character string input screen displayed on the display unit 104 by the screen display unit 119. The user inputs a search character string in the search character string input field 1041 and presses the search button 1042. The image search apparatus 100 receives the search character string in accordance with the flowchart described in FIG. 5 below, and searches for an image representing the content of the search character string.

  FIG. 5 is a flowchart for explaining the operation of the image search apparatus 100. Hereinafter, each step of FIG. 5 will be described.

(FIG. 5: Step S501)
The screen display unit 119 causes the display unit 104 to display the input screen described with reference to FIG. The user operates the input unit 103 to input a search character string. The image search apparatus 100 receives the search character string and performs the following processing. A single word may be input as a search character string, or a combination of a plurality of words may be input.

(FIG. 5: Step S502)
The document search unit 116 acquires from the document DB 112 a set of documents including the search character string received in step S501. When a plurality of words are input as the search character string, a document including all of them may be searched, or a document including at least one of them may be searched. In the first embodiment, a document including all words included in the search character string is searched.

(FIG. 5: Step S503)
The document analysis unit 114 acquires, from the image DB 113, the images included in the search result documents obtained in step S502. Since the images acquired in this step are all the images in all documents containing the search character string, they include many images that do not represent the content of the search character string.

(FIG. 5: Step S504)
The image search unit 117 searches for a set of documents including similar images for each image acquired in step S503. Specifically, a similarity search may be performed using the image feature field 1132 of the image DB 113, and a document ID field 1134 corresponding to the image obtained as a result may be acquired.

(FIG. 5: Step S505: Part 1)
The image estimation unit 115 acquires the appearance probability P (x) of the search character string for the document set acquired in step S504. For example, if the total number of documents including similar images acquired in step S504 is 1000, and 50 of them include search character strings, P (x) = 50/1000 = 0.05.

(FIG. 5: Step S505: Part 2)
The image estimation unit 115 refers to the document DB 112 and acquires the appearance probability Q(x) of the search character string in the entire document set. For example, if 500,000 documents are registered in the document DB 112 and 200 of them contain the search character string, Q(x) = 200/500000 = 0.0004. If the number of documents containing the search character string has already been obtained in step S502, that value may be used.

(FIG. 5: Step S506)
The image estimation unit 115 uses the appearance probabilities P(x) and Q(x) calculated in step S505 to determine whether each image acquired in step S504 is an image showing the content of the search character string. The premise is that if an image shows the content of the search character string, the search character string should appear in the documents acquired in step S504 with a higher probability than in the entire document set. Among the images acquired in step S503, those satisfying P(x) > α × Q(x) (α > 1) are determined to be images showing the content of the search character string.

(FIG. 5: Step S506: Supplement 1)
The coefficient α in the conditional expression prevents the expression from being satisfied by chance when the appearance probabilities P(x) and Q(x) are nearly equal. The value of α is determined in advance, for example 1.5. If α is small, more images are associated with the search character string, but erroneous determinations also tend to increase; if α is large, the opposite tendency holds.
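The decision rule of step S506 can be sketched as follows; the document counts are illustrative and mirror the numeric examples given for step S505:

```python
def represents_content(n_similar_docs, n_similar_with_term,
                       n_all_docs, n_all_with_term, alpha=1.5):
    """Return True if the image is judged to show the search string's content.

    P(x): appearance probability of the search string among documents
          containing similar images (step S505, part 1).
    Q(x): appearance probability of the search string over the whole
          document DB (step S505, part 2).
    The image qualifies when P(x) > alpha * Q(x), where alpha > 1 guards
    against P(x) and Q(x) being nearly equal by chance.
    """
    p = n_similar_with_term / n_similar_docs
    q = n_all_with_term / n_all_docs
    return p > alpha * q

# Figures from the description: P(x) = 50/1000, Q(x) = 200/500000.
print(represents_content(1000, 50, 500000, 200))
```

With these numbers P(x) = 0.05 far exceeds α × Q(x) = 0.0006, so the image would be accepted as showing the content of the search character string.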

(FIG. 5: Step S506: Supplement 2)
In steps S505 to S506, an example using the appearance frequency of the search character string in documents was described, but the present invention is not limited to this: steps S505 to S506 can be performed with any other information measure that allows comparing the information amount of the search character string in the entire document set with that in the documents, found in step S504, that contain the similar images. For example, the KL (Kullback-Leibler) information amount may be used.
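As one way the KL information mentioned above could be applied (an assumption on our part, since the patent does not spell out the formula), the two appearance probabilities can be treated as parameters of Bernoulli distributions over "document contains the search string", and their divergence computed:

```python
import math

def bernoulli_kl(p, q):
    """KL divergence D(P || Q) between Bernoulli(p) and Bernoulli(q)."""
    return (p * math.log(p / q)
            + (1 - p) * math.log((1 - p) / (1 - q)))

# With the example probabilities P(x) = 0.05 and Q(x) = 0.0004, a large
# divergence indicates that the documents containing similar images carry
# much more information about the search string than the corpus at large.
print(bernoulli_kl(0.05, 0.0004))
```

A threshold on this divergence could then play the role that the coefficient α plays in the ratio test of step S506.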

(FIG. 5: Step S506: Supplement 3)
The value of α may be determined in advance, or may be set as appropriate by the user via an interface as shown in FIG.

(FIG. 5: Step S507)
The image estimation unit 115 determines an image satisfying the conditional expression described in step S506 among the images acquired in step S503 as a search result for the search character string designated by the user. In addition to this search result, the similar image obtained in step S504 can be further added to the search result. This is because the appearance of the similar image is considered to be similar to the image included in the search result.

(FIG. 5: Steps S508 to S509)
The screen display unit 119 outputs the images determined as search results in step S507 to the storage unit 110 or the display unit 104 as images showing the content of the search character string (S508). If there is a next search instruction, the processing of steps S501 to S508 is repeated until an end instruction is given (S509).

  FIG. 6 is a diagram showing an example of a setting screen for the user to specify the coefficient α in step S506. The screen display unit 119 displays a setting screen as shown in FIG. The user operates the coefficient setting unit 1043 to set an optimal value of α according to the application, and operates the setting button 1044. The storage unit 110 stores the value, and the image estimation unit 115 uses the value in step S506.

<Embodiment 1: Summary>
As described above, the image search apparatus 100 according to Embodiment 1 determines, based on the information amount P(x) of the search character string in the document data containing similar images and the information amount Q(x) of the search character string in the entire document data, whether an image found by the document search unit 116 or the image search unit 117 represents the content of the search character string. Thereby, even when the search character string and the image are not directly associated within a document, images representing the content of the search character string can be found with high accuracy.

  Further, with the image search apparatus 100 according to Embodiment 1, when a high-quality search character string carefully composed by the user is input, images showing its content can be appropriately obtained. Compared with acquiring a search character string from within a document, the words that a person actually wants to search with are used as the search character string, so images can be retrieved using keywords that are easy to search with.

  In the first embodiment, the image retrieval apparatus 100 retrieves an image, but a document obtained in the process can be presented as a retrieval result, or both an image and a document can be presented. The same applies to the following embodiments.

<Embodiment 2>
In the second embodiment of the present invention, an operation example will be described in which images are classified into a plurality of groups by clustering, and the same processing as that of the first embodiment is performed on the clustered image set. Clustering aims to speed up the overall processing. Since the configuration of the image search apparatus 100 is the same as that of the first embodiment, the following description will focus on the operation related to clustering.

  In Embodiment 2, the image search unit 117 has a function of clustering image sets according to their similarity. Since the other functions are substantially the same as in Embodiment 1, the specifics are described with reference to FIG. 7.

  FIG. 7 is a flowchart for explaining the operation of the image search apparatus 100 according to the second embodiment. Hereinafter, each step of FIG. 7 will be described.

(FIG. 7: Steps S701 to S703)
These steps are the same as steps S501 to S503 described in FIG. 5 of the first embodiment.

(FIG. 7: Step S704)
The image search unit 117 classifies the images obtained in step S703 into N groups using the image feature amount. N can be determined according to the number of images to be classified. For example, if N = number of images / 20, when the number of images is 2000, N = 100. As a clustering method, any method such as a K-means clustering method or an ISODATA clustering method may be used.
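A compact sketch of the grouping in step S704, using a plain K-means loop over feature vectors (the toy vectors, the iteration count, and the naive centroid initialization are illustrative; as the text notes, any clustering method would do):

```python
def kmeans(vectors, n_clusters, iters=10):
    """Cluster feature vectors into n_clusters groups (naive K-means).

    Returns (labels, centroids): for each vector the index of its cluster,
    and the final centroids. Initial centroids are simply the first
    n_clusters vectors.
    """
    centroids = [list(v) for v in vectors[:n_clusters]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            labels[i] = min(
                range(n_clusters),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: centroid = mean of member vectors. This mean is also
        # a natural "representative value" of the cluster, as used for the
        # similarity search in step S706.
        for c in range(n_clusters):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]]
labels, reps = kmeans(vectors, 2)
print(labels)
```

With N = number of images / 20 as in the example above, 2000 image vectors would be passed in with n_clusters=100.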

(FIG. 7: Step S705)
The image estimation unit 115 determines an image to be processed after step S706 from among the images clustered in step S704. For example, it is conceivable to determine the processing target based on the following criteria.

(FIG. 7: Step S705: Criteria Example for Determining Processing Object 1)
In Embodiment 1, the similarity search in step S504 is performed for every image obtained in step S503, so the number of similar images obtained as a result tends to be large. In Embodiment 2, therefore, the similarity search in step S706 is performed on the representative value of each cluster obtained in step S704. The representative value of a cluster may be, for example, the mean feature vector of the images belonging to that cluster. This eliminates the need to perform a similarity search for each individual image in a cluster, reducing the processing load.

(FIG. 7: Step S705: Criteria Example 2 for Determining Processing Target)
Images that users are likely to use frequently tend to have many near-duplicates differing only in size or minor retouching. Therefore, the image estimation unit 115 performs the similarity search in step S706 on images belonging to clusters that gather many similar images and whose feature variance within the cluster is small. Specifically, clusters containing at least a predetermined number of images and whose feature variance is at or below a predetermined threshold are selected, and the similarity search in step S706 is performed on the representative values of those clusters. As a result, the processing load can be reduced even further than with the first criterion example.

(FIG. 7: Steps S706 to S711)
The image search unit 117 performs a similarity search for the search targets determined in step S705 (S706). The subsequent steps are the same as steps S505 to S509 described with reference to FIG. 5 of Embodiment 1, with the difference that whereas step S506 determines whether an image represents the content of the search character string, step S708 determines whether the cluster representative value represents it. In step S711, the images included in the cluster are output as search results.

<Embodiment 2: Summary>
As described above, the image search apparatus 100 according to Embodiment 2 clusters the images included in the documents found by the document search unit 116 and performs the similarity search on the representative value of each cluster. This reduces the number of targets for which it must be determined whether an image represents the content of the search character string, lightening the load and speeding up processing.

  Further, according to the image search apparatus 100 according to the second embodiment, it is possible to further speed up the processing by excluding clusters having a large variance of feature amounts in the clusters from the targets of the similar search. In addition, since a cluster with a large distribution of feature amounts is considered to include many images that are far from the contents of the search character string, there is a high possibility of causing noise in search processing. By excluding such clusters from the search target, an effect of improving the search accuracy can be expected.

<Embodiment 3>
In Embodiments 1 and 2, an image obtained as a search result is considered to represent the content of the search character string. Viewed from another angle, this means that a correspondence between the search character string and the image has been identified. That is, the inventors considered that the search character string can be used in various forms as metadata of the images obtained as search results. Therefore, in Embodiment 3 of the present invention, a configuration example is described in which, when images corresponding to a designated search character string are obtained as search results, the search character string is stored as metadata of those images.

  FIG. 8 is a configuration diagram of the image search apparatus 100 according to the third embodiment. In the third embodiment, the storage unit 110 newly stores an image metadata DB 820 and a metadata management unit 821 in addition to the configurations described in the first and second embodiments. Other configurations are the same as those in the first and second embodiments.

  The image metadata DB 820 is a database that holds the correspondence between keywords and images. The keyword is one or more words included in the search character string, and can be used as metadata of an image associated therewith. Details of the image metadata DB 820 will be described later with reference to FIG. The metadata management unit 821 manages records held in the image metadata DB 820.

  The metadata management unit 821 can be configured by using hardware such as a circuit device that implements the function, or can be configured by the CPU 101 executing a program that implements the function. In the following, it is assumed that the latter is implemented.

  The image metadata DB 820 can be configured by a data file that records records stored in the DB and a storage area on the storage unit 110. The DB may be configured together with a function unit that controls data reading and writing with respect to the database.

  FIG. 9 is a diagram illustrating a configuration of the image metadata DB 820 and a data example. The image metadata DB 820 has a keyword field 8201 and an image ID field 8202. The keyword field 8201 holds one or more words included in the search character string. The image ID field 8202 holds the ID of the image obtained as a search result for each keyword and the number of times obtained as the search result.

  When the search character string includes a plurality of words, the keyword field 8201 may hold each word as a separate keyword, or may hold the combination of words as a single keyword; the same applies to the frequency value. In general, a combination of words is considered to carry particular significance, so it is desirable to store the word combination as-is in the keyword field 8201.

  In the data example shown in FIG. 9, the keyword "image search landscape" has been associated as a search result 203 times with image 00000002 and 198 times with image 00000003. Even for the same keyword, the association frequency may differ from image to image, for example because an image was registered later, or because the keyword was assigned indirectly when it was associated with an image obtained as a similar-image search result.
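The keyword-to-image bookkeeping described above can be sketched as a nested counter; the class name is hypothetical, while the keyword string and image IDs mirror the data example:

```python
from collections import defaultdict

class ImageMetadataDB:
    """In-memory stand-in for the image metadata DB 820.

    Maps a keyword (keyword field 8201) to the image IDs it has been
    associated with, together with the number of times each association
    occurred (image ID field 8202).
    """
    def __init__(self):
        self._table = defaultdict(lambda: defaultdict(int))

    def register(self, keyword, image_id):
        # Called each time an image is obtained as a search result for
        # the keyword (cf. step S1008).
        self._table[keyword][image_id] += 1

    def frequency(self, keyword, image_id):
        return self._table[keyword][image_id]

db = ImageMetadataDB()
for _ in range(3):
    db.register("image search landscape", "00000002")
db.register("image search landscape", "00000003")
print(db.frequency("image search landscape", "00000002"))
```

Accumulating the counts this way supports the later observation that the most frequent keyword for an image is the one that best characterizes it.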

  FIG. 10 is a flowchart for describing processing in which the image search apparatus 100 registers keywords in the image metadata DB 820 in the third embodiment. Hereinafter, each step of FIG. 10 will be described.

(FIG. 10: Steps S1001 to S1007, S1009)
These steps are the same as steps S501 to S507 and S509 described in FIG. 5 of the first embodiment.

(FIG. 10: Step S1008)
The metadata management unit 821 stores, in the image metadata DB 820, the correspondence between the image and the search character string determined by the processing of steps S1001 to S1007. Step S508 may be performed together with this step, or, when this flowchart is executed independently of an image search, this step alone may be performed.

<Embodiment 3: Summary>
As described above, the image search apparatus 100 according to the third embodiment stores, in the image metadata DB 820, the correspondence between the search character string and the images obtained as a result of the image search. Thereby, it is possible not only to search for images representing the contents of the search character string, but also to assign meaning to those images using the search character string and store it as metadata. This metadata can be used in subsequent searches or repurposed for various other uses.

  Further, the image search apparatus 100 according to the third embodiment can automatically generate and accumulate image metadata of high utility value each time an image search is performed with a search character string. Moreover, by accumulating the frequency with which each search character string is associated with an image, it becomes easy to grasp which search character string best characterizes each image.
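Grasping which search string best characterizes an image amounts to inverting the keyword-to-image records and taking the highest-frequency keyword per image. A minimal sketch follows; the function and the "mountain photo" entry are illustrative additions (only the FIG. 9 frequencies 203 and 198 come from the source):

```python
def best_keyword_per_image(records):
    """Given keyword -> {image_id: frequency} records, return the
    keyword that most frequently characterizes each image."""
    per_image = {}
    for keyword, images in records.items():
        for image_id, freq in images.items():
            # Keep the keyword with the highest association frequency.
            if image_id not in per_image or freq > per_image[image_id][1]:
                per_image[image_id] = (keyword, freq)
    return {img: kw for img, (kw, _) in per_image.items()}

records = {
    "image search landscape": {"00000002": 203, "00000003": 198},
    "mountain photo": {"00000003": 250},
}
print(best_keyword_per_image(records))
# {'00000002': 'image search landscape', '00000003': 'mountain photo'}
```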

<Embodiment 4>
In the fourth embodiment of the present invention, an operation example that obtains more search results than the operation examples described in the first to third embodiments will be described. This is realized by additionally searching for images similar to the image group obtained as a search result, thereby expanding the search results. Since the configuration of the image search apparatus 100 is the same as that of the first to third embodiments, the following description focuses on the operation of expanding the search results.

  FIG. 11 is a flowchart for explaining the operation of the image search apparatus 100 according to the fourth embodiment. Hereinafter, each step of FIG. 11 will be described.

(FIG. 11: Steps S1101 to S1107, S1110)
These steps are the same as steps S501 to S507 and S509 described in FIG. 5 of the first embodiment.

(FIG. 11: Step S1108)
The image search unit 117 further searches for an image similar to each image determined as the search result in step S1107. However, the search target is limited to images that are not set as search targets in step S1104. This is because re-searching for images that have already been searched is duplicate processing.
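The restriction in this step can be sketched as a filter over the candidates returned by the similarity search. This is a toy model: `find_similar` merely stands in for the similarity search of the image search unit 117 and is not an API from the patent.

```python
def expand_results(first_pass_results, already_searched, find_similar):
    """Step S1108 sketch: search for images similar to each first-pass
    result, but skip candidates that were already search targets in
    step S1104, since re-searching them would be duplicate processing."""
    expanded = set(first_pass_results)
    for image in first_pass_results:
        for candidate in find_similar(image):
            if candidate not in already_searched:
                expanded.add(candidate)
    return expanded

# Toy similarity relation: image -> similar images.
neighbors = {"A": ["B", "X"], "B": ["Y"]}
find_similar = lambda img: neighbors.get(img, [])
print(sorted(expand_results({"A", "B"}, {"A", "B", "X"}, find_similar)))
# ['A', 'B', 'Y']
```

In the example, only "Y" is newly added: "B" and "X" were already search targets, so revisiting them would duplicate work.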

(FIG. 11: Step S1109)
The image display unit 119 outputs the image obtained as a result of steps S1107 and S1108 to the storage unit 110 and the display unit 104 as an image indicating the contents of the search character string.

(FIG. 11: Modification of Step S1102)
As another method of extending the search results, it is also effective to secure a large number of images to be searched from the outset. For example, when a plurality of words are input as the search character string in step S1101, searching in step S1102 for documents that include at least one of the words (OR search) yields more documents and images than searching for documents that include all of the words (AND search). In this way, more search results can be obtained by extending the search target. However, in step S1105, in order to accurately determine whether an image represents the contents of the search character string, it is desirable to target documents that include all of the words in the search character string.
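The difference between the two retrieval modes can be sketched with a simple substring-based document search. This helper is hypothetical (a real system would use an inverted index rather than a linear scan), and the sample documents are invented for illustration:

```python
def search_documents(documents, words, mode="and"):
    """Document search sketch (step S1102 and its modification).
    documents: {doc_id: text}. AND search requires every word to
    appear; OR search accepts any word, yielding a superset."""
    hits = []
    for doc_id, text in documents.items():
        contains = [w in text for w in words]
        if (mode == "and" and all(contains)) or (mode == "or" and any(contains)):
            hits.append(doc_id)
    return hits

docs = {
    "d1": "image search of landscape photographs",
    "d2": "landscape painting techniques",
    "d3": "fast image search indexes",
}
words = ["image search", "landscape"]
print(search_documents(docs, words, "and"))  # ['d1']
print(search_documents(docs, words, "or"))   # ['d1', 'd2', 'd3']
```

The OR result is always a superset of the AND result, which is exactly why the OR variant secures more candidate images at step S1102 while the AND condition remains preferable for the accuracy-critical determination in step S1105.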

<Embodiment 4: Summary>
As described above, the image search apparatus 100 according to the fourth embodiment presents, as search results, images similar to the images indicating the content of the search character string. Thereby, even when words corresponding to the search character string do not appear in the text of a document, the possibility of retrieving an image that represents the content of the search character string increases. The same effect can be obtained by expanding the set of images to be searched.

  Further, in step S1108 of the fourth embodiment, the additional search results are obtained by image similarity search alone, without using the search character string. Therefore, in applications where the goal is to retrieve the images themselves, search results better suited to the search topic can be expected.

<Embodiment 5>
In the fifth embodiment of the present invention, a configuration example in which the operation related to clustering described in the second embodiment and the operation of registering metadata in the image metadata DB 820 described in the third embodiment are combined will be described. Since the configuration of the image search apparatus 100 is the same as that of the third embodiment, the following description will focus on the operation related to the combination.

  FIG. 12 is a flowchart for explaining the operation of the image search apparatus 100 according to the fifth embodiment. Hereinafter, each step of FIG. 12 will be described.

(FIG. 12: Steps S1201 to S1209, S1211)
These steps are the same as steps S701 to S709 and S711 in FIG.

(FIG. 12: Step S1210)
The metadata management unit 821 registers, in the image metadata DB 820, the images belonging to the clusters determined in steps S1208 to S1209 to represent the contents of the search character string, together with the corresponding search character string.

<Embodiment 5: Summary>
As described above, the image search apparatus 100 according to the fifth embodiment clusters the images included in the documents retrieved by the document search unit 116, and stores, in the image metadata DB 820, the images belonging to clusters that represent the contents of the search character string, together with the corresponding search character string. Thus, only frequently appearing images and their metadata are stored, and this can be done at high speed.

<Embodiment 6>
In the sixth embodiment of the present invention, a configuration example in which the operation related to clustering described in the second embodiment and the operation of extending the search result (or search target) described in the fourth embodiment are combined will be described. Since the configuration of the image search apparatus 100 is the same as that of the third embodiment, the following description will focus on the operation related to the combination.

  FIG. 13 is a flowchart for explaining the operation of the image search apparatus 100 according to the sixth embodiment. Hereinafter, each step of FIG. 13 will be described.

(FIG. 13: Steps S1301 to S1308, S1311)
These steps are the same as steps S701 to S708 and S711 in FIG. In step S1302, the same processing as that of the modified example of step S1102 described in the fourth embodiment may be performed.

(FIG. 13: Steps S1309 to S1310)
The image search unit 117 further searches for images similar to each image belonging to the clusters determined as the search result in step S1307 (S1309). The image display unit 119 outputs the images obtained as a result of steps S1308 and S1309 to the storage unit 110 and the display unit 104 as images indicating the contents of the search character string.
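The filter-then-expand flow of this embodiment can be sketched as below. This is a toy model: the scalar feature values, the thresholds, the cluster representation, and the `find_similar` helper are all illustrative and not prescribed by the patent (the size and variance thresholds echo claim 7).

```python
import statistics

def expand_clusters(clusters, find_similar, min_size=2, max_variance=1.0):
    """Keep only clusters whose member count and feature variance pass
    the thresholds (cf. claim 7), then expand each surviving cluster
    with a similarity search as in step S1309."""
    results = set()
    for members in clusters:              # members: list of (image_id, feature)
        feats = [f for _, f in members]
        if len(members) < min_size or statistics.pvariance(feats) > max_variance:
            continue                      # infrequent or incoherent cluster
        for image_id, _ in members:
            results.add(image_id)
            results.update(find_similar(image_id))
    return results

clusters = [
    [("P1", 0.1), ("P2", 0.2)],           # tight and large enough -> kept
    [("Q1", 0.0), ("Q2", 9.0)],           # variance too high -> dropped
    [("R1", 5.0)],                        # too few members -> dropped
]
similar = {"P1": ["P9"]}
print(sorted(expand_clusters(clusters, lambda i: similar.get(i, []))))
# ['P1', 'P2', 'P9']
```

Only the coherent, sufficiently large cluster survives, and its members are then augmented with their similarity-search neighbors, mirroring how the embodiment obtains results quickly from frequently appearing images while still reaching images that carry no matching text.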

<Embodiment 6: Summary>
As described above, the image search apparatus 100 according to the sixth embodiment clusters the images included in the documents retrieved by the document search unit 116, further searches for images similar to the clusters that represent the contents of the search character string, and presents them as search results. Thereby, search results can be obtained at high speed using only frequently appearing images. Further, as in the fourth embodiment, images that match the search topic can be retrieved.

  The present invention is not limited to the embodiments described above, and includes various modifications. The above embodiments have been described in detail for ease of understanding of the present invention, and the invention is not necessarily limited to configurations including all of the elements described. A part of the configuration of one embodiment can be replaced with the configuration of another embodiment, and the configuration of another embodiment can be added to the configuration of a given embodiment. Further, for a part of the configuration of each embodiment, other configurations can be added, deleted, or substituted.

  Each of the above-described configurations, functions, processing units, processing means, and the like may be realized in hardware by designing some or all of them as, for example, integrated circuits. Each of the above-described configurations, functions, and the like may also be realized in software by a processor interpreting and executing a program that implements the respective function. Information such as programs, tables, and files for realizing each function can be stored in a storage device such as a memory, a hard disk, or an SSD (Solid State Drive), or on a recording medium such as an IC card, an SD card, or a DVD.

  100: Image search device, 101: CPU, 102: Main memory, 103: Input unit, 104: Display unit, 105: Communication unit, 110: Storage unit, 111: OS, 112: Document DB, 113: Image DB, 114 : Document analysis unit, 115: Image estimation unit, 116: Document search unit, 118: Morphological analysis unit, 119: Screen display unit, 820: Image metadata DB, 821: Metadata management unit.

Claims (13)

  1. A document search unit for searching for document data including a specified search character string from a document database storing document data;
    A document analysis unit for acquiring an image included in the document data searched by the document search unit;
    an image search unit that searches all document data in the document database for document data including an image similar to the image acquired by the document analysis unit;
    an image estimation unit that determines, based on a total information amount of the search character string contained in all document data in the document database and a total information amount of the search character string contained in each document data obtained as a result of the search by the image search unit, whether an image included in the document data obtained as a result of the search by the image search unit or the document search unit represents the content of the search character string; and
    An output unit for outputting a determination result by the image estimation unit;
    An image search apparatus comprising:
  2. The image search device includes:
    An image metadata DB for storing images and their metadata;
    A metadata management unit for storing an image and its metadata in the image metadata DB;
    With
    The metadata management unit
    stores, in the image metadata DB as an image and its metadata, an image determined by the image estimation unit to represent the content of the search character string, together with the search character string. The image search apparatus according to claim 1.
  3. The metadata management unit
    stores, in the image metadata DB as an image and its metadata, the image determined by the image estimation unit to represent the contents of the search character string, together with the search character string, each time the image search unit performs the search and the image estimation unit performs the determination. The image search apparatus according to claim 2.
  4. The image metadata DB is
    The frequency with which the image and its metadata are associated is stored in association with the image and the metadata,
    The metadata management unit
    The image search apparatus according to claim 3, wherein the frequency is added each time an image and its metadata are stored in the image metadata DB.
  5. The document search unit receives a plurality of words as the search character string,
    The metadata management unit stores the combination of the plurality of words as the metadata in the image metadata DB.
    The image search apparatus according to claim 2, wherein:
  6. The image search unit
    Clustering the images acquired by the document analysis unit into a plurality of groups according to the image similarity,
    The image search apparatus according to claim 1, wherein all document data in the document database is set as the search target, and document data including an image similar to a representative value of each group obtained by the clustering is searched for.
  7. The image search unit
    The image search apparatus according to claim 6, wherein, among the groups obtained by the clustering, document data including an image similar to the representative value is searched for only for those groups in which the number of images included in the group is equal to or greater than a predetermined threshold and the variance of the image feature values in the group is equal to or smaller than a predetermined threshold.
  8. The image search unit
    Searching the document database for document data including a second image similar to the image determined by the image estimation unit to represent the content of the search character string;
    The output unit is
    The image search apparatus according to claim 1, wherein a result of searching document data including the second image is output together with a determination result by the image estimation unit.
  9. The document search unit
    When a plurality of words are received as the search character string, document data including at least one of the plurality of words is searched from the document database,
    The image estimation unit
    The image search apparatus according to claim 1, wherein the determination is performed based on an information amount of the search character string including all of the plurality of words.
  10. The output unit is
    When the image estimation unit determines that the image included in the document data obtained as a result of the search by the image search unit represents the content of the search character string,
    The image search apparatus according to claim 1, wherein, of the image included in the document data searched by the document search unit and the image obtained as a result of the search by the image search unit, only the former is output.
  11. The image estimation unit
    The total amount of information of the search character string included in each document data obtained as a result of the search by the image search unit is the total amount of information of the search character string included in all document data in the document database. Based on whether or not it exceeds a predetermined threshold value, the determination is performed,
    The image search device includes:
    The image search apparatus according to claim 1, further comprising a threshold setting unit that receives a designation input that designates the predetermined threshold.
  12. The output unit is
    When the image estimation unit determines that the image included in the document data obtained as a result of the search by the image search unit represents the content of the search character string,
    The image search apparatus according to claim 1, wherein the document data searched by the document search unit and the document data obtained as a result of the search by the image search unit are output.
  13. A document search step of searching for document data including a specified search character string from a document database storing document data;
    A document analysis step of acquiring an image included in the document data searched in the document search step;
    an image search step of searching all document data in the document database for document data including an image similar to the image acquired in the document analysis step;
    an image estimation step of determining, based on a total information amount of the search character string contained in all document data in the document database and a total information amount of the search character string contained in each document data obtained as a result of the search in the image search step, whether an image included in the document data obtained as a result of the search in the image search step represents the content of the search character string; and
    An output step of outputting a determination result in the image estimation step;
    An image search method characterized by comprising:
JP2012118320A 2012-05-24 2012-05-24 Image search apparatus and image search method Active JP5868262B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2012118320A JP5868262B2 (en) 2012-05-24 2012-05-24 Image search apparatus and image search method


Publications (2)

Publication Number Publication Date
JP2013246544A true JP2013246544A (en) 2013-12-09
JP5868262B2 JP5868262B2 (en) 2016-02-24

Family

ID=49846279

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2012118320A Active JP5868262B2 (en) 2012-05-24 2012-05-24 Image search apparatus and image search method


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6904560B1 (en) * 2000-03-23 2005-06-07 Adobe Systems Incorporated Identifying key images in a document in correspondence to document text
JP2010205060A (en) * 2009-03-04 2010-09-16 Nomura Research Institute Ltd Method for retrieving image in document, and system for retrieving image in document
JP2011113130A (en) * 2009-11-24 2011-06-09 Kddi Corp Device, method and program for retrieving image


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JPN6014005869; 出原 博, 'A Method for Extracting Image Descriptions Considering the HTML Syntactic Structure around Images in WWW Image Search', IEICE Technical Report, Vol. 105, No. 340 (DC2005-30), 2005-10-11, pp. 19-24, The Institute of Electronics, Information and Communication Engineers *
JPN6015040259; 渡邉 裕樹, 'Construction of an Image Annotation System Using a Large-Scale Web Image Database', IPSJ SIG Technical Report, FY2011 [DVD-ROM], 2012-04-15, pp. 1-8, Information Processing Society of Japan *



Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20150113

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20150924

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20151006

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20151117

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20151208

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20160105