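WO2011105608A1 - Information processing apparatus, information processing method, and recording medium recording an information processing program - Google Patents
Information processing apparatus, information processing method, and recording medium recording an information processing program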
- Publication number
- WO2011105608A1 (PCT/JP2011/054528)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- character string
- image
- character
- search
- visual feature
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/768—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using context analysis, e.g. recognition aided by known co-occurring patterns
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/5846—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using extracted text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/22—Character recognition characterised by the type of writing
- G06V30/224—Character recognition characterised by the type of writing of printed characters having additional code marks or containing code marks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/28—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
- G06V30/287—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters
Definitions
- The present invention relates to an information processing apparatus, an information processing method, and a recording medium on which an information processing program is recorded.
- A person who uses an image on a Web page or the like may deliberately choose the layout within the image, the font of the characters, or the contrast between the character color and the background color.
- Such intentions include, for example, making a product stand out, or using illegal, exaggerated expressions in advertisements. In such cases, simply searching for a character string contained in the image does not yield a search that reflects the intention of the person who placed the image on the Web page.
- The present invention has been made in view of the above, and its object is to provide an information processing apparatus, an information processing method, and a recording medium recording an information processing program that facilitate detection of cases where a search keyword is used characteristically in an image.
- An information processing apparatus according to the present invention includes: an image database that stores images to be searched; character string region extraction means for extracting a character string region containing a character string from an image stored in the image database; character string recognition means for recognizing the character string contained in the character string region extracted by the character string region extraction means; and visual feature amount calculation means for calculating, from the image of the extracted character string region, and storing a visual feature amount of the character string based on at least one of the size, color, shape, and decoration of the characters constituting the character string and the contrast between the character color and the background color.
- In the information processing apparatus according to the present invention, the visual feature amount of a character string is calculated and stored based on at least one of the size, color, shape, and decoration of the characters in the image and the contrast between the character color and the background color. If a search is performed using this information, search results can be output according to the visual feature amount. Therefore, for example, an image in which a search keyword is used characteristically can be ranked higher in the search results. That is, the information processing apparatus according to the present invention facilitates detection of cases where a search keyword is used characteristically in an image.
- The visual feature amount calculation means may calculate and store a visual feature amount for each character constituting the character string. With this configuration, the visual feature amounts of the individual characters can be added together during the search to obtain the visual feature amount of the character string.
- It is desirable that the information processing apparatus further includes search keyword input means for inputting a search keyword, and search means that determines whether the keyword input by the keyword input means matches at least a part of the character string recognized by the character string recognition means.
- It is desirable that the visual feature amount calculation means calculates the visual feature amount based on the difference between the lightness of the pixels assumed to constitute the character string in the character string region and the lightness of the pixels assumed to constitute the background of the character string region. With this configuration, a visual feature amount based on the colors of the image can be extracted appropriately, and the present invention can be implemented appropriately.
- It is desirable that the visual feature amount calculation means uses, as the lightness of the pixels assumed to constitute the character string, the lightness of the most frequent color among those pixels, and uses, as the lightness of the pixels assumed to constitute the background, the lightness of the most frequent color among those pixels. With this configuration, the visual feature amount based on the colors of the image can be extracted reliably, and the present invention can be implemented reliably.
- It is desirable that the search means calculates score values for a plurality of keywords input by the keyword input means. With this configuration, a search can be performed for a plurality of search keywords, which is more convenient for the user.
- It is desirable that the search means calculates the score value based on the ratio of images containing the keyword to the images stored in the image database. With this configuration, search results can be output according to the appearance rate of the keyword in images, which is more convenient for the user.
- It is desirable that the image database stores the images to be searched without storing duplicates of the same image, and stores the hash value obtained from each image in association with information indicating the locations of the Web pages where the image is used, and that the output means outputs information indicating the images obtained by the search means, without duplicates, together with the information indicating the locations of the Web pages where each image is used, which is stored in the image database in association with the hash value of the image.
- Because the hash values of identical images fall within a certain range, an image used on a plurality of Web pages can be handled as one image. Therefore, with this configuration, the search results can be used effectively even if the same image is used at a plurality of Web page locations. That is, the same image is prevented from being listed repeatedly in the search results, and the user can efficiently find the desired image. For example, the same image containing the keyword searched for by the user is prevented from appearing repeatedly in the search results.
- The present invention can be described as an information processing apparatus as above, and can also be described as an image search method and as a computer-readable recording medium recording an image search program, as follows. These are substantially the same invention in different categories and have the same operations and effects.
- The image search method according to the present invention is performed by an information processing apparatus including an image database that stores images to be searched, and includes: a character string region extraction step of extracting a character string region containing a character string from an image stored in the image database; a character string recognition step of recognizing the character string contained in the character string region extracted in the character string region extraction step; and a visual feature amount calculation step of calculating, from the image of the extracted character string region, and storing a visual feature amount of the character string based on at least one of the size, color, shape, and decoration of the characters constituting the character string and the contrast between the character color and the background color.
- The recording medium according to the present invention is a computer-readable recording medium recording an information processing program that causes one or more computers to function as: an image database that stores images to be searched; character string region extraction means for extracting a character string region containing a character string from an image stored in the image database; character string recognition means for recognizing the character string contained in the character string region extracted by the character string region extraction means; and visual feature amount calculation means for calculating, from the image of the extracted character string region, and storing a visual feature amount of the character string based on at least one of the size, color, shape, and decoration of the characters constituting the character string and the contrast between the character color and the background color.
- According to the present invention, search results can be output according to the visual feature amount of the character string. Therefore, for example, an image in which a search keyword is used characteristically can be ranked higher in the search results. That is, the present invention facilitates detection of cases where a search keyword is used characteristically in an image.
- [Drawing descriptions, partially recovered: a table comparing the visual results with the search results (N = 30); sample images used in the experiment.]
- FIG. 1 shows an image search apparatus 10, which is the information processing apparatus according to this embodiment.
- The image search apparatus 10 receives a search request for images to be searched and outputs search results corresponding to the request.
- In this embodiment, the search target images are product description images for products sold in a cyber mall.
- The purpose of the image search by the image search apparatus 10 is to check whether any product description image is inappropriate.
- An inappropriate product description image is, for example, one that gives consumers excessive expectations about the effect of a product such as a health product or cosmetics.
- The image search apparatus 10 is used by a business operator who manages the cyber mall. The image search apparatus 10 can therefore acquire the images to be searched by connecting to a server constituting the cyber mall, although this connection is not shown in the figure.
- The image search apparatus 10 is connected to the administrator terminal 30, and the two can exchange information.
- The image search apparatus 10 receives a search request for images to be searched from the administrator terminal 30 and outputs information indicating the corresponding search results to the administrator terminal 30.
- The image search apparatus 10 is realized by a computer such as a server device including hardware such as a CPU (Central Processing Unit), a memory, and a communication module.
- The functions of the image search apparatus 10 described later are realized by operating these components under a program or the like.
- The image search apparatus 10 may be configured as a computer system including a plurality of computers.
- The administrator terminal 30 is a terminal with a communication function, used by a user at the business operator who manages the cyber mall described above, and can exchange information with the image search apparatus 10.
- The administrator terminal 30 is, for example, a communication device such as a PC (Personal Computer).
- The image search apparatus 10 includes an image database 11, an image registration unit 12, a character string region extraction unit 13, a character candidate recognition unit 14, a character candidate storage unit 15, a visual feature amount calculation unit 16, a search keyword input unit 17, a search unit 18, and an output unit 19.
- The image database 11 is a database that stores the images to be searched.
- The images stored in the image database 11 are, as described above, description images of products sold in the cyber mall that are posted on the Web sites constituting the cyber mall. Each image is given information such as an ID so that the image can be identified.
- The image database 11 does not store the same image redundantly. That is, the image database 11 stores images so that it never contains more than one copy of the same image.
- The image database 11 is realized by hardware such as a memory or a hard disk included in the image search apparatus 10.
- The image database 11 may manage the stored image data through database software, or may simply store the image data in a memory, a hard disk, or the like.
- The image database 11 stores, in association with each stored image, a hash value obtained by applying a hash function to the image.
- The hash function is a specific hash function set in advance. For identical images, the hash values obtained from the images fall within a certain range. Thus, when the same image is used on a plurality of Web sites in the cyber mall, it can be managed with a single hash value.
- Images that a user would regard as the same, such as images whose character colors are similar (red and orange, for example) or whose character sizes are similar, may also be treated as the same image.
- The certain range of hash values can be set as appropriate according to which images are to be regarded as the same image.
- The image database 11 stores each hash value in association with information indicating the Web site, that is, the location of the Web page, where the image is used.
- The information indicating the Web site is, for example, a URL (Uniform Resource Locator).
- The image database 11 stores a numerical value in association with each piece of information indicating a Web site. This numerical value is, for example, the selling price of the product related to the image on that Web site. Each piece of information indicating a Web site may also be associated with other information, such as the description of the product on that Web site.
- With the above configuration, the image database 11 can store an image in association with information on the Web sites where the image is used and the selling price of the related product on each Web site.
- The image registration unit 12 is image registration means that receives an image to be newly registered in the image database 11 together with information indicating the Web site where the image is used, and stores them in the image database 11.
- The image registration unit 12 holds the specific hash function described above in advance.
- The image registration unit 12 calculates a hash value by applying the hash function to the input image.
- The image registration unit 12 reads the hash values stored in the image database 11 and determines whether the calculated hash value is within a certain range of a hash value already stored in the image database 11. If it is, the image registration unit 12 stores the information indicating the Web site where the input image is used in the image database 11 in association with the already stored hash value.
- If the image registration unit 12 determines that the calculated hash value is not within a certain range of any already stored hash value, it stores the input image, the information indicating the Web site, and the calculated hash value in the image database 11 in association with one another. At that time, as described above, the selling price of the product related to the image on that Web site can also be registered.
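- The registration logic above can be sketched as follows. This is a minimal illustration, not the implementation in the specification: the hash function is not disclosed, so the sketch assumes the hash value is an integer computed elsewhere and that "within a certain range" means an absolute difference no greater than a `tolerance` parameter. The names `ImageRecord`, `ImageDatabase`, and `register` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    hash_value: int                              # hash of the stored image
    image: bytes                                 # the single stored copy
    sites: dict = field(default_factory=dict)    # URL -> selling price (or None)


class ImageDatabase:
    """Stores one record per distinct image (assumed integer hash values)."""

    def __init__(self, tolerance):
        self.tolerance = tolerance   # assumed meaning of "within a certain range"
        self.records = []

    def register(self, image, hash_value, url, price=None):
        # Same image already stored: only add the Web site information.
        for rec in self.records:
            if abs(rec.hash_value - hash_value) <= self.tolerance:
                rec.sites[url] = price
                return rec
        # New image: store image, hash value, and Web site together.
        rec = ImageRecord(hash_value, image, {url: price})
        self.records.append(rec)
        return rec
```

- With this structure, an image used on several Web sites occupies one record whose `sites` mapping lists every URL, which is what lets a search result show each image once.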
- The image and the information indicating the Web site where the image is used are input to the image registration unit 12 from the administrator terminal 30, for example, through an operation by a user at the business operator who manages the cyber mall.
- Alternatively, the input may be performed automatically when an image is newly used on a Web site of the cyber mall.
- The character string region extraction unit 13 is character string region extraction means that extracts a character string region containing a character string from an image stored in the image database 11.
- The character string region is extracted, for example, as follows. First, to extract the characters in the image, the target image is converted into a grayscale image, and then a threshold value is determined by discriminant analysis and the image is converted into a binary image. For this, the method described in, for example, Otsu: Automatic threshold selection method based on discriminant and least-squares criteria, IEICE Transactions D, Vol.63, No.4, pp.349-356 (1980) can be used.
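- The binarization step can be illustrated with a short sketch of discriminant-analysis (Otsu) thresholding. It assumes the image has already been converted to a flat list of 8-bit grayscale values; the function names `otsu_threshold` and `binarize` are chosen here for illustration and do not come from the specification.

```python
def otsu_threshold(gray_pixels):
    """Return the threshold t (0-255) that maximizes between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * 256
    for v in gray_pixels:
        hist[v] += 1
    total = len(gray_pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray values in the lower class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2   # proportional to between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(gray_pixels, threshold):
    """Map each pixel to 1 if it is brighter than the threshold, else 0."""
    return [1 if v > threshold else 0 for v in gray_pixels]
```

- Whether characters end up as 1 or 0 depends on whether they are brighter or darker than the background, so a real pipeline would likely need to handle both polarities.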
- Next, a labeling process is performed on the binary image, the resulting connected components are linked into regions using their pitch, aspect ratio, and angle, and character string images arranged in the horizontal and vertical directions are extracted.
- For this, the method described in, for example, Hamada, Nagai, Okamoto, Miyao, Yamamoto: Character extraction from scene images, IEICE Transactions D, Vol.J88-D2, No.9, pp.1817-1824 (2005) can be used.
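- The labeling step alone can be sketched as below. The sketch assumes 8-connectivity and a binary image given as a list of rows of 0/1 values, and it returns one bounding box per connected component in the (x, y, width, height) form used elsewhere in this description. Linking components into character strings by pitch, aspect ratio, and angle is not shown; `label_components` is a hypothetical name.

```python
from collections import deque


def label_components(binary):
    """Return bounding boxes (x, y, width, height) of 8-connected components."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search over one component, tracking its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1 - x0 + 1, y1 - y0 + 1))
    return boxes
```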
- The character string region extraction unit 13 outputs the character string region (image data) extracted as described above to the character candidate recognition unit 14 and the visual feature amount calculation unit 16. The output is arranged so that the image from which each character string region was extracted (the extraction source image) can be determined.
- The character string region extraction unit 13 may extract a plurality of character string regions from one image. In this case, each character string region extracted from the image is made distinguishable, for example, by assigning it an ID. The extracted character string regions may overlap within the image: one location in the image may belong to both a vertical character string region and a horizontal character string region. This prevents character strings from being missed even when the reading direction cannot be determined clearly.
- The character string region extraction unit 13 extracts character strings, for example, when an image is newly stored in the image database 11. Alternatively, extraction may be triggered by a user operation.
- The character candidate recognition unit 14 is character candidate recognition means that performs character recognition on the image and identifies a plurality of character candidates for each character constituting the character string contained in the character string region extracted and input by the character string region extraction unit 13. The character candidate recognition unit 14 also evaluates the accuracy of recognition for each character candidate identified and ranks the candidates. Character recognition is performed as follows.
- The input image of the character string region is divided into images of the individual characters constituting the character string, and character recognition processing is performed on each character image.
- Character recognition is performed by extracting a feature amount used for character recognition from the image and comparing it with feature amounts, extracted in advance, of the characters that can be character candidates.
- As the feature amount used for character recognition, for example, a directional line element feature based on the outline of the character can be used. For this, the method described in, for example, Son, Tahara, Aki, Kimura: High-precision character recognition using directional line element features, IEICE Transactions, vol.J74-D-II, No.3, pp.330-339 (1991) can be used.
- As the measure of recognition accuracy, for example, the Euclidean distance between feature amounts can be used.
- That is, a character candidate whose feature amount is closer, in Euclidean distance, to the feature amount extracted from the image is regarded as a more accurate candidate.
- The character candidates are ranked for each target character image.
- The ranked character candidates are held as multiplexed character candidates up to the Nth place, where N is a preset natural number of 2 or more.
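- The ranking by Euclidean distance can be sketched as follows. The feature vectors here are arbitrary tuples of floats standing in for the directional line element features, and `rank_candidates` and the `reference_features` mapping are hypothetical names introduced for this illustration.

```python
import math


def rank_candidates(feature, reference_features, n):
    """Return the n characters whose reference feature is nearest to `feature`.

    reference_features maps each character to its pre-extracted feature vector.
    The result is ordered from most to least accurate (shortest distance first).
    """
    ranked = sorted(
        reference_features.items(),
        key=lambda item: math.dist(feature, item[1]),
    )
    return [char for char, _ in ranked[:n]]
```

- Keeping the top N candidates rather than only the best one is what allows a later search to tolerate misrecognition of individual characters.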
- The character candidate storage unit 15 is character candidate storage means that stores the plurality of character candidates identified by the character candidate recognition unit 14, in character-string order, in association with the image from which the candidates were identified.
- The character candidate storage unit 15 stores, as an index for the image, information indicating the character candidates for each character in descending order of the accuracy evaluated by the character candidate recognition unit 14. This is done, for example, by storing the following data (file) indicating the character candidates in the memory or hard disk of the image search apparatus 10.
- An example of the information stored in the character candidate storage unit 15 is shown in FIG. 2.
- The example shown in FIG. 2 contains the character candidates recognized from one character string region.
- The information stored in the character candidate storage unit 15 associates the character order ("No. j" in FIG. 2), the character coordinates, and the recognition results with one another.
- The character order indicates the position, within the character string, of the character to which the character candidates correspond.
- The character coordinates indicate where in the source image the character is located.
- The character coordinate information gives the (x coordinate, y coordinate, width, height) of the character image, with the upper left corner of the image as the origin (0, 0).
- The x coordinate and the y coordinate indicate a preset reference position in the character image (for example, the position of its upper left pixel).
- The character coordinate information is acquired by, for example, the character candidate recognition unit 14.
- The recognition result is a list of the character candidates for each character, arranged in descending order of accuracy.
- The nth-ranked character candidate at the jth character position in recognition result C is written C[n][j].
- For example, C[1][1], C[1][2], and C[10][1] in the recognition result C in the table of FIG. 2 are "So", "Preliminary", and "High", respectively.
- The character candidate storage unit 15 stores the information indicating the recognition results shown in FIG. 2 in association with information that identifies the image, such as the hash value of the image from which the candidates were identified, so that it can be determined from which image the candidates were extracted. When a plurality of character string regions are extracted from one image, the information is also stored in association with the ID of the character string region or the like, so that it can be determined from which character string region the candidates were extracted.
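- One possible in-memory layout for the stored information is sketched below. It is an assumption for illustration only: the specification describes a file per character string region but not its format. `CharEntry`, `candidate`, and the `index` dictionary keyed by (image hash value, region ID) are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class CharEntry:
    order: int          # "No. j": 1-based position of the character in the string
    coords: tuple       # (x, y, width, height), origin at the image's upper left
    candidates: list    # character candidates, most accurate first


def candidate(entries, n, j):
    """Return C[n][j]: the n-th ranked candidate at the j-th character position.

    Both n and j are 1-based, following the notation in the description.
    """
    return entries[j - 1].candidates[n - 1]


# The index associates each region's entries with its source image and region.
index = {}


def store(hash_value, region_id, entries):
    index[(hash_value, region_id)] = entries
```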
- Alternatively, the character candidate storage unit 15 may store, in association with the image from which the candidates were identified, character strings obtained by choosing one of the character candidates identified by the character candidate recognition unit 14 for each character and combining them in character-string order. That is, a character string obtained by selecting one candidate for each character as shown in FIG. 2 and combining the selections in order may be stored.
- Examples of such character strings are shown in the corresponding figure.
- The stored character strings need not be consecutive in the order of the character candidates acquired from the image; strings in which some characters are missing may also be stored.
- For example, even if "safety" and "height" among the character candidates acquired from the image are not consecutive in order, such a combination may be stored.
- Likewise, even when character candidates are stored for each character, the search described later does not always need to require that matches be consecutive in the order of the character candidates.
- Even when character strings are stored in this way, if the information corresponding to each character candidate is stored in association with the character string, the strings can be handled in the same way as when candidates are stored for each character.
- Character strings may be stored for all combinations of character candidates, or only for combinations that match a character string assumed in advance as a search keyword.
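- Combining candidates into strings can be sketched with a Cartesian product. This is an illustration under assumptions: "matching" a preset keyword is taken here to mean that the combined string contains the keyword, and strings with missing characters are not generated. `candidate_strings` is a hypothetical name.

```python
from itertools import product


def candidate_strings(candidates_per_char, keywords=None):
    """Combine one candidate per character position, in string order.

    candidates_per_char: list (one item per character position) of candidate lists.
    keywords: if given, keep only strings containing at least one keyword.
    """
    combos = ("".join(chars) for chars in product(*candidates_per_char))
    if keywords is None:
        return list(combos)
    return [s for s in combos if any(k in s for k in keywords)]
```

- With N candidates per character and a string of m characters, the full set has N to the power m strings, which is presumably why storing only combinations that match preset keywords is offered as an option.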
- The visual feature amount calculation unit 16 is visual feature amount calculation means that calculates, from the image of the character string region extracted by the character string region extraction unit 13, and stores a visual feature amount (saliency) of the character string based on at least one of the size and color of the characters constituting the character string. The visual feature amount calculation unit 16 calculates the visual feature amount based on the difference between the lightness of the pixels assumed to constitute the character string in the character string region and the lightness of the pixels assumed to constitute the background of the character string region.
- The visual feature amount calculation unit 16 uses, as the lightness of the pixels assumed to constitute the character string, the lightness of the most frequent color among those pixels, and uses, as the lightness of the pixels assumed to constitute the background, the lightness of the most frequent color among those pixels. More specifically, the visual feature amount calculation unit 16 calculates the visual feature amount of the character string by the following processing.
- The visual feature amount calculation unit 16 stores the calculated visual feature amount in association with the character string, for example, in the memory or hard disk of the image search apparatus 10.
- The visual feature amount calculation unit 16 may calculate and store a visual feature amount for each character constituting the character string. With this configuration, the visual feature amounts of the individual characters can be added together during the search to obtain the visual feature amount of the character string.
- The visual feature amount calculation unit 16 performs character recognition in the same manner as the character candidate recognition unit 14. However, it does not necessarily need to identify a plurality of character candidates.
- The visual feature amount calculation unit 16 determines the character size from the vertical and horizontal dimensions of the character image region obtained at the time of character extraction.
- The character size is obtained, for example, in points (pt).
- The visual feature amount calculation unit 16 acquires the character color and the background color by applying a representative color selection method to the character region and the background region contained in the character image region.
- A representative color selection method is described in, for example, Hase, Yoneda, Sakai, Maruyama: Examination of color segmentation for the purpose of extracting character regions in color document images, IEICE Transactions D-II, vol.J83-D-II, No.5, pp.1294-1304 (2000).
- To select a representative color, the pixel values of the character region and of the background region are first converted from the RGB color space to the L*a*b* color space.
- Next, the L*a*b* space is divided into small cubic regions with side length w, and the number of pixels falling into each small region is counted. Here, w is a preset value.
- A small region that contains more pixels than each of the 26 small regions neighboring it is taken as the representative color.
- If there are several such regions, one of them is chosen as the representative color.
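- The selection procedure can be sketched as below, under several assumptions: the pixels are already given as (L*, a*, b*) tuples (the color space conversion is not shown), a small region counts as a peak when no neighbor holds more pixels, the most populated peak is chosen when there are several, and the center of the chosen cube is returned as the color. `representative_color` is a hypothetical name.

```python
from collections import Counter
from itertools import product
from math import floor


def representative_color(lab_pixels, w=4):
    """Pick a representative color from pixels given as (L*, a*, b*) tuples."""
    if not lab_pixels:
        raise ValueError("no pixels given")
    # Count the pixels that fall into each cube of side w.
    counts = Counter(tuple(floor(c / w) for c in p) for p in lab_pixels)
    # The 26 neighbors of a cube are the surrounding 3 x 3 x 3 block minus itself.
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    peaks = []
    for cell, n in counts.items():
        neighbors = (
            counts.get((cell[0] + dx, cell[1] + dy, cell[2] + dz), 0)
            for dx, dy, dz in offsets
        )
        if all(n >= m for m in neighbors):
            peaks.append((n, cell))
    _, best = max(peaks)   # assumption: prefer the most populated peak
    return tuple((i + 0.5) * w for i in best)
```

- Applying the function once to the character pixels and once to the background pixels gives the two colors whose lightness is compared afterwards.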
- FIG. 3 shows an example in which representative colors were actually selected and the character color and the background color were acquired.
- In FIG. 3, the region indicated by a broken line is a character string region.
- In this example, the value of w used when selecting the representative colors is 4.
- The visual feature amount calculation unit 16 obtains the lightness L from the RGB values of the representative color pixels by the following equation (1).
- $L = 0.298912R + 0.586611G + 0.114478B$ (1)
- The visual feature amount calculation unit 16 then obtains the absolute value of the difference between the lightness L of the character color and the lightness L of the background color.
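As a small illustration of equation (1) and the lightness difference, the following sketch computes both from RGB triples. The function names are hypothetical, and the inputs are assumed to be 8-bit RGB values of the representative character and background colors.

```python
def lightness(r, g, b):
    """Lightness L of an RGB color, using the coefficients of equation (1)."""
    return 0.298912 * r + 0.586611 * g + 0.114478 * b


def lightness_difference(char_rgb, background_rgb):
    """Absolute lightness difference between character and background colors."""
    return abs(lightness(*char_rgb) - lightness(*background_rgb))
```

For example, `lightness_difference((0, 0, 0), (255, 255, 255))` gives the largest possible difference for 8-bit colors, and the result does not depend on which of the two colors is passed first.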
- The visual feature amount calculation unit 16 calculates the visual feature amount of the character string from the obtained character size and lightness difference according to the table of FIG. 4. In the table shown in FIG. 4, the visual feature amounts are labeled qualitatively as low, medium, high, and so on, but they may be converted into quantitative values.
- The visual feature amount calculation unit 16 outputs information indicating the calculated visual feature amount of the character string to the search unit 18.
- The visual feature amount calculation unit 16 calculates the visual feature amount, for example, when an image is newly stored in the image database 11, which is the same timing at which the character string region extraction unit 13 extracts a character string.
- In that case, the visual feature amount is stored, for example, as information added to the index for the image.
- Alternatively, the calculation may be performed during the search process by the search unit 18, according to an instruction from the search unit 18.
- The visual feature amount calculation unit 16 may also calculate the visual feature amount based on at least one of other character characteristics, such as shape (font) and decoration, and the contrast between the character color and the background color.
- The search keyword input unit 17 is search keyword input means for inputting a search keyword.
- The search keyword input unit 17 may input a plurality of keywords. In that case, information indicating whether to perform an AND search or an OR search using the plurality of keywords may be input together.
- The search keyword is input, for example, as follows.
- The search keyword input unit 17 receives an access request from the administrator terminal 30 and transmits to the administrator terminal 30 the data of a Web page having a form for inputting a keyword.
- The administrator terminal 30 receives and displays the data of the Web page.
- At the administrator terminal 30, the user performs a keyword input operation, and a search request including the keyword is transmitted to the image search device 10.
- The search keyword input unit 17 receives the search request and inputs the keyword by acquiring it from the received search request.
- The search keyword input unit 17 outputs the input keyword to the search unit 18.
- The search unit 18 is search means for searching the images stored in the image database 11 using the keyword input from the search keyword input unit 17.
- The search is performed by determining whether each character constituting the keyword input from the search keyword input unit 17 matches, in the order of the keyword, any of the plurality of character candidates constituting a character string stored in the character candidate storage unit 15. For example, if the search keyword is “safety” and the character candidates constituting the character string are those shown in the table of FIG. 2, the candidates for the third to fifth characters include the three characters of the keyword (rendered literally as “safe”, “all”, and “sex”), so the character string shown in FIG. 2 is determined to have hit the keyword “safety”. The determination of whether a character string hits the keyword will be described later using a flowchart.
- When the character candidate storage unit 15 stores character strings that combine the character candidates, the search may instead be performed by comparing the keyword input from the search keyword input unit 17 with the character strings stored in the character candidate storage unit 15. If a character string stored in the character candidate storage unit 15 includes the keyword input from the search keyword input unit 17, that character string is judged to have hit the keyword. When the character candidate storage unit 15 stores character strings in this way, the search reduces to matching the search keyword against character strings as described above, so the processing can be made faster. Conversely, if the character candidates are stored as the information shown in FIG. 2 rather than as character strings, it is possible to search for unknown words and ambiguous keywords.
- The search unit 18 evaluates the reliability (degree of matching) of a match from the information indicating the accuracy described above. More specifically, the search unit 18 calculates the character recognition reliability (similarity) for the keyword t, as a value indicating this reliability, from the ranks of the character candidates that matched the characters of the keyword.
- The character recognition reliability is a value in the range of 0.0 to 1.0, and a larger value indicates higher reliability.
- For example, if all five characters of a keyword match first-ranked candidates, the character recognition reliability is calculated as 5 ÷ (1 + 1 + 1 + 1 + 1) and becomes 1.00.
- If four characters match first-ranked candidates and one character matches a third-ranked candidate, the character recognition reliability is calculated as 5 ÷ (1 + 1 + 1 + 1 + 3) and becomes 0.71.
- An image with a low character recognition reliability is likely to have been retrieved erroneously, and an image with a high character recognition reliability is highly likely to actually contain the search keyword. That is, the character recognition reliability can be used as an index of how reliably an image contains the search keyword. Therefore, by sorting the search results based on the character recognition reliability when listing the images that include a search keyword from among a large number of images, results with few search errors can be presented preferentially.
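The two worked values above are consistent with dividing the number of keyword characters by the sum of the ranks of the matched candidates. The following sketch assumes exactly that reading; the function name is hypothetical and the ranks are 1-based.

```python
def similarity(matched_ranks):
    """Character recognition reliability for one keyword match.

    matched_ranks holds, for each keyword character, the 1-based rank of
    the character candidate it matched.
    """
    return len(matched_ranks) / sum(matched_ranks)
```

Because every rank is at least 1, the value stays within the stated range and reaches 1.0 only when every character matches a first-ranked candidate.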
- The search unit 18 may determine the number of character candidates used to determine a match with the keyword according to the number of characters in the keyword. As will be described in detail later, when the search keyword has few characters (the search keyword is short), search errors tend to occur and precision tends to be low. Therefore, for example, when the number of characters in the keyword is determined to be equal to or less than a preset threshold value, the number of character candidates used to determine a match may be made smaller than usual. After determining the number of character candidates, the search unit 18 selects the character candidates used to determine a match with the keyword from the information indicating the accuracy of the character candidates. Specifically, the search unit 18 uses the character candidates ranked up to the determined number as the character candidates for determining a match.
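A minimal sketch of this length-dependent limit is shown below. The default of 30 candidates follows the figure used elsewhere in this description; the short-keyword threshold (2 characters) and the reduced limit (10 candidates) are hypothetical values chosen only for illustration, as are the function names.

```python
def candidate_limit(keyword, default_limit=30, short_limit=10, short_length=2):
    """Number of ranked candidates to consult for this keyword.

    short_limit and short_length are illustrative assumptions, not values
    taken from the description.
    """
    return short_limit if len(keyword) <= short_length else default_limit


def candidates_to_check(ranked_candidates, keyword):
    """Keep only the top-ranked candidates allowed for this keyword."""
    return ranked_candidates[:candidate_limit(keyword)]
```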
- The search unit 18 calculates, from the result of matching the keyword against the character candidates, a score value for each image that includes character candidates matching the keyword. This score value determines the order in which images are output as search results. In this embodiment, a higher score value indicates a higher possibility that the search keyword is used in the image in an inappropriate manner.
- From the visual feature amount saliency(t, m) and the character recognition reliability similarity(t, m) of the search keyword t obtained as described above, the search unit 18 obtains the character feature amount termscore(t, m) of the m-th character string included in the image by the following equation (3).
- $\mathrm{termscore}(t, m) = (1 - \lambda) \cdot \mathrm{similarity}(t, m) + \lambda \cdot \mathrm{saliency}(t, m)$ (3)
- Here, λ is a value indicating the weight between the visual feature amount and the character recognition reliability.
- λ is a preset value between 0 and 1.
- When λ = 0, the character feature amount is determined by the character recognition reliability alone.
- The index m identifies which of the character strings, corresponding to the plurality of character string regions extracted by the character string region extraction unit 13, is meant. m takes a value from 1 to the number of character string regions extracted by the character string region extraction unit 13.
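Equation (3) is a weighted average of the two quantities, which can be written as a short function. The parameter name `weight` stands for the preset weight in equation (3); the function name and the range check are additions made for this sketch.

```python
def termscore(similarity, saliency, weight):
    """Character feature amount of one matched string, per equation (3)."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    # weight = 0 keeps only the recognition reliability,
    # weight = 1 keeps only the visual feature amount.
    return (1.0 - weight) * similarity + weight * saliency
```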
- In other words, for a character string whose character candidates match the characters constituting the keyword in the order of the keyword, the search unit 18 calculates the score value of the image including that character string from the visual feature amount that the visual feature amount calculation unit 16 calculated from the corresponding character string region.
- The search unit 18 calculates score values for the plurality of keywords input by the search keyword input unit 17.
- The search unit 18 calculates the tf-idf of the keyword included in the image in order to take into account the appearance frequency of the keyword in the image.
- tf-idf is known as an algorithm for extracting characteristic words from text, and is an index mainly used in fields such as information retrieval and document summarization.
- tf is the appearance frequency of a word in a document.
- idf is the inverse document frequency, which decreases the importance of a word appearing in many documents and increases the importance of a word appearing only in specific documents.
- In this embodiment, the concept of tf-idf is extended to the characters in an image, and the image score is calculated by using it in combination with the visual feature amount of the character string and the character recognition reliability.
- Using the following equation (4), the search unit 18 obtains the sum of squares of the character feature amounts of the tf(t) occurrences of the search keyword t included in the image, and uses it as the score of the image for that search keyword.
- For the association between a character string and an image, the information on the association between the character candidates and the image stored by the character candidate storage unit 15 is referred to.
- Here, m is a subscript identifying a character string that includes the keyword t in the target image, and is an integer in the range of 1 to tf(t).
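Equation (4) itself is not reproduced in this text. The sketch below assumes it is the sum of the squared character feature amounts over the tf(t) strings in the image that contain the keyword; the function name is hypothetical.

```python
def keyword_image_score(termscores):
    """Score of one image for one keyword.

    termscores lists termscore(t, m) for m = 1 .. tf(t), that is, one value
    per character string in the image that contains the keyword.
    Assumption: equation (4) is the sum of squares of these values.
    """
    return sum(s * s for s in termscores)
```

Under this reading, an image gains score both when the keyword appears more often and when individual occurrences have a high character feature amount.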
- The image score when performing a search with a plurality of search keywords can be calculated using the value of idf(t).
- The idf(t) of the search keyword t is obtained by the following equation (5) from the total number of images to be searched (A) and the number of images including t (S).
- idf(t) takes a larger value as the number of images including the search keyword t decreases, indicating a rare word.
- $\mathrm{idf}(t) = \log\left(\frac{A}{S + 1}\right) + 1$ (5)
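Equation (5) can be checked with a few lines of code. The base of the logarithm is not stated in the text; the natural logarithm is assumed here, and the function and parameter names are hypothetical.

```python
import math


def idf(total_images, images_with_keyword):
    """Inverse document frequency of a keyword, per equation (5).

    Assumption: log is the natural logarithm.
    """
    return math.log(total_images / (images_with_keyword + 1)) + 1
```

With a fixed total, the value grows as fewer images contain the keyword, which matches the description of idf(t) as marking rare words.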
- The image score when performing an AND search with a plurality of search keywords is obtained by the following equation (6) as the product, over the search keywords t included in the query (search request) q, of the values obtained by multiplying the image score score(t, image) by the value of idf(t).
- The image score when performing an OR search with a plurality of search keywords is obtained by the following equation (7) as the sum, over the search keywords t included in the query q, of the values obtained by multiplying the image score score(t, image) by the value of idf(t).
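Equations (6) and (7) are not reproduced in this text. The following sketch assumes the reading suggested by the description: each keyword contributes score(t, image) multiplied by idf(t), and these contributions are multiplied together for an AND search and added for an OR search. The function name and the dictionary-based inputs are hypothetical.

```python
import math


def query_score(keyword_scores, keyword_idfs, mode="AND"):
    """Image score for a multi-keyword query.

    keyword_scores maps each keyword t to score(t, image);
    keyword_idfs maps each keyword t to idf(t).
    Assumption: AND combines the weighted scores by product, OR by sum.
    """
    weighted = [keyword_scores[t] * keyword_idfs[t] for t in keyword_scores]
    if mode == "AND":
        return math.prod(weighted)
    if mode == "OR":
        return sum(weighted)
    raise ValueError("mode must be 'AND' or 'OR'")
```

One consequence of the product form is that an image scoring zero for any keyword scores zero for the whole AND query, while the sum form still ranks it for an OR query.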
- In this way, the search unit 18 calculates a score value based on the ratio of the number of images including a keyword to the number of images stored in the image database 11.
- If weighting based on the amount of text in the image were applied, the score of a very large image of 600 × 10,000 pix (pixels) could become very low, or the score of a small banner image of about 20 × 100 pix could increase sharply. For this reason, in the present embodiment, it is not always necessary to perform weighting based on the amount of text in the image.
- The search unit 18 outputs to the output unit 19 information indicating the images hit by the keyword in the search, together with the image score score(q, image) of those images.
- The output unit 19 is output means for outputting the result of the search performed by the search unit 18.
- The output unit 19 outputs information indicating the images that hit the keyword.
- The information indicating an image output by the output unit 19 is based on the correspondence between the character candidates stored in the character candidate storage unit 15 and the images.
- For example, the output by the output unit 19 is performed by transmitting Web page information including the search result information to the administrator terminal 30.
- FIG. 6 shows an example in which the Web page is displayed on the browser of the administrator terminal 30. As shown in FIG. 6, the images that hit the keyword are displayed. Here, the displayed images are arranged in descending order of the image score score(q, image). That is, the output unit 19 outputs the search result of the search unit 18 based on the reliability, evaluated by the search unit 18, of the match between the keyword and the character candidates. Further, the output unit 19 outputs the search result of the search unit 18 according to the score value of each image calculated by the search unit 18.
- The output unit 19 also outputs information based on the information stored in the image database 11 in association with the hash value of each image.
- Specifically, the output unit 19 outputs information that is obtained by the search by the search means and indicates images without including the same image more than once, together with information indicating the Web sites where each image is used, which is stored in the image database 11 in association with the hash value of the image. More specifically, the output unit 19 outputs information indicating the images obtained by the search by the search unit 18, receives an input selecting an image in response to that output, and outputs information indicating the Web sites, stored in association with the hash value of the selected image, where the image is used.
- For example, the output unit 19 transmits to the administrator terminal 30 the data of a Web page that displays the images that hit the keyword as a result of the search by the search unit 18.
- At the administrator terminal 30, those images are displayed on the browser.
- A region A1 in FIG. 6 is the portion where the images that hit the keyword are displayed.
- When the user selects one of the images displayed on the browser, the administrator terminal 30 transmits information indicating the selected image to the image search apparatus 10.
- The output unit 19 receives the information indicating the selected image, refers to the image database 11, acquires the information indicating the Web sites associated with the hash value of the image, and outputs it to the administrator terminal 30.
- The output unit 19 also refers to the image database 11 and acquires information indicating the sales price of the product associated with the information indicating each Web site.
- When transmitting the information indicating the Web sites to the administrator terminal 30, the output unit 19 outputs it so that the Web sites are displayed in order of the sales price of the product (for example, in descending or ascending order of price). Further, when the information indicating a Web site is displayed on the administrator terminal 30, the sales price of the product and the description of the product on the Web site may be displayed together.
- A region A2 in FIG. 6 is the portion where the information indicating the Web sites where an image is used, the selling prices of the products, and the like are displayed. As described above, the output unit 19 outputs the information indicating the Web sites where the image is used according to the sales price stored in the image database 11.
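The output behavior described above (one entry per distinct image, then the Web sites using a selected image ordered by price) might be sketched as follows. The hash function is not named in this part of the text, so SHA-256 is used purely as a stand-in; the data shapes (lists of tuples, a dictionary keyed by hash) and all function names are hypothetical.

```python
import hashlib


def image_hash(image_bytes):
    """Hash value identifying identical image files (SHA-256 is an assumption)."""
    return hashlib.sha256(image_bytes).hexdigest()


def unique_hits(hits):
    """Collapse duplicate images and order the rest by descending score.

    hits is a list of (hash_value, image_score) pairs.
    """
    best = {}
    for h, score in hits:
        if h not in best or score > best[h]:
            best[h] = score
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


def sites_for_image(site_index, h, descending=True):
    """Web sites using the image with hash h, ordered by product price.

    site_index maps a hash value to a list of (site, price) pairs.
    """
    return sorted(site_index.get(h, []), key=lambda s: s[1], reverse=descending)
```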
- The functional configuration of the image search device 10 has been described above.
- Next, the processing executed by the image search apparatus 10 according to the present embodiment will be described with reference to the flowcharts of FIGS.
- First, the process until the information for image search is generated will be described using the flowchart of FIG. 7, and then the process of actually performing the image search will be described using the flowcharts of FIGS.
- An image to be searched is input to the image search apparatus 10, and the image registration unit 12 registers the image in the image database 11 (S01).
- Information accompanying the image, such as information indicating the Web site where the image is used and information on the sale price of the product related to the image, is also input, and, as described above, the image search apparatus 10 stores this information in association with the hash value of the image.
- The image is input from the administrator terminal 30, for example, by an operation of a user at the business operator that manages the cyber mall. When a plurality of images are input, each image is registered and the following processing is performed for each.
- The character string region extraction unit 13 extracts a character string region including a character string in the image stored in the image database 11 (S02, character string region extraction step).
- The extracted character string image is output from the character string region extraction unit 13 to the character candidate recognition unit 14.
- The character candidate recognition unit 14 divides the extracted image of the character string region into images of the characters constituting the character string (S03, character candidate recognition step). Subsequently, the character candidate recognition unit 14 performs character recognition processing on each of the divided images and specifies a predetermined number of character candidates for each character (S04, character candidate recognition step). Information indicating the character candidates specified in this way is output from the character candidate recognition unit 14 to the character candidate storage unit 15. When a plurality of character string regions are extracted in S02, the above processing is performed for each character string region.
- The character candidate storage unit 15 stores the information on the plurality of character candidates input from the character candidate recognition unit 14 so that the search unit 18 can search it during the search process (S05, character candidate storage step). The above is the processing until the information for image search is generated.
- A search keyword is input by the search keyword input unit 17 (S11, search keyword input step).
- The search keyword is input, for example, by receiving a search request including the keyword from the administrator terminal 30.
- The input search keyword is output from the search keyword input unit 17 to the search unit 18.
- The search unit 18 determines whether the input search keyword matches any of the character candidates stored in the character candidate storage unit 15, thereby performing a search using the keyword (S12, search step).
- Each character of the search keyword is denoted Keyword[i].
- Here, i is a subscript indicating the position of the character in the keyword.
- Keyword[1] represents the first character of the search keyword.
- The number of characters in the search keyword is Keyword.length.
- Let C[n][j] be a character candidate of the character string acquired from the image.
- Here, n is a subscript indicating the position of the character in the character string, and
- j is a subscript indicating the rank of the character candidate (similar to the description in the table of FIG. 1).
- N indicates the number of characters in the character string.
- In this flow, character candidates ranked up to the 30th are considered when determining a match with the keyword.
- In that case, the search process is terminated with the determination that the keyword and the character candidates of the character string did not match. If it is determined that the condition of S1210 is not satisfied (NO in S1210), the process returns to S1202. This is in order to determine the match between the next character of the keyword and the first character candidate of the next character of the character string.
- The match between the keyword and the character candidates constituting a character string is determined for all character strings to be searched. If a plurality of keywords were input in S11, the above determination is made for each of the keywords.
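The matching determination can be illustrated with the simplified sketch below. It is not a step-by-step transcription of the flowchart (the individual steps S1201 to S1210 are not fully recoverable from this text); it only implements the stated idea that each keyword character must appear, in order, among the top-ranked candidates of consecutive characters. Indices are 0-based in the code, unlike the 1-based Keyword[i] and C[n][j] of the text, and the function name is hypothetical.

```python
def match_ranks(keyword, candidates, max_rank=30):
    """Return the 1-based candidate ranks of the first match, or None.

    candidates[n][j] is the j-th ranked candidate for the n-th character
    of the recognized character string.
    """
    k = len(keyword)
    # Try every starting position of the keyword within the string.
    for start in range(len(candidates) - k + 1):
        ranks = []
        for i, ch in enumerate(keyword):
            top = candidates[start + i][:max_rank]
            if ch not in top:
                break  # this start position fails; try the next one
            ranks.append(top.index(ch) + 1)
        else:
            return ranks  # every keyword character matched in order
    return None
```

The returned ranks are exactly what a reliability calculation based on candidate ranks would need as input.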
- The score of the character string is calculated for each character string determined to match the keyword (S13, search step). Specifically, the score is calculated as follows. First, the search unit 18 calculates the character recognition reliability for the character string (character candidates) that matches the keyword, using the above-described equation (2) (S131, search step).
- Next, the visual feature amount of the image of the character string region related to the character string that matches the keyword is calculated using the above-described equation (1) (S132, visual feature amount calculating step).
- The visual feature amount is calculated by the visual feature amount calculation unit 16 in response to an instruction from the search unit 18.
- The calculation of the visual feature amount by the visual feature amount calculation unit 16 need not be performed at this timing; it may instead be performed in advance, for example at the same timing as S04, with the result stored, and the stored information referred to at this timing.
- Information indicating the calculated visual feature amount is output from the visual feature amount calculation unit 16 to the search unit 18.
- The search unit 18 calculates the character feature amount termscore(t, m), which is the score value of the character string, using the above-described equation (3) (S133, search step).
- Next, the search unit 18 calculates idf(t), a value indicating the usage rate of the keyword, using the above-described equation (5) (S14, search step).
- Then, the search unit 18 calculates the image score score(q, image) from the calculated character feature amount termscore(t, m) and idf(t), using one of the above-described equations (4), (6), and (7) (S15, search step).
- Information indicating the images including the character strings determined to match the keyword in S12 and information indicating the image scores are output from the search unit 18 to the output unit 19.
- The output unit 19 outputs the search result of the search unit 18 (S16, output step).
- The search result is output by generating, from the information input from the search unit 18, search result information corresponding to the search request from the administrator terminal 30 and transmitting it to the administrator terminal 30.
- The search result information is displayed on the administrator terminal 30 as information indicating the images including the character strings determined to match the keyword, in descending order of the image score as described above.
- As described above, information on the Web sites where the images are used is also transmitted from the output unit 19 to the administrator terminal 30. The user can recognize the search result by referring to the search result displayed on the administrator terminal 30. The above is the process of actually performing the image search in the image search apparatus 10.
- As described above, in the present embodiment the search result is output according to the visual feature amount of the character string, which is based on at least one of the size, color, shape, and decoration of the characters included in the image and the contrast between the character color and the background color. Therefore, for example, when a search keyword is used prominently in an image, the search result can rank that image higher. That is, this configuration makes it easier to detect cases where a search keyword is used prominently in an image. For example, even among images including the same character string, the score value is higher when the string is written in large characters, such as a title, than when it is written in small characters. This makes it possible to find expressions that are visually noticeable and have a high probability of being illegal.
- A search that uses the visual feature amount need not specify a plurality of character candidates; the character string may be uniquely recognized from the character string region.
- In that case, the character candidate recognition unit 14 of the image search apparatus 10 described above is character string recognition means that recognizes the character string included in the character string region extracted by the character string region extraction unit 13. Further, S03 and S04 in FIG. 7 constitute the character string recognition step of the image search method according to the present embodiment.
- When the visual feature amount is calculated from the lightness of the pixels constituting the image, as in the above-described embodiment, the visual feature amount can be extracted appropriately and reliably, and the present invention can be implemented appropriately and reliably.
- The feature of the present invention that uses the visual feature amount is based on the following finding by the inventor of the present invention. Even if an image uses a keyword that is searched for in order to detect illegal images, in many cases the expression is not necessarily improper, depending on how the keyword is used.
- As a preliminary experiment, the inventor of the present invention visually checked 674 images that the manager of the cyber mall had determined in advance to be illegal images.
- As a result, it was found that an image including an illegal expression has the following characteristics: (1) the illegal words are often visually conspicuous, (2) the appearance frequency of illegal words is high, and (3) a plurality of illegal words are included in the image.
- The score value of the image is calculated from the visual feature amount. Note that the appearance frequency and the like are also reflected in the above-described features of the present invention.
- A color combination with a brightness difference of 125 or more and a color difference of 500 or more is regarded as easy to read. In Web content production, it is known that contrast, in terms of both the brightness difference and the color difference between the character color and the background color, must be ensured to make content easy to read. In addition, our research, based on the evaluation results of 1600 samples in which the character color and the background color were each changed in 40 ways, shows that the lightness difference of the color scheme is strongly related to readability.
- A search with a plurality of search keywords, such as an AND search or an OR search, can be performed, which makes the search more convenient for the user.
- The search result can be used effectively. That is, it is possible to prevent the same image from being listed repeatedly in the search result, so that the user can efficiently find the images being sought. For example, it is possible to prevent many copies of the same image containing the keyword searched by the user from being lined up in the search result.
- In the above-described embodiment, the image search apparatus 10 performs both the processing until the information for image search is generated and the processing of actually performing the image search using the generated information. However, apparatuses that each perform only one of these processes may be configured separately as apparatuses according to the present invention. That is, one of the apparatuses is an image search information generating apparatus, which is an information processing apparatus including at least the image database 11, the character string region extraction unit 13, the character candidate recognition unit 14, the character candidate storage unit 15, and the visual feature amount calculation unit 16 among the functions described above.
- The other apparatus is an image search apparatus, which is an information processing apparatus including at least the character candidate storage unit 15, the search keyword input unit 17, the search unit 18, and the output unit 19 among the functions described above.
- In the above-described embodiment, the description image of a product sold at a cyber mall was described as an example of the search target image.
- However, the search target image is not limited to this, and any image may be a search target.
- For example, the present invention can also be applied to a case where a search is performed on books that have been converted into electronic data.
- The purpose of the image search according to the present invention is not limited to the above, and the search may be used for any purpose.
- The search may also be performed using a criterion other than the above-described criterion. For example, when detecting an illegal expression written in small characters, a criterion that increases the score as the characters become smaller may be used.
- The search keywords are, for example, white skin, cells, hair growth, hair loss, hay fever, rejuvenation, and anti-aging.
- Using sample images containing illegal expressions that the administrator of the cyber mall had detected in the “medicine / contact / care category”, the characters in the images were recognized by the above-described method, and the recognition results were obtained.
- As the character categories, 3410 characters including English letters, numerals, symbols, hiragana, katakana, and kanji (JIS level 1) are used.
- Three fonts were used: “style”, “HGP line typeface”, and “MS Gothic”.
- FIG. 10 is a graph showing the relationship between the number of character candidates and the above values. As shown in FIG. 10, increasing the number of character candidates tends to reduce precision and increase recall, which shows that using multiple character recognition results per character can reduce search omissions. In addition, the F value is stable when the number of character candidates is around 30, and the difference in search performance is small beyond 30 character candidates, so with the character recognition method of this embodiment, a favorable search result is obtained by using character candidates up to the 30th place.
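For reference, the three evaluation values discussed here can be computed as in the sketch below. The definition of the F value is not visible in this part of the text; the usual harmonic mean of precision and recall is assumed, and the function name and set-based inputs are hypothetical.

```python
def precision_recall_f(retrieved, relevant):
    """Precision, recall and F value of a search result.

    retrieved: identifiers of the images returned by the search.
    relevant: identifiers of the images that actually contain the keyword.
    Assumption: F is the harmonic mean of precision and recall.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_value = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_value
```

Consulting more character candidates enlarges the retrieved set, which is why recall can rise while precision falls.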
- The table of FIG. 11 shows the relationship between the length of the search keyword and the search accuracy when the number of character candidates is 30.
- When the search keyword is short, search errors tend to occur and precision tends to be low. This is because increasing the number of character candidates increases the probability of hitting a misrecognized character recognition result; precision can be increased by adjusting the number of character candidates according to the length of the search keyword.
- Recall is low overall. This is because the sample images include many cases in which character extraction and recognition are difficult, such as character strings arranged in an arch, italic characters, and small characters.
- The table of FIG. 13 shows the result of calculating the scores of the sample images while changing the parameter λ (the weight in equation (3)) that balances the above-described character recognition reliability and visual feature amount from 0.0 to 1.0 in increments of 0.2.
- If the visual feature amount saliency(t) described above were 0.0, the visual features of the characters in the image could not be reflected in the score by equation (3). Therefore, low is set to 0.5, high is set to 1.0, and medium is set to 0.75, which is the intermediate value.
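To illustrate how the quantitative values for low, medium, and high interact with the weight, the following sketch sweeps the weight from 0.0 to 1.0 in steps of 0.2 for a single matched string. The mapping uses the three values given above; the function name, the rounding to three decimals, and the sample inputs are choices made only for this illustration and do not reproduce the results of FIG. 13.

```python
# Quantitative values assigned to the qualitative labels of FIG. 4.
SALIENCY = {"low": 0.5, "medium": 0.75, "high": 1.0}


def sweep(similarity, label, weights=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0)):
    """Character feature amount of one string for each weight value."""
    saliency = SALIENCY[label]
    return [round((1 - w) * similarity + w * saliency, 3) for w in weights]
```

For a perfectly recognized but visually inconspicuous string, `sweep(1.0, "low")` falls steadily from 1.0 to 0.5 as the weight moves toward the visual feature amount.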
- The image search apparatus 10 was implemented as an in-image character search system.
- The created system is a Web application that runs on a Web server.
- Lucene, a full-text search engine managed by the Apache project, was used.
- The word segmentation analyzer is a uni-gram analyzer,
- using the N-gram implemented in Lucene.
- The image score corresponding to the search keyword is calculated by giving the field, at the time of index creation, the visual feature amount obtained from the contrast between the character color and the background color and from the character size.
- The recognition results obtained by performing in-image character recognition on the images in advance are indexed.
- Using the 66 search keywords used for evaluating the change in accuracy of in-image character search according to the number N of character candidates, searches were performed on the indexes with 1 to 30 character candidates, and the search time was measured for each number of character candidates.
- FIG. 14 is a graph showing the relationship between the number of character candidates and the search time.
- The search time increases as O(n) in the number of character candidates.
- The average search time is about 350 milliseconds, which shows that response performance sufficient for practical use, in the sense that the user feels no stress, is achieved.
- The average search time is the average time when the above 66 keywords are used as queries and the search is performed 10 times.
- The information processing program 41 is stored in a program storage area 40a formed in a recording medium 40 that is either inserted into a computer and accessed or provided in the computer.
- The information processing program 41 includes a main module 41a that centrally controls image search processing, an image database module 41b, an image registration module 41c, a character string extraction module 41d, a character candidate recognition module 41e, a character candidate storage module 41f, a visual feature amount calculation module 41g, a search keyword input module 41h, a search module 41i, and an output module 41j.
- The functions realized by executing the modules from the image database module 41b through the search module 41i and the output module 41j correspond, respectively, to the functions of the image database 11, the image registration unit 12, the character string region extraction unit 13, the character candidate recognition unit 14, the character candidate storage unit 15, the visual feature amount calculation unit 16, the search keyword input unit 17, the search unit 18, and the output unit 19 of the image search device 10 described above.
- The modules of the information processing program 41 may be installed across a plurality of computers rather than on a single computer. In that case, the series of information processing of the information processing program 41 described above is performed by a computer system made up of the plurality of computers.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Library & Information Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Evolutionary Computation (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Processing Or Creating Images (AREA)
- Character Input (AREA)
- Character Discrimination (AREA)
Abstract
Description
L=0.298912R+0.586611G+0.114478B (1)
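The visual feature amount calculation unit 16 obtains the absolute value of the difference between the lightness L of the character color and the lightness L of the background color. Next, following the table of FIG. 4, the visual feature amount calculation unit 16 calculates the visual feature amount of the character string from the obtained character size and lightness difference. In the table shown in FIG. 4 the visual feature amounts are given qualitative labels such as low, medium, and high, but these may be converted into quantitative values. The visual feature amount calculation unit 16 outputs information indicating the calculated visual feature amount of the character string to the search unit 18.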
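As a rough illustration of Equation (1) and the lightness difference described above, the following Python sketch computes L for a character color and a background color and takes the absolute difference. The function names and the `SALIENCY_VALUES` mapping are hypothetical. The values 0.5, 0.75, and 1.0 are the ones stated elsewhere in this document for low, high, and medium. The character-size and lightness-difference lookup of FIG. 4 is not reproduced because its thresholds are not given here.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def lightness(rgb: RGB) -> float:
    """Lightness L per Equation (1): L = 0.298912R + 0.586611G + 0.114478B."""
    r, g, b = rgb
    return 0.298912 * r + 0.586611 * g + 0.114478 * b


def lightness_difference(text_rgb: RGB, background_rgb: RGB) -> float:
    """Absolute difference between character lightness and background lightness."""
    return abs(lightness(text_rgb) - lightness(background_rgb))


# Hypothetical mapping from the qualitative labels to quantitative values.
SALIENCY_VALUES = {"low": 0.5, "medium": 0.75, "high": 1.0}


def saliency_value(level: str) -> float:
    """Convert a qualitative label (low / medium / high) into a number."""
    return SALIENCY_VALUES[level]
```

Black text on a white background gives the largest possible difference under this formula, while identical colors give zero.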
similarity(t)=Keyword(t).length/totalscore(t) (2)
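In the above equation, Keyword(t).length is the length (number of characters) of the keyword t, and totalscore(t) is the sum of the ranks of the character candidates that matched. Note that the character recognition reliability of a character string that matches the keyword using first candidates only is 1.0.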
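Equation (2) can be sketched in Python as follows. This is only an illustration under stated assumptions: the recognition result is modeled as a list of ranked candidate lists (one per character position, best candidate first), ranks are 1-based, and the keyword is matched at a fixed start position. The function name and data layout are hypothetical, not taken from the source.

```python
from typing import Optional, Sequence


def similarity(keyword: str,
               candidates: Sequence[Sequence[str]],
               start: int = 0) -> Optional[float]:
    """Equation (2): keyword length divided by the sum of matched candidate ranks.

    Returns None when the keyword does not match at the given position.
    """
    if not keyword or start + len(keyword) > len(candidates):
        return None
    total_rank = 0
    for offset, ch in enumerate(keyword):
        ranked = list(candidates[start + offset])
        if ch not in ranked:
            return None  # this character is not among the stored candidates
        total_rank += ranked.index(ch) + 1  # rank of the matched candidate (1-based)
    return len(keyword) / total_rank
```

With candidates `[["c", "e"], ["a", "o"], ["t", "f"]]`, the keyword "cat" matches on first candidates only and scores 1.0, while "cot" uses one second-rank candidate and scores 3/4.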
termscore(t,m)=(1-α)・similarity(t,m)+α・saliency(t,m) (3)
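Here, α is a value that sets the weighting between the visual feature amount and the character recognition reliability. α is a preset value from 0 to 1. There is a correlation among α, similarity, and saliency, and this yields more accurate search results. It is desirable to decide how to set α, that is, how similarity and saliency should be weighted, according to how the search is used and its purpose. When α = 0, the score reflects only the character recognition reliability similarity, and the visual feature amount saliency is not considered. Conversely, when α = 1, the score reflects only the visual feature amount saliency, and the character recognition reliability similarity is not considered. The closer α is to 1, the worse the results become from the standpoint of whether the characters actually match. The m-th character string in the image indicates which of the character strings in the plurality of character string regions extracted by the character string region extraction unit 13 is meant. m takes one of the values from 1 to the number of character string regions extracted by the character string region extraction unit 13.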
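The weighting in Equation (3) can be illustrated with a short Python sketch. The function names are hypothetical, and the sweep from 0.0 to 1.0 in steps of 0.2 mirrors the increments reported for the evaluation of α. It is not the authors' implementation.

```python
from typing import List, Tuple


def termscore(similarity: float, saliency: float, alpha: float) -> float:
    """Equation (3): (1 - alpha) * similarity + alpha * saliency."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return (1.0 - alpha) * similarity + alpha * saliency


def alpha_sweep(similarity: float, saliency: float,
                steps: int = 5) -> List[Tuple[float, float]]:
    """Evaluate termscore for alpha = 0, 1/steps, ..., 1."""
    return [(i / steps, termscore(similarity, saliency, i / steps))
            for i in range(steps + 1)]
```

At α = 0 the result equals the similarity value and at α = 1 it equals the saliency value, which matches the two limiting cases described above.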
idf(t)=log(A/(S+1))+1 (5)
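A minimal Python sketch of Equation (5) follows. The symbols are not defined in this part of the text, so the sketch assumes that A is the total number of images in the database and S is the number of images containing the keyword t, in line with the claim that the score is based on the proportion of images containing the keyword. The natural logarithm is also an assumption.

```python
import math


def idf(total_images: int, images_with_keyword: int) -> float:
    """Equation (5): idf(t) = log(A / (S + 1)) + 1.

    total_images (A) and images_with_keyword (S) are assumed meanings.
    """
    if total_images <= 0 or images_with_keyword < 0:
        raise ValueError("A must be positive and S must be non-negative")
    return math.log(total_images / (images_with_keyword + 1)) + 1.0
```

Under these assumptions a keyword that appears in few images receives a larger weight than one that appears in many.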
However, where cases such as those described above are not assumed, information does not necessarily have to be stored or output using hash values.
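For reference, storing images keyed by a hash value, so that the same image is kept only once while every Web page location that uses it is recorded, might look like the following Python sketch. The class and method names are hypothetical, and SHA-256 is an assumed hash function. The source does not specify one.

```python
import hashlib
from typing import Dict, Set


class ImageStore:
    """Keeps one copy per distinct image and the page locations that use it."""

    def __init__(self) -> None:
        self._images: Dict[str, bytes] = {}
        self._locations: Dict[str, Set[str]] = {}

    def register(self, image_bytes: bytes, page_url: str) -> str:
        """Store the image once and associate its hash with the page location."""
        digest = hashlib.sha256(image_bytes).hexdigest()  # assumed hash function
        self._images.setdefault(digest, image_bytes)
        self._locations.setdefault(digest, set()).add(page_url)
        return digest

    def locations(self, digest: str) -> Set[str]:
        """Return a copy of the page locations recorded for a given image hash."""
        return set(self._locations.get(digest, set()))

    def __len__(self) -> int:
        return len(self._images)
```

Registering identical bytes from two pages yields a single stored image with two recorded locations.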
Recall=T/S (8)
Precision=T/(T+E) (9)
F=(2・Recall・Precision)/(Recall+Precision) (10)
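Equations (8) to (10) can be sketched in Python as below. The symbols are not defined in this part of the text, so the sketch assumes that T is the number of correctly retrieved images, S is the number of images that should have been retrieved, and E is the number of erroneously retrieved images. F is computed as the standard harmonic mean of recall and precision. The function names are hypothetical.

```python
def recall(t: int, s: int) -> float:
    """Equation (8): Recall = T / S."""
    return t / s


def precision(t: int, e: int) -> float:
    """Equation (9): Precision = T / (T + E)."""
    return t / (t + e)


def f_measure(r: float, p: float) -> float:
    """Equation (10) as the harmonic mean: F = 2 * R * P / (R + P)."""
    if r + p == 0:
        return 0.0  # avoid division by zero when nothing is retrieved correctly
    return 2 * r * p / (r + p)
```

For example, with T = 8, S = 10, and E = 2, recall and precision are both 0.8, so F is also about 0.8.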
Number of images: 567,667
Index size (GB): 2.2 (N=1), 2.8 (N=5), 3.6 (N=10), 4.4 (N=15), 5.2 (N=20), 6.0 (N=25), 6.8 (N=30)
Claims (10)
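- An information processing device comprising:
an image database that stores images to be searched;
character string region extraction means for extracting a character string region containing a character string in an image stored in the image database;
character string recognition means for recognizing the character string contained in the character string region extracted by the character string region extraction means; and
visual feature amount calculation means for calculating and storing, from the image of the character string region extracted by the character string region extraction means, a visual feature amount of the character string based on at least one of the size, color, shape, and decoration of the characters constituting the character string and the contrast between the character color and the background color.
- The information processing device according to claim 1, wherein the visual feature amount calculation means calculates and stores a visual feature amount for each character constituting the character string.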
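- The information processing device according to claim 1 or 2, wherein the visual feature amount calculation means calculates the visual feature amount based on the difference between the lightness of the pixels determined to constitute the character string of the character string region and the lightness of the pixels determined to constitute the background of the character string region.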
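- The information processing device according to claim 3, wherein the visual feature amount calculation means takes, as the lightness of the pixels determined to constitute the character string of the character string region, the lightness of the pixels of the most frequent color among those pixels, and takes, as the lightness of the pixels determined to constitute the background of the character string region, the lightness of the pixels of the most frequent color among those pixels.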
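- The information processing device according to any one of claims 1 to 4, further comprising:
search keyword input means for inputting a keyword for search;
search means for searching whether the keyword input by the keyword input means matches at least part of a character string recognized by the character string recognition means, and for calculating, from the visual feature amount of the character string region in which the matching character string was recognized, a score value of the image containing that character string; and
output means for outputting the search results of the search means according to the score values calculated by the search means.
- The information processing device according to claim 5, wherein the search means calculates the score value for a plurality of the keywords input by the keyword input means.
- The information processing device according to claim 5 or 6, wherein the search means calculates the score value based on the proportion of images containing the keyword among the images stored in the image database.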
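- The information processing device according to any one of claims 5 to 7, wherein
the image database stores the images to be searched so that it does not contain multiple copies of the same image, and stores a hash value obtained from each image in association with information indicating the location of the Web page where that image is used, and
the output means outputs information indicating images that were obtained by the search of the search means and that do not include multiple copies of the same image, together with the information, stored in the image database in association with the hash value of each such image, indicating the location where that image is used.
- An information processing method, being an image search method performed by an information processing device that comprises an image database storing images to be searched, the method comprising:
a character string region extraction step of extracting a character string region containing a character string in an image stored in the image database;
a character string recognition step of recognizing the character string contained in the character string region extracted in the character string region extraction step; and
a visual feature amount calculation step of calculating and storing, from the image of the character string region extracted in the character string region extraction step, a visual feature amount of the character string based on at least one of the size, color, shape, and decoration of the characters constituting the character string and the contrast between the character color and the background color.
- A computer-readable recording medium on which is recorded an information processing program that causes one or more computers to function as:
an image database that stores images to be searched;
character string region extraction means for extracting a character string region containing a character string in an image stored in the image database;
character string recognition means for recognizing the character string contained in the character string region extracted by the character string region extraction means; and
visual feature amount calculation means for calculating and storing, from the image of the character string region extracted by the character string region extraction means, a visual feature amount of the character string based on at least one of the size, color, shape, and decoration of the characters constituting the character string and the contrast between the character color and the background color.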
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201180010551.0A CN102782680B (zh) | 2010-02-26 | 2011-02-28 | 信息处理装置、信息处理方法、记录了信息处理程序的记录介质 |
EP11747562.4A EP2541441A4 (en) | 2010-02-26 | 2011-02-28 | INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND RECORDING MEDIUM CONTAINING INFORMATION PROCESSING PROGRAM |
JP2012501908A JP5259876B2 (ja) | 2010-02-26 | 2011-02-28 | 情報処理装置、情報処理方法、情報処理プログラムを記録した記録媒体 |
US13/580,789 US8825670B2 (en) | 2010-02-26 | 2011-02-28 | Information processing device, information processing method, and recording medium that has recorded information processing program |
Applications Claiming Priority (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2010-043468 | 2010-02-26 | ||
JP2010043468 | 2010-02-26 | ||
JP2010043469 | 2010-02-26 | ||
JP2010-043469 | 2010-02-26 | ||
JP2010-194422 | 2010-08-31 | ||
JP2010-194410 | 2010-08-31 | ||
JP2010194431 | 2010-08-31 | ||
JP2010-194431 | 2010-08-31 | ||
JP2010194422 | 2010-08-31 | ||
JP2010194410 | 2010-08-31 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011105608A1 (ja) | 2011-09-01 |
Family
ID=44507001
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/054527 WO2011105607A1 (ja) | 2010-02-26 | 2011-02-28 | 情報処理装置、情報処理方法、情報処理プログラムを記録した記録媒体 |
PCT/JP2011/054528 WO2011105608A1 (ja) | 2010-02-26 | 2011-02-28 | 情報処理装置、情報処理方法、情報処理プログラムを記録した記録媒体 |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2011/054527 WO2011105607A1 (ja) | 2010-02-26 | 2011-02-28 | 情報処理装置、情報処理方法、情報処理プログラムを記録した記録媒体 |
Country Status (5)
Country | Link |
---|---|
US (2) | US8825670B2 (ja) |
EP (2) | EP2541441A4 (ja) |
JP (4) | JP5647916B2 (ja) |
CN (2) | CN102763104B (ja) |
WO (2) | WO2011105607A1 (ja) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210345758A1 (en) * | 2015-06-11 | 2021-11-11 | The Procter & Gamble Company | Apparatus and methods for modifying keratinous surfaces |
Families Citing this family (42)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7030863B2 (en) | 2000-05-26 | 2006-04-18 | America Online, Incorporated | Virtual keyboard system with automatic correction |
US7286115B2 (en) | 2000-05-26 | 2007-10-23 | Tegic Communications, Inc. | Directional input system with automatic correction |
US8201087B2 (en) * | 2007-02-01 | 2012-06-12 | Tegic Communications, Inc. | Spell-check for a keyboard system with automatic correction |
WO2013021889A1 (ja) * | 2011-08-05 | 2013-02-14 | 楽天株式会社 | 色名決定装置、色名決定方法、情報記録媒体、ならびに、プログラム |
KR102007840B1 (ko) * | 2012-04-13 | 2019-08-06 | 엘지전자 주식회사 | 이미지 검색 방법 및 이를 위한 디지털 디바이스 |
US9576042B2 (en) * | 2012-08-01 | 2017-02-21 | Google Inc. | Categorizing search terms |
JP5831420B2 (ja) * | 2012-09-28 | 2015-12-09 | オムロン株式会社 | 画像処理装置および画像処理方法 |
JP2014078168A (ja) * | 2012-10-11 | 2014-05-01 | Fuji Xerox Co Ltd | 文字認識装置及びプログラム |
US9916081B2 (en) * | 2013-02-01 | 2018-03-13 | Intel Corporation | Techniques for image-based search using touch controls |
US9910887B2 (en) * | 2013-04-25 | 2018-03-06 | Facebook, Inc. | Variable search query vertical access |
KR101845780B1 (ko) | 2013-07-09 | 2018-04-05 | 류중하 | 기호 이미지 검색 서비스 제공 방법 및 이에 사용되는 기호 이미지 검색용 서버 |
CN104298982B (zh) * | 2013-07-16 | 2019-03-08 | 深圳市腾讯计算机系统有限公司 | 一种文字识别方法及装置 |
CN104462109B (zh) * | 2013-09-17 | 2018-10-26 | 阿尔派株式会社 | 检索装置及检索方法 |
CN110032656A (zh) * | 2014-02-21 | 2019-07-19 | 联想(北京)有限公司 | 信息处理方法及信息处理装置 |
US10152540B2 (en) * | 2014-10-10 | 2018-12-11 | Qualcomm Incorporated | Linking thumbnail of image to web page |
WO2016082092A1 (en) * | 2014-11-25 | 2016-06-02 | Yahoo! Inc. | Method and system for analyzing user agent string |
US10025847B2 (en) | 2014-11-25 | 2018-07-17 | Oath Inc. | Method and system for providing a user agent string database |
KR20170037302A (ko) * | 2015-09-25 | 2017-04-04 | 삼성전자주식회사 | 전자 장치 및 이의 제어 방법 |
CN105912739B (zh) * | 2016-07-14 | 2019-03-26 | 湖南琴海数码股份有限公司 | 一种相似图片检索系统及其方法 |
JP2018028714A (ja) * | 2016-08-15 | 2018-02-22 | 富士ゼロックス株式会社 | 情報処理装置及びプログラム |
US10565255B2 (en) * | 2016-08-24 | 2020-02-18 | Baidu Usa Llc | Method and system for selecting images based on user contextual information in response to search queries |
CN106372225B (zh) * | 2016-09-07 | 2020-05-19 | 知识产权出版社有限责任公司 | 一种基于高价值对比库的信息处理装置及方法 |
US10438083B1 (en) * | 2016-09-27 | 2019-10-08 | Matrox Electronic Systems Ltd. | Method and system for processing candidate strings generated by an optical character recognition process |
JP6804292B2 (ja) * | 2016-12-28 | 2020-12-23 | オムロンヘルスケア株式会社 | 端末装置 |
US11157299B2 (en) | 2017-08-15 | 2021-10-26 | Citrix Systems, Inc. | Thin provisioning virtual desktop infrastructure virtual machines in cloud environments without thin clone support |
CN107707396B (zh) * | 2017-09-28 | 2020-01-24 | 平安科技(深圳)有限公司 | 一种乱码监控方法、存储介质和服务器 |
JP6506427B1 (ja) * | 2018-01-25 | 2019-04-24 | 株式会社リクルート | 情報処理装置、動画検索方法、生成方法及びプログラム |
JP7160432B2 (ja) * | 2018-04-02 | 2022-10-25 | 日本電気株式会社 | 画像処理装置、画像処理方法、プログラム |
JP7139669B2 (ja) * | 2018-04-17 | 2022-09-21 | 富士フイルムビジネスイノベーション株式会社 | 情報処理装置及びプログラム |
JP7247472B2 (ja) * | 2018-04-19 | 2023-03-29 | 富士フイルムビジネスイノベーション株式会社 | 情報処理装置及びプログラム |
CN112868001B (zh) | 2018-10-04 | 2024-04-26 | 株式会社力森诺科 | 文档检索装置、文档检索程序、文档检索方法 |
JP2020064390A (ja) * | 2018-10-16 | 2020-04-23 | ファナック株式会社 | データ収集システム及びデータ収集方法 |
JP7383882B2 (ja) * | 2019-01-22 | 2023-11-21 | 富士フイルムビジネスイノベーション株式会社 | 情報処理装置、及び情報処理プログラム |
CN111027556B (zh) * | 2019-03-11 | 2023-12-22 | 广东小天才科技有限公司 | 一种基于图像预处理的搜题方法及学习设备 |
WO2020194576A1 (ja) * | 2019-03-27 | 2020-10-01 | 三菱電機ビルテクノサービス株式会社 | 設備機器情報収集システム |
CN110399772B (zh) * | 2019-04-15 | 2020-09-08 | 安徽省徽腾智能交通科技有限公司泗县分公司 | 基于环境分析的设备控制系统 |
CN110688995B (zh) * | 2019-09-19 | 2022-11-15 | 浙江善政科技有限公司 | 地图查询的处理方法,计算机可读存储介质和移动终端 |
JP6879529B1 (ja) * | 2020-04-16 | 2021-06-02 | 株式会社クロスドリーム | 商品・役務注文システム、商品・役務注文方法及びそのプログラム |
JP2021193495A (ja) * | 2020-06-08 | 2021-12-23 | コニカミノルタ株式会社 | 検索システム |
CN113626444B (zh) * | 2021-08-26 | 2023-11-28 | 平安国际智慧城市科技股份有限公司 | 基于位图算法的表格查询方法、装置、设备及介质 |
CN114120016B (zh) * | 2022-01-26 | 2022-05-27 | 北京阿丘科技有限公司 | 字符串提取方法、装置、设备及存储介质 |
CN118334639B (zh) * | 2024-06-12 | 2024-08-23 | 深圳市瑞意博医疗设备有限公司 | 一种药品复核方法和系统 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001337993A (ja) | 2000-05-30 | 2001-12-07 | Fujitsu Ltd | 文字認識結果を利用して情報を検索する検索装置および方法 |
JP2008288898A (ja) * | 2007-05-17 | 2008-11-27 | Canon Inc | 動画撮像装置及び動画撮像方法 |
JP2009295104A (ja) * | 2008-06-09 | 2009-12-17 | Fujifilm Corp | ウェブサイト検索装置、画像情報収集サーバ、及びウェブサイト検索方法 |
JP2010039533A (ja) * | 2008-07-31 | 2010-02-18 | Fujifilm Corp | 画像ランキング装置、画像ランキング方法及びプログラム |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6415307B2 (en) | 1994-10-24 | 2002-07-02 | P2I Limited | Publication file conversion and display |
JP3230641B2 (ja) | 1995-05-08 | 2001-11-19 | シャープ株式会社 | 文字列検索装置 |
JPH10177641A (ja) * | 1996-12-18 | 1998-06-30 | Fuji Xerox Co Ltd | 文書ファイリング装置 |
US6944344B2 (en) * | 2000-06-06 | 2005-09-13 | Matsushita Electric Industrial Co., Ltd. | Document search and retrieval apparatus, recording medium and program |
JP3669626B2 (ja) * | 2000-06-06 | 2005-07-13 | 松下電器産業株式会社 | 検索装置、記録媒体およびプログラム |
JP2002007413A (ja) * | 2000-06-20 | 2002-01-11 | Fujitsu Ltd | 画像検索装置 |
JP2004206520A (ja) | 2002-12-26 | 2004-07-22 | Nec Corp | 文書画像配信システム、文書画像配信装置、端末装置および文書画像配信プログラム |
US20030177115A1 (en) * | 2003-02-21 | 2003-09-18 | Stern Yonatan P. | System and method for automatic preparation and searching of scanned documents |
JP4349183B2 (ja) * | 2004-04-01 | 2009-10-21 | 富士ゼロックス株式会社 | 画像処理装置および画像処理方法 |
JP4817108B2 (ja) * | 2004-11-05 | 2011-11-16 | 富士ゼロックス株式会社 | 画像処理装置、画像処理方法及び画像処理プログラム |
US20090193334A1 (en) * | 2005-05-18 | 2009-07-30 | Exb Asset Management Gmbh | Predictive text input system and method involving two concurrent ranking means |
JP2007058605A (ja) * | 2005-08-24 | 2007-03-08 | Ricoh Co Ltd | 文書管理システム |
US8363939B1 (en) * | 2006-10-06 | 2013-01-29 | Hrl Laboratories, Llc | Visual attention and segmentation system |
JP2008139981A (ja) * | 2006-11-30 | 2008-06-19 | Sharp Corp | 制御装置、端末装置、表示システム、表示方法、プログラムおよびその記録媒体 |
US8094202B2 (en) | 2007-05-17 | 2012-01-10 | Canon Kabushiki Kaisha | Moving image capture apparatus and moving image capture method |
US7940985B2 (en) * | 2007-06-06 | 2011-05-10 | Microsoft Corporation | Salient object detection |
CN101354705B (zh) * | 2007-07-23 | 2012-06-13 | 夏普株式会社 | 文档图像处理装置和文档图像处理方法 |
JP2009075908A (ja) * | 2007-09-21 | 2009-04-09 | Sony Corp | ウェブ・ページ閲覧履歴管理システム及びウェブ・ページ閲覧履歴管理方法、並びにコンピュータ・プログラム |
EP2223265A1 (en) * | 2007-11-20 | 2010-09-01 | Lumex As | A method for resolving contradicting output data from an optical character recognition (ocr) system, wherein the output data comprises more than one recognition alternative for an image of a character |
JP2009282883A (ja) | 2008-05-26 | 2009-12-03 | Fujifilm Corp | 画像検索システム、クローリング装置及び画像検索装置 |
US8442813B1 (en) * | 2009-02-05 | 2013-05-14 | Google Inc. | Methods and systems for assessing the quality of automatically generated text |
US8542950B2 (en) * | 2009-06-02 | 2013-09-24 | Yahoo! Inc. | Finding iconic images |
US8811742B2 (en) * | 2009-12-02 | 2014-08-19 | Google Inc. | Identifying matching canonical documents consistent with visual query structural information |
-
2011
- 2011-02-28 US US13/580,789 patent/US8825670B2/en active Active
- 2011-02-28 JP JP2011042642A patent/JP5647916B2/ja active Active
- 2011-02-28 EP EP11747562.4A patent/EP2541441A4/en not_active Ceased
- 2011-02-28 JP JP2012501908A patent/JP5259876B2/ja active Active
- 2011-02-28 WO PCT/JP2011/054527 patent/WO2011105607A1/ja active Application Filing
- 2011-02-28 US US13/580,880 patent/US8949267B2/en active Active
- 2011-02-28 EP EP11747561.6A patent/EP2541440A4/en not_active Ceased
- 2011-02-28 CN CN201180010163.2A patent/CN102763104B/zh active Active
- 2011-02-28 JP JP2012501907A patent/JP5075291B2/ja active Active
- 2011-02-28 WO PCT/JP2011/054528 patent/WO2011105608A1/ja active Application Filing
- 2011-02-28 CN CN201180010551.0A patent/CN102782680B/zh active Active
-
2012
- 2012-10-10 JP JP2012225214A patent/JP2013041602A/ja not_active Withdrawn
Non-Patent Citations (5)
Title |
---|
ASHIDA; NAGAI; OKAMOTO; MIYAO; YAMAMOTO: "Extraction of Characters from Scene Images", TRANSACTIONS D OF IECE, vol. J88-D2, no. 9, 2005, pages 1817 - 1824 |
HASE; YONEDA; SAKAI; MARUYAMA: "Consideration of Color Segmentation to Extract Character Areas from Color Document Images", TRANSACTIONS D-II OFIECE, vol. J83-D-II, no. 5, 2000, pages 1294 - 1304, XP002907915 |
OTSU: "An Automatic Threshold Selection Method Based on Discriminant and Least Squares Criteria", TRANSACTIONS D OF IECE, vol. 63, no. 4, April 1980 (1980-04-01), pages 349 - 356 |
See also references of EP2541441A4 |
SON; TAWARA; ASO; KIMURA: "High-precision character recognition using directional element feature", TRANSACTIONS OF IECE, vol. J74-D-II, no. 3, 1991, pages 330 - 339 |
Also Published As
Publication number | Publication date |
---|---|
US8825670B2 (en) | 2014-09-02 |
EP2541441A1 (en) | 2013-01-02 |
CN102763104B (zh) | 2015-04-01 |
US20120323901A1 (en) | 2012-12-20 |
JP2013041602A (ja) | 2013-02-28 |
JPWO2011105608A1 (ja) | 2013-06-20 |
US8949267B2 (en) | 2015-02-03 |
CN102782680A (zh) | 2012-11-14 |
US20130188872A1 (en) | 2013-07-25 |
JP5075291B2 (ja) | 2012-11-21 |
CN102763104A (zh) | 2012-10-31 |
JP5647916B2 (ja) | 2015-01-07 |
JPWO2011105607A1 (ja) | 2013-06-20 |
EP2541441A4 (en) | 2014-10-15 |
JP2012073999A (ja) | 2012-04-12 |
EP2541440A1 (en) | 2013-01-02 |
JP5259876B2 (ja) | 2013-08-07 |
EP2541440A4 (en) | 2014-10-15 |
CN102782680B (zh) | 2016-01-20 |
WO2011105607A1 (ja) | 2011-09-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5259876B2 (ja) | 情報処理装置、情報処理方法、情報処理プログラムを記録した記録媒体 | |
AU2017272149B2 (en) | Identifying matching canonical documents in response to a visual query | |
CA3068761C (en) | Architecture for responding to a visual query | |
US9183224B2 (en) | Identifying matching canonical documents in response to a visual query | |
EP4057163B1 (en) | Facilitating use of images as search queries | |
JP2019023923A (ja) | ソーシャルネットワークの支援による顔認識 | |
US8325189B2 (en) | Information processing apparatus capable of easily generating graph for comparing of a plurality of commercial products | |
US20140280295A1 (en) | Multi-language information retrieval and advertising | |
CN105653562B (zh) | 一种文本内容与查询请求之间相关性的计算方法及装置 | |
US20110078176A1 (en) | Image search apparatus and method | |
CN105917334A (zh) | 搜索结果中的相干问题回答 | |
US20060047732A1 (en) | Document processing apparatus for searching documents, control method therefor, program for implementing the method, and storage medium storing the program | |
US20090276418A1 (en) | Information processing apparatus, information processing method, information processing program and recording medium | |
US11755659B2 (en) | Document search device, document search program, and document search method | |
US8549008B1 (en) | Determining section information of a digital volume | |
CN113806491A (zh) | 一种信息处理的方法、装置、设备和介质 | |
KR101440385B1 (ko) | 인디케이터를 이용한 정보 관리 장치 | |
CN111681776A (zh) | 基于医药大数据的医药对象关系分析的方法及系统 | |
JP5233424B2 (ja) | 検索装置およびプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | WWE | WIPO information: entry into national phase | Ref document number: 201180010551.0; Country of ref document: CN |
| | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 11747562; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | WIPO information: entry into national phase | Ref document number: 2012501908; Country of ref document: JP |
| | WWE | WIPO information: entry into national phase | Ref document number: 2011747562; Country of ref document: EP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | WWE | WIPO information: entry into national phase | Ref document number: 13580789; Country of ref document: US |