JP2011018311A

JP2011018311A - Device and program for retrieving image, and recording medium

Info

Publication number: JP2011018311A
Application number: JP2010007497A
Authority: JP
Inventors: Jilin Li; 季▲りん▼ 李; Zhi-Gang Fan; 志剛范; Atou Go; 亜棟呉; Ning Le; 寧楽
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2009-07-07
Filing date: 2010-01-15
Publication date: 2011-01-27
Also published as: CN101944091A

Abstract

PROBLEM TO BE SOLVED: To provide an image search device, an image search program, and a recording medium that speed up search processing and improve the search accuracy for document image data.
SOLUTION: In step A1, a preprocessing unit 130 binarizes the input image data as preprocessing. In step A2, a feature extraction unit 131 extracts a feature amount from the binarized image data. In step A3, a search unit 132 compares the feature amounts of the registered image data with the feature amount of the binarized image data and retrieves, from the registered image data, image data that is highly similar to the input image data. In step A4, the search result is output to a display unit 14. Here, the feature amount is an array in which the ratios fn of the word lengths of two adjacent words are arranged in word order.

Description

The present invention relates to an image search device, an image search program, and a recording medium for searching registered image data for specific image data.

Some image forming apparatuses, such as copiers, facsimile machines, printers, and multifunction machines that combine several of these functions, store input image data such as document images in a large-capacity storage device, so that image data that has been input and registered once can be read out and output again at any time.

Although this re-output function is convenient, finding the data to be re-output becomes difficult as the amount of registered data grows, so image search technology that retrieves the desired image data from among many pieces of image data becomes important.

Searching for image data requires calculating a similarity by comparing the registered image data with the input image data. The registered image data includes both document image data and non-document image data (photographs, figures, illustrations, and the like).

Because document image data consists of character images, the comparison is between character images, and similarity is harder to judge than for non-document image data.

The document image search device described in Patent Document 1 recognizes punctuation marks in document image data or search document data input through an input unit, counts the number of characters between punctuation marks, registers the counted numbers as an index, and searches for an index whose character counts match the counts between the punctuation marks of the search document data.

Patent Document 1: JP 2008-152502 A

The document image search device described in Patent Document 1 searches using the number of characters between punctuation marks as an index, and such an index represents only relatively coarse layout information. Document image data in which the number of characters between punctuation marks is the same but the characters themselves differ can easily exist, so a judgment based on the number of characters between punctuation marks cannot achieve sufficiently high search accuracy.

In addition, punctuation marks are smaller images than characters, so they are likely to be misrecognized when they are detected in the registered document image data and the input image data. If misrecognition occurs at this stage, the registered index becomes inaccurate.

Improving the recognition accuracy of punctuation marks in order to raise the search accuracy requires a higher scanning resolution. As a result, the data volume of each piece of document image data grows, which increases the required storage capacity and lowers the processing speed.

An object of the present invention is to provide an image search device, an image search program, and a recording medium that can speed up search processing and improve the search accuracy for document image data.

The present invention is an image search device that searches pre-registered document image data for document image data similar to input document image data, the device comprising:
a feature amount extraction unit that detects words contained in the input document image data and performs word division, calculates, for each pair of adjacent words, the ratio of the word lengths of the two words, and extracts, as the feature amount of the input document image data, an array of word length ratios in which the calculated ratios are arranged in word order;
a registered image storage unit that stores the registered document image data in association with the feature amounts of the registered document image data;
a search unit that searches the registered document image data for image data similar to the input document image data, based on the feature amounts of the registered document image data and the feature amount of the input document image data generated by the feature amount extraction unit; and
a display unit that displays, based on the search result from the search unit, the registered document image data that is similar to the input document image data.

In the present invention, the feature amount extraction unit calculates the ratio of word lengths using, as the word length, the number of pixels indicating the length of the region that constitutes a word.

In the present invention, the registered image storage unit stores the array of word length ratios in association with position information indicating where that array is located in the registered document image data,
the search unit detects the portion of the retrieved document image data that matches the array of word length ratios of the input document image data, and
the display unit displays the detected matching portion so that it can be distinguished from other portions.

The present invention is also an image search device that searches pre-registered document image data for document image data similar to input document image data, the device comprising:
a feature amount extraction unit that detects characters contained in the input document image data and performs character division, detects the circumscribed rectangle of each character, calculates a pixel density, which is the proportion of the circumscribed rectangle occupied by the pixels constituting the character, and extracts, as the feature amount of the input document image data, an array of pixel densities in which the calculated pixel densities are arranged in word order;
a registered image storage unit that stores the registered document image data in association with the feature amounts of the registered document image data;
a search unit that searches the registered document image data for image data similar to the input document image data, based on the feature amounts of the registered document image data and the feature amount of the input document image data generated by the feature amount extraction unit; and
a display unit that displays, based on the search result from the search unit, the registered document image data that is similar to the input document image data.

The present invention is also an image search program for causing a computer to function as the above image search device.

The present invention is also a computer-readable recording medium on which an image search program for causing a computer to function as the above image search device is recorded.

According to the present invention, the feature amount extraction unit detects words contained in the input document image data and performs word division, calculates, for each pair of adjacent words, the ratio of the word lengths of the two words, and extracts, as the feature amount of the input document image data, an array of word length ratios in which the calculated ratios are arranged in word order. The registered image storage unit stores the registered document image data in association with the feature amounts of the registered document image data, and the search unit searches the registered document image data for image data similar to the input document image data, based on the feature amounts of the registered document image data and the feature amount of the input document image data generated by the feature amount extraction unit.

The display unit displays, based on the search result from the search unit, the registered document image data that is similar to the input document image data.

When the ratio of the word lengths of two adjacent words is used as the feature amount, different sentences are less likely to yield the same feature amount than when the number of characters between punctuation marks is used as in the prior art, so the search accuracy for document image data can be improved.

Furthermore, word lengths are not misdetected even in document image data scanned at a relatively low resolution, so low-resolution document image data can be used. This speeds up search processing and also reduces the storage capacity needed to hold the document image data.

In addition, the ratio of word lengths does not change when the image is enlarged or reduced, so the search accuracy does not depend on the scaling factor of the image. The invention is therefore also effective for searching so-called N-up image data, in which each page image is reduced to 1/2 or 1/4 and multiple pages are combined into one piece of image data.

According to the present invention, the feature amount extraction unit calculates the ratio of word lengths using, as the word length, the number of pixels indicating the length of the region that constitutes a word.

Using the number of pixels makes it easy to detect the word length and, in turn, to calculate the ratio of word lengths.

According to the present invention, the registered image storage unit stores the array of word length ratios in association with position information indicating where that array is located in the registered document image data. The search unit detects the portion of the retrieved document image data that matches the array of word length ratios of the input document image data, and the display unit displays the detected matching portion so that it can be distinguished from other portions.

This makes it possible to search the registered document image data for document image data that contains a specific sentence, that is, to search by sentence content.

According to the present invention, the feature amount extraction unit detects characters contained in the input document image data and performs character division, detects the circumscribed rectangle of each character, calculates a pixel density, which is the proportion of the circumscribed rectangle occupied by the pixels constituting the character, and extracts, as the feature amount of the input document image data, an array of pixel densities in which the calculated pixel densities are arranged in word order. The registered image storage unit stores the registered document image data in association with the feature amounts of the registered document image data, and the search unit searches the registered document image data for image data similar to the input document image data, based on the feature amounts of the registered document image data and the feature amount of the input document image data generated by the feature amount extraction unit.

When the pixel density of characters is used as the feature amount, different sentences are less likely to yield the same feature amount than when the number of characters between punctuation marks is used as in the prior art, so the search accuracy for document image data can be improved.

Furthermore, the pixel density is not misdetected even in document image data scanned at a relatively low resolution, so low-resolution document image data can be used. This speeds up search processing and also reduces the storage capacity needed to hold the document image data.
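The pixel density defined above can be illustrated with a minimal sketch. It assumes a character has already been isolated as a small grid of 0 (white) and 1 (black) values; the function name `pixel_density` and the input format are illustrative assumptions, not taken from the patent.

```python
def pixel_density(glyph):
    """Return the share of black pixels inside the glyph's circumscribed rectangle.

    glyph is a list of rows of 0 (white) / 1 (black) values that contains
    exactly one character; this input format is an assumption of the sketch.
    """
    black = [(x, y) for y, row in enumerate(glyph)
             for x, value in enumerate(row) if value == 1]
    if not black:
        return 0.0  # no character pixels at all
    xs = [x for x, _ in black]
    ys = [y for _, y in black]
    # The circumscribed rectangle is the tightest box around the black pixels.
    width = max(xs) - min(xs) + 1
    height = max(ys) - min(ys) + 1
    return len(black) / (width * height)


# A tiny "L"-shaped glyph: 3 black pixels inside a 2 x 2 circumscribed rectangle.
glyph_l = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
density = pixel_density(glyph_l)
```

Because the density is a ratio of pixel counts, it is largely unaffected by the white margin around the character, which is ignored once the circumscribed rectangle is found.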

According to the present invention, the invention can also be supplied as an image search program for causing a computer to function as the above image search device, or as a computer-readable recording medium on which the image search program is recorded.

FIG. 1 is a block diagram showing the mechanical configuration of the image search device 10.
FIG. 2 is a block diagram showing the functional configuration of the image search device 10.
FIG. 3 is a block diagram showing the functional configuration of the image search unit 13.
FIG. 4 is a flowchart showing the search processing performed by the image search unit 13.
FIG. 5 is a flowchart showing the preprocessing of step A1 performed by the preprocessing unit 130.
FIG. 6 is a flowchart showing the feature extraction processing of step A2 performed by the feature extraction unit 131.
FIG. 7 is a schematic diagram showing word lengths and the ratios of two adjacent word lengths.
FIG. 8 is a schematic diagram for explaining the pixel density.

Preferred embodiments of the present invention are described in detail below with reference to the drawings.
FIG. 1 is a block diagram showing the mechanical configuration of the image search device 10. The image search device 10 includes a processor 4 and an external storage device 5 that stores, among other things, the software with which the processor 4 performs the actual processing.

The processor 4 performs the actual image search processing: it extracts a feature amount from input image data (hereinafter "input image data"), compares it with a plurality of pieces of pre-registered image data (hereinafter "registered image data"), and retrieves and displays the registered image data similar to the input image data. The actual processing in the processor 4 is executed by the software stored in the external storage device 5. The processor 4 is, for example, the main unit of an ordinary computer.

The external storage device 5 can be, for example, a hard disk that allows high-speed access. The external storage device 5 may also use a large-capacity device such as an optical disc in order to hold a large amount of registered image data. Temporary data created at each processing step during the search may be stored in the external storage device 5 or in a semiconductor memory built into the processor 4.

A keyboard 1 and a display device 3 are connected to the image search device 10. The keyboard 1 is used, for example, to enter instructions for running the various pieces of software.

The display device 3 displays images based on the input image data and the registered image data, the search results, and so on.

An image scanner 2 is also connected to the image search device 10. The image scanner 2 reads a document on which an image is printed and is used to capture the input image data and the registered image data.

Besides being input from the image scanner 2, the input image data and the registered image data can also be acquired by data communication from other devices on a network via a communication I/F (interface) 6. The communication I/F 6 is realized by, for example, a LAN card for connecting to a LAN (Local Area Network) or a modem card for connecting to the public switched telephone network and performing data communication.

FIG. 2 is a block diagram showing the functional configuration of the image search device 10. The image search device 10 includes an input unit 12, an image search unit 13, a display unit 14, and a registered image storage unit 15.

The input unit 12 inputs the input image data and the registered image data. In the hardware configuration shown in FIG. 1, the image scanner 2, the communication I/F 6, and the like correspond functionally to the input unit 12. The registered image data is image data that was input before the input image data and is stored in the registered image storage unit 15.

FIG. 3 is a block diagram showing the functional configuration of the image search unit 13. The image search unit 13 includes a preprocessing unit 130, a feature extraction unit 131, and a search unit 132.

The image search unit 13 extracts a feature amount from the input image data supplied by the input unit 12 and searches for images by comparing it with the feature amounts extracted in advance from the registered image data.

FIG. 4 is a flowchart showing the search processing performed by the image search unit 13. In step A1, the preprocessing unit 130 binarizes the input image data as preprocessing. In step A2, the feature extraction unit 131 extracts a feature amount from the binarized image data. In step A3, the search unit 132 compares the feature amounts of the registered image data with the feature amount of the binarized image data and retrieves, from the registered image data, image data that is highly similar to the input image data. In step A4, the search result is output to the display unit 14.

Each step is described in detail below. The preprocessing of step A1 performed by the preprocessing unit 130 is shown, for example, in the flowchart of FIG. 5.

When image data is input, step B1 determines whether it is color image data. If it is, the process proceeds to step B2, where the data is converted to grayscale image data based on its lightness component, and then to step B3. If it is not color image data, the process proceeds directly to step B3, which determines whether the data is grayscale image data. If it is grayscale image data, the process proceeds to step B4, where the data is binarized with a predetermined threshold and converted to binary image data; step B5 then outputs the binary image data and the process ends. If it is not grayscale image data, it is already binary image data, so step B5 outputs it as is and the process ends.
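The flow of steps B1 to B5 can be sketched as follows. This is a minimal illustration under stated assumptions: the caller passes the image kind explicitly as `mode`, the lightness is approximated with the commonly used Rec. 601 luma weights, and the threshold of 128 is arbitrary. The patent only specifies "the lightness component" and "a predetermined threshold", so these choices, like the names `luminance` and `preprocess`, are hypothetical.

```python
def luminance(rgb):
    """Approximate lightness of an (r, g, b) pixel; Rec. 601 weights are an assumption."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def preprocess(pixels, mode, threshold=128):
    """Convert an image to binary data, where 1 is a black pixel and 0 a white pixel.

    pixels: list of rows; mode: "color" ((r, g, b) tuples), "gray" (0-255 values)
    or "binary" (0/1 values).
    """
    if mode == "color":
        # Steps B1 -> B2: convert color data to grayscale.
        pixels = [[luminance(p) for p in row] for row in pixels]
        mode = "gray"
    if mode == "gray":
        # Steps B3 -> B4: dark pixels (below the threshold) become black (1).
        pixels = [[1 if value < threshold else 0 for value in row] for row in pixels]
    # Step B5: output the binary image data.
    return pixels


binary = preprocess([[(255, 255, 255), (0, 0, 0)], [(10, 10, 10), (250, 250, 250)]], "color")
```

Binary input falls through both conditions unchanged, which mirrors the branch in FIG. 5 that goes straight to step B5.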

Binary image data is so-called black-and-white image data in which each pixel has a value of either 0 or 1 (white pixel or black pixel). It is obtained by thresholding the gray level (density) of each pixel of the grayscale image data, which classifies every pixel as either a black pixel or a white pixel.

In document image data, the background is generally white and the characters are black, so the pixels classified as black by the binarization can be regarded as the pixels that constitute characters.

The feature extraction processing of step A2 performed by the feature extraction unit 131 is shown, for example, in the flowchart of FIG. 6.

When the binary image data produced by the preprocessing unit 130 is input in step C1, step C2 detects all the connected components (coupling elements) in the binary image data.

A connected component is a group of connected pixels of the same color. Whether black or white connected components are detected depends on whether the background of the input image data consists of black or white pixels. As noted above, the background is usually white and the character images are drawn in black, so this embodiment is described on the assumption that black connected components are detected. When the background is black, the characters are reversed-out characters drawn with white pixels, and in that case white connected components are detected instead.

Whether the background is black or white can be determined by a known background determination process. For example, the background is judged to be white if the proportion of black pixels in the whole image is smaller than a predetermined proportion, and black if it is larger.
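The ratio-based background test described above might look like the sketch below. The cut-off of 0.5 and the function names are assumptions; the patent only refers to "a predetermined proportion".

```python
def background_is_white(binary, max_black_ratio=0.5):
    """Judge the background colour from the share of black (1) pixels in the image."""
    total = sum(len(row) for row in binary)
    if total == 0:
        return True  # empty image: treat as white by convention (assumption)
    black = sum(sum(row) for row in binary)
    return black / total < max_black_ratio


def normalize_polarity(binary):
    """Return an image in which character pixels are 1, inverting reversed-out text."""
    if background_is_white(binary):
        return binary
    return [[1 - value for value in row] for row in binary]


dark_page = [[1, 1, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
normalized = normalize_polarity(dark_page)
```

Inverting reversed-out pages up front is one possible design choice: the later steps then only ever need to look for black connected components.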

Connected components can be detected by a known method. For example, for each line, the runs of mutually adjacent black pixels in that line (black runs) are detected, and the run length of each black run and the coordinates of the black pixels at its two ends are stored line by line. The coordinate system is fixed in advance, for example with the x axis parallel to the lines and the y axis perpendicular to them.

For the black runs on the lines directly above and below a line of interest in the y direction, if the x coordinate of a black pixel at either end of such a run falls within the x-coordinate range spanned by the end pixels of a black run on the line of interest, the two runs can be regarded as connected in the y direction. By shifting the line of interest one line at a time over the whole image data and detecting the connections in the x and y directions in this way, the connected components of black pixels are detected.
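A run-based detection of this kind can be sketched as below. The sketch is not the patent's exact procedure: it links runs on adjacent lines with a plain interval-overlap test (4-connectivity) and merges them with a small union-find structure, both of which are assumptions made for brevity. The function names are hypothetical.

```python
def black_runs(row):
    """Return (x_start, x_end), inclusive, for each maximal run of 1s in a row."""
    runs, start = [], None
    for x, value in enumerate(row):
        if value == 1 and start is None:
            start = x
        elif value != 1 and start is not None:
            runs.append((start, x - 1))
            start = None
    if start is not None:
        runs.append((start, len(row) - 1))
    return runs


def connected_components(binary):
    """Return sorted bounding boxes (x0, y0, x1, y1) of the black connected components."""
    parent = {}  # union-find forest over runs, keyed by (y, x_start, x_end)

    def find(key):
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    previous = []
    for y, row in enumerate(binary):
        current = [(y, start, end) for start, end in black_runs(row)]
        for run in current:
            parent[run] = run
            for prev in previous:
                # Runs on adjacent lines are linked when their x ranges overlap.
                if prev[1] <= run[2] and run[1] <= prev[2]:
                    parent[find(prev)] = find(run)
        previous = current

    boxes = {}
    for key in list(parent):
        y, start, end = key
        root = find(key)
        x0, y0, x1, y1 = boxes.get(root, (start, y, end, y))
        boxes[root] = (min(x0, start), min(y0, y), max(x1, end), max(y1, y))
    return sorted(boxes.values())


image = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
boxes = connected_components(image)
```

Working on runs rather than individual pixels keeps the number of items to merge small, which fits the line-by-line bookkeeping described in the text.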

In step C3, word division is performed based on the detected connected components. First, text lines (character strings) are extracted from the detected connected components, as a preliminary step to locating the words in the input document image. The run-length smoothing algorithm (RLSA), for example, can be used to extract the text lines. The detected connected components serve as the foreground separators in RLSA, and the text lines are reconstructed from them.

Once the text lines have been extracted, each text line is divided into words. In this embodiment the characters consist of black pixels and everything else is white, so the lengths of the white pixel regions (segments) along a horizontal pixel row are extracted. The average of these lengths is then calculated, and any white pixel region shorter than the average is treated as a gap between characters.

The black pixel regions on either side of an inter-character white pixel region belong to characters of the same word, so the set of these white and black pixel regions forms one word.

Applying this processing to every text line accomplishes the word division.
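The gap-averaging rule can be sketched as follows for a single text line. The input is assumed to be a per-column profile (1 if the column contains any black pixel, 0 otherwise); that representation and the name `split_words` are assumptions of this sketch, not details from the patent.

```python
def split_words(column_has_ink):
    """Split one text line into words; returns (x_start, x_end), inclusive, per word.

    White gaps shorter than the average gap length are treated as gaps between
    characters of the same word; the remaining gaps separate words.
    """
    # Collect the black segments (roughly, the characters) along the line.
    segments, start = [], None
    for x, value in enumerate(column_has_ink):
        if value == 1 and start is None:
            start = x
        elif value != 1 and start is not None:
            segments.append((start, x - 1))
            start = None
    if start is not None:
        segments.append((start, len(column_has_ink) - 1))
    if not segments:
        return []

    # Lengths of the white regions between consecutive black segments.
    gaps = [b[0] - a[1] - 1 for a, b in zip(segments, segments[1:])]
    if not gaps:
        return [segments[0]]
    mean_gap = sum(gaps) / len(gaps)

    words = [list(segments[0])]
    for gap, segment in zip(gaps, segments[1:]):
        if gap < mean_gap:
            words[-1][1] = segment[1]   # same word: extend it
        else:
            words.append(list(segment))  # word boundary: start a new word
    return [tuple(word) for word in words]


profile = [1, 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
words = split_words(profile)
```

One limitation of this simple rule: if every gap on a line has the same length (for example, a line holding a single word), no gap is shorter than the average, so each character would be reported as its own word. A practical implementation would probably need an extra guard for that case.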

ステップＣ４では、隣接する２つの単語における単語長の比を算出して画像データの特徴量を抽出する。 In step C4, the ratio of word lengths between two adjacent words is calculated to extract the feature amount of the image data.

単語分割によって分割された各単語の単語長は、ラインに沿ったｘ方向の画素数で表わされる。１つの単語の両端に位置する画素を検出し、この画素間に並ぶ画素数（両端画素を含む）を単語長とする。 The word length of each word divided by word division is represented by the number of pixels in the x direction along the line. Pixels located at both ends of one word are detected, and the number of pixels arranged between the pixels (including pixels at both ends) is set as the word length.

図７は、単語長および隣接する２つの単語長の比を示す模式図である。図７に示す例は、文書画像データの一部を抜き出して示しており、文書画像データにおいて、「Ｂａｓｅｄｏｎｔｈｅｒｅｓｕｌｔ」と記載された部分を示す。 FIG. 7 is a schematic diagram showing a word length and a ratio of two adjacent word lengths. The example shown in FIG. 7 shows a part of the document image data extracted, and shows a portion described as “Based on the result” in the document image data.

Word division splits this portion into pixel group 20 corresponding to "Based", pixel group 21 corresponding to "on", pixel group 22 corresponding to "the", and pixel group 23 corresponding to "result".

The word lengths d_1, d_2, d_3, and d_4 are detected for pixel groups 20, 21, 22, and 23, respectively. The ratio f_n of the word lengths of two adjacent words is then calculated as f_n = d_n / d_{n+1}, where the index n is counted over the words in one line or over all the words on one page. In the example of FIG. 7, the ratios are f_1 = d_1 / d_2, f_2 = d_2 / d_3, and f_3 = d_3 / d_4.

The array of calculated ratios f_n lists the word length ratios in the order in which the words appear. In the example of FIG. 7, where four words yield three ratios, the array f_1, f_2, f_3 is extracted as the feature amount.
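The ratio calculation can be illustrated with a short Python sketch. The function name `word_length_ratios` and the pixel spans in the usage example are hypothetical; only the definitions of the word length (pixels between both ends, inclusive) and of the ratio of adjacent word lengths come from the text.

```python
def word_length_ratios(spans):
    """Return the array of ratios of adjacent word lengths.

    `spans` is a list of (start_x, end_x) pixel columns, inclusive, one per
    word, in reading order. Each word length counts both end pixels.
    """
    lengths = [end - start + 1 for start, end in spans]
    # Ratio i divides the length of word i by the length of word i + 1.
    return [lengths[i] / lengths[i + 1] for i in range(len(lengths) - 1)]


# Hypothetical spans for four words such as "Based", "on", "the", "result".
spans = [(0, 49), (60, 79), (90, 119), (130, 189)]
ratios = word_length_ratios(spans)  # three ratios for four words
```

Because each entry is a ratio rather than an absolute length, the array is unchanged when the same page is scanned at a different magnification, which is consistent with the later remark that low-resolution images can be used.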

When an array of ratios f_n is extracted for each line, an array is extracted for every line in the image data, and all of the arrays together constitute the feature amount of that image data.

In step C5, the extracted feature amount is output to the search unit 132. The image search processing of step A3 by the search unit 132 is performed, for example, as follows.

The search unit 132 compares the search feature amount of the input image data obtained as described above with the feature amounts extracted in advance from the registered image data, and determines the similarity between the input image data and each registered image data from the comparison result. The registered image data with the highest similarity is selected as the search result.

The search result need not be limited to the single registered image data with the highest similarity; a predetermined number of registered image data may be selected in descending order of similarity and used as the search result.
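The selection of the best match, or of a predetermined number of best matches, can be sketched as follows. The text does not specify how the similarity is computed, so the measure used here is purely an assumption for illustration: the query array is slid along the registered array, and the score is the largest fraction of query ratios that agree within a relative tolerance `tol`. The names `similarity`, `top_matches`, `tol`, and the registry contents are all hypothetical.

```python
import heapq


def similarity(query, registered, tol=0.05):
    """Best fraction of query ratios matched at any alignment (assumed measure)."""
    if not query or len(registered) < len(query):
        return 0.0
    best = 0
    for offset in range(len(registered) - len(query) + 1):
        # zip stops at the end of the query, so exactly len(query) pairs are compared.
        hits = sum(
            abs(q - r) <= tol * max(q, r)
            for q, r in zip(query, registered[offset:])
        )
        best = max(best, hits)
    return best / len(query)


def top_matches(query, registry, k=1):
    """Return the k (name, score) pairs with the highest similarity."""
    scored = ((name, similarity(query, feat)) for name, feat in registry.items())
    return heapq.nlargest(k, scored, key=lambda item: item[1])


# Hypothetical registered features keyed by document name.
registry = {
    "doc_a": [2.5, 0.67, 0.5, 1.2, 0.8],
    "doc_b": [1.0, 1.0, 3.0, 0.4, 0.9],
}
best = top_matches([0.67, 0.5, 1.2], registry, k=1)
```

Passing `k=1` corresponds to returning only the most similar registered image, while a larger `k` corresponds to returning a predetermined number of candidates in descending order of similarity.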

When the search unit 132 outputs the search result, the display unit 14 displays a visualized image of the registered image data selected as the search result.

For registered image data, the feature amount is extracted as described above at the time of registration, and the image data is stored in the registered image storage unit 15 in association with its feature amount.

Because the feature amount of the present invention is an array composed of a plurality of word length ratios f_n, a similarity can be obtained even if the feature amount associated with the registered image data does not exactly match the feature amount of the input image data.

For example, when the input image data is a part of the registered image data, the two feature amounts do not match exactly; instead, the feature amount of the input image data is contained as a part of the feature amount of the registered image data.

Likewise, when a part of the input image data overlaps a part of the registered image data, the two feature amounts do not match exactly; instead, a part of the feature amount of the input image data coincides with a part of the feature amount of the registered image data.

Therefore, similar image data can be retrieved from the registered image data even when the feature amounts do not match exactly. Furthermore, the matching portion between the input image data and the registered image data can be detected from how the feature amount of the input image data matches that of the registered image data.

If, for both the input image data and the registered image data, the array of ratios f_n is stored in association with position information (pixel coordinates) in the respective image data, then detecting matching arrays of ratios f_n allows the matching portion between the input and registered image data to be located from the position information associated with the matched arrays.
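One way to locate a matching portion from stored position information is sketched below. It is an assumption-laden illustration rather than the method of the embodiment: the matching portion is taken to be the longest contiguous run of ratios that agree within a relative tolerance, found with a standard longest-common-substring table, and `positions` is a hypothetical list holding one pixel coordinate per registered word.

```python
def longest_common_run(a, b, tol=0.05):
    """Return (start_a, start_b, length) of the longest run where a and b agree."""
    best = (0, 0, 0)
    # prev[j] holds the length of the matching run ending at a[i-2], b[j-1].
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            x, y = a[i - 1], b[j - 1]
            if abs(x - y) <= tol * max(x, y):
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[2]:
                    best = (i - cur[j], j - cur[j], cur[j])
        prev = cur
    return best


def matched_region(query, registered, positions, tol=0.05):
    """Map the matched run to pixel positions of the registered words.

    `positions[i]` is the (x, y) coordinate of the i-th registered word.
    Ratio i spans words i and i + 1, so a run of n ratios covers n + 1 words.
    """
    _, start, length = longest_common_run(query, registered, tol)
    if length == 0:
        return []
    return positions[start:start + length + 1]


# Hypothetical data: five registered ratios belong to six words.
registered = [2.5, 0.67, 0.5, 1.2, 0.8]
positions = [(10, 20), (70, 20), (100, 20), (140, 20), (210, 20), (260, 20)]
region = matched_region([9.0, 0.67, 0.5, 1.2, 7.0], registered, positions)
```

The returned coordinates could then be used to highlight the matching words in the registered image, which is the kind of display the partial-match discussion above suggests.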

If the input image data is a part of some text content, content search also becomes possible by detecting the matching portion in the registered image data and displaying it on the display unit 14 or the like.

An effective feature amount is now described. For a very short passage, that is, one with few words, it becomes more likely that two different passages have the same array of word length ratios f_n.

It is therefore effective to use, as the feature amount for comparison, one covering at least a predetermined number of words, that is, one in which the number (L) of word length ratios f_n in the array is at least a predetermined number. Experimental results indicate that L is preferably 7 or more, and more preferably 10 or more.
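The length condition can be applied as a simple filter before comparison. The sketch below assumes per-line ratio arrays and uses the hypothetical names `MIN_RATIOS` and `effective_features`; the default of 7 follows the preferred lower bound stated above.

```python
MIN_RATIOS = 7  # the text prefers L >= 7, and more preferably L >= 10


def effective_features(line_features, min_ratios=MIN_RATIOS):
    """Keep only the ratio arrays long enough to be used for comparison."""
    return [feature for feature in line_features if len(feature) >= min_ratios]


# Hypothetical per-line arrays of length 3, 7 and 12; the first is discarded.
kept = effective_features([[1.0] * 3, [1.0] * 7, [1.0] * 12])
```

Raising `min_ratios` to 10 applies the stricter preference, at the cost of discarding more short lines.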

Note that the ratio f_n of two adjacent word lengths is effective as a feature amount for document images written in languages in which word length varies from word to word. Document images written in Latin-derived languages, which write text using the alphabet, are an example.

In contrast, the kanji, hiragana, and katakana used in Japanese, Chinese, and similar languages are nearly equal in size from character to character, and text written in them is difficult to divide cleanly into words. For such text it is effective to calculate a pixel density for each character and use it as the feature amount.

FIG. 8 is a schematic diagram for explaining pixel density. A circumscribed rectangle bounding one character is detected, and the pixel density is the ratio of the number of pixels that make up the character to the total number of pixels in the circumscribed rectangle. The calculation is described below using the single kanji "我" shown in FIG. 8 as an example.

In the example of FIG. 8, the circumscribed rectangle 30 bounding the kanji "我" is first extracted, and its width w and height h are counted in pixels. Next, the number B of black pixels 31 that make up the kanji "我" inside the circumscribed rectangle 30 is counted. Since the pixel density is the ratio of B to the total number of pixels in the circumscribed rectangle, the pixel density f is given by f = B / (w × h).

This pixel density f is calculated for each character, and an array of the pixel densities f arranged in the order of the characters on the text line is extracted as the feature amount.
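The pixel density calculation can be sketched in Python as follows. The function name `pixel_density` and the input representation (a small 0/1 bitmap holding exactly one character, possibly with white margins) are assumptions for the example; the formula f = B / (w × h) is the one given in the text.

```python
def pixel_density(glyph):
    """Return f = B / (w * h) for one character.

    `glyph` is a list of rows of 0/1 values (1 = black pixel) containing a
    single character, possibly surrounded by white margins.
    """
    rows = [y for y, row in enumerate(glyph) if any(row)]
    cols = [x for x in range(len(glyph[0])) if any(row[x] for row in glyph)]
    if not rows:
        return 0.0  # no black pixels, so no circumscribed rectangle exists

    # Circumscribed rectangle: the tightest box containing every black pixel.
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    w, h = right - left + 1, bottom - top + 1

    # B: number of black pixels inside the rectangle.
    black = sum(sum(row[left:right + 1]) for row in glyph[top:bottom + 1])
    return black / (w * h)


# Hypothetical 4x4 bitmap: three black pixels inside a 2x2 bounding box.
glyph = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
f = pixel_density(glyph)
```

The feature amount for a text line would then be the list of densities in character order, for example `[pixel_density(g) for g in glyphs]` for a hypothetical list `glyphs` of segmented characters.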

When an array of pixel densities f is extracted for each line, an array is extracted for every line in the image data, and all of the arrays together constitute the feature amount of that image data.

When searching document image data written in Japanese or Chinese, the search can be performed in the same way as the search processing that uses the array of word length ratios f_n, except that the feature amount is the array of pixel densities f.

The present invention provides the following effects.
Compared with searching by the number of characters between punctuation marks as in the prior art, using the ratio of the word lengths of two adjacent words as the feature amount makes it less likely that different passages yield the same feature amount, so the search accuracy for document image data can be improved.

In addition, word lengths are not misdetected even in document image data read at a relatively low resolution, so low-resolution document image data can be used. This speeds up the search processing and also reduces the storage capacity needed to store document image data.

Furthermore, similar image data can be retrieved from the registered image data even if the feature amounts of the input document image data and the registered document image data do not match exactly.

Each block of the image search apparatus 10, in particular the input unit 12, the image search unit 13, the display unit 14, and the registered image storage unit 15, may be implemented in hardware logic, or may be realized in software (an image search program) using a CPU as follows.

That is, the image search apparatus 10 includes a CPU (central processing unit) that executes the instructions of a control program realizing each function, a ROM (read only memory) that stores the program, a RAM (random access memory) into which the program is loaded, and a storage device (recording medium) such as a memory that stores the program and various data. The object of the present invention can also be achieved by supplying the image search apparatus 10 with a recording medium on which the program code (executable program, intermediate code program, or source program) of the control program of the image search apparatus 10, which is software realizing the functions described above, is recorded in computer-readable form, and having the computer (or CPU or MPU) read and execute the program code recorded on the recording medium.

Examples of the recording medium include tape media such as magnetic tape and cassette tape; disk media, including magnetic disks such as floppy (registered trademark) disks and hard disks and optical disks such as CD-ROM, MO, MD, DVD, and CD-R; card media such as IC cards (including memory cards) and optical cards; and semiconductor memory such as mask ROM, EPROM, EEPROM, and flash ROM.

The image search apparatus 10 may also be configured to be connectable to a communication network, and the program code may be supplied via the communication network. The communication network is not particularly limited; for example, the Internet, an intranet, an extranet, a LAN, ISDN, a VAN, a CATV communication network, a virtual private network, a telephone network, a mobile communication network, or a satellite communication network can be used. The transmission medium constituting the communication network is likewise not particularly limited; wired media such as IEEE 1394, USB, power line carrier, cable TV lines, telephone lines, and ADSL lines can be used, as can wireless media such as infrared (IrDA or a remote control), Bluetooth (registered trademark), 802.11 wireless, HDR, mobile phone networks, satellite links, and terrestrial digital networks. The present invention can also be realized in the form of a computer data signal embedded in a carrier wave, in which the program code is embodied by electronic transmission.

The present invention can be implemented in various other forms without departing from its spirit or essential features. The embodiment described above is therefore merely illustrative in every respect; the scope of the present invention is defined by the claims and is in no way restricted by the text of the specification. Moreover, all modifications and changes that fall within the scope of the claims are within the scope of the present invention.

Description of symbols
10 Image search apparatus
12 Input unit
13 Image search unit
14 Display unit
15 Registered image storage unit
130 Preprocessing unit
131 Feature extraction unit
132 Search unit

Claims

An image search apparatus for searching previously registered document image data for document image data similar to input document image data, the apparatus comprising:
a feature amount extraction unit that detects words contained in the input document image data, divides the data into words, calculates, for each two adjacent words, the ratio of the word lengths of the two words, and extracts, as the feature amount of the input document image data, an array of word length ratios in which the calculated ratios are arranged in the order of the words;
a registered image storage unit that stores the registered document image data and the feature amount of the registered document image data in association with each other;
a search unit that searches the registered document image data for image data similar to the input document image data, based on the feature amount of the registered document image data and the feature amount of the input document image data generated by the feature amount extraction unit; and
a display unit that displays, based on the search result of the search unit, the document image data among the registered document image data that is similar to the input document image data.

The image search apparatus according to claim 1, wherein the feature amount extraction unit calculates a ratio of the word lengths using the number of pixels indicating the length of a region constituting the word as the word length.

The image search apparatus according to claim 1, wherein the registered image storage unit stores the array of word length ratios in association with position information of the array of word length ratios in the registered document image data,
the search unit detects the portion of the retrieved document image data that matches the array of word length ratios of the input document image data, and
the display unit displays the detected matching portion so that it is distinguishable from other portions.

An image search apparatus for searching previously registered document image data for document image data similar to input document image data, the apparatus comprising:
a feature amount extraction unit that detects characters contained in the input document image data, performs character division, detects a circumscribed rectangle bounding each character, calculates a pixel density that is the proportion of the circumscribed rectangle occupied by the pixels constituting the character, and extracts, as the feature amount of the input document image data, an array of pixel densities in which the calculated pixel densities are arranged along the word arrangement order;
a registered image storage unit that stores the registered document image data and the feature amount of the registered document image data in association with each other;
a search unit that searches the registered document image data for image data similar to the input document image data, based on the feature amount of the registered document image data and the feature amount of the input document image data generated by the feature amount extraction unit; and
a display unit that displays, based on the search result of the search unit, the document image data among the registered document image data that is similar to the input document image data.

An image search program for causing a computer to function as the image search device according to claim 1.

A computer-readable recording medium on which an image search program for causing a computer to function as the image search device according to claim 1 is recorded.