JP2008077454A

JP2008077454A - Title extraction device, image reading device, title extraction method, and title extraction program

Info

Publication number: JP2008077454A
Application number: JP2006256686A
Authority: JP
Inventors: Takenobu Ikeuchi; 建展池内
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2006-09-22
Filing date: 2006-09-22
Publication date: 2008-04-03
Anticipated expiration: 2026-09-22
Also published as: JP4891013B2

Abstract

PROBLEM TO BE SOLVED: To precisely extract, as a title, a character string that has no distinctive visual features on a document image.

SOLUTION: A title extraction device includes an extraction requirement acquisition part 31 for acquiring a keyword character string described near a title character string on a document image and relative location information of the title character string corresponding to the keyword character string, a character recognition part 33 for recognizing each character in a region including the title character string and the keyword character string in the document image, a keyword retrieving part 34 for retrieving the keyword character string from the recognition result to acquire the location, a title location acquisition part 35 for acquiring a location of the title character string from the acquired location of the keyword character string and the relative location of the title character string corresponding to the keyword character string, and a title output part 36 for outputting title character string data from the acquired location of the title character string.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to a title extraction device, an image reading device, a title extraction method, and a title extraction program that extract a title from a document image.

If a title indicating the contents of a document can be extracted automatically from a document image acquired, for example, by scanning an original, the document becomes easier to manage (for instance, the title can be used as the document's file name), which improves convenience.

A conventionally known technique for extracting a title from a document image focuses on the position on the document and on visual features applied to make the title portion stand out, such as underlines and frame lines, and takes out the portion bearing those features as the title (see, for example, Patent Document 1).
Patent Document 1: Japanese Patent Laid-Open No. 9-134406

However, in a document image, the character string best suited to identifying the document does not necessarily carry such visual features. With the conventional technique, a character string that would be appropriate as a title but has no visual features cannot be extracted as the title.

For example, for the document image shown in FIG. 25, the conventional technique extracts the most prominent character string, "Invoice", as the title character string. Invoices are used heavily in business, however, so this string is unsuitable as a title that identifies the document. In this example, the character string "IV01234567", which indicates the invoice number and is specific to the document, is the appropriate title character string, but it has no prominent features and cannot be extracted as a title by the conventional method.

Character recognition is used to take a required character string out of a document image as text data, but misrecognition is unavoidable in character recognition. It is therefore desirable to configure the device so that misrecognition neither lowers the accuracy of title extraction nor makes title extraction impossible.

The present invention has been devised to solve these problems of the prior art. Its main object is to provide a title extraction device, an image reading device, a title extraction method, and a title extraction program configured so that even a character string with no visual features on a document image can be extracted accurately as a title. A further object of the present invention is to avoid reduced accuracy or failure of title extraction caused by misrecognition in character recognition.

The present invention comprises: extraction condition acquisition means for acquiring a keyword character string written near a title character string on a document image, together with information on the position of the title character string relative to the keyword character string; character recognition means for performing character recognition on a region of the document image that includes at least the title character string and the keyword character string; keyword search means for searching the recognition result of the character recognition means for the keyword character string acquired by the extraction condition acquisition means and acquiring its position; title position acquisition means for acquiring the position of the title character string based on the position of the keyword character string acquired by the keyword search means and the relative position of the title character string acquired by the extraction condition acquisition means; and title output means for outputting data of the title character string based on the position acquired by the title position acquisition means.

According to the present invention, the position of the title character string is determined using the keyword character string written near it as a guide, so even a character string with no visual features can be reliably extracted as a title. Furthermore, by setting extraction conditions for each document format, titles can be extracted accurately from a wide variety of document images.

A title extraction device according to a first aspect of the present invention comprises: extraction condition acquisition means for acquiring a keyword character string written near a title character string on a document image, together with information on the position of the title character string relative to the keyword character string; character recognition means for performing character recognition on a region of the document image that includes at least the title character string and the keyword character string; keyword search means for searching the recognition result of the character recognition means for the keyword character string acquired by the extraction condition acquisition means and acquiring its position; title position acquisition means for acquiring the position of the title character string based on the position of the keyword character string acquired by the keyword search means and the relative position of the title character string acquired by the extraction condition acquisition means; and title output means for outputting data of the title character string based on the position acquired by the title position acquisition means.

With this configuration, the position of the title character string is determined using the keyword character string written near it as a guide, so even a character string with no visual features can be reliably extracted as a title. Furthermore, by setting extraction conditions for each document format, titles can be extracted accurately from a wide variety of document images.

In this case, the extraction condition acquisition means acquires information on the extraction conditions entered by the user through input operation means (such as a keyboard) of the device itself or of another device.

A title extraction device according to a second aspect of the present invention is the title extraction device according to the first aspect, wherein the extraction condition acquisition means acquires, through a user's input operation, position information of a title inclusion region that includes the title character string and the keyword character string, and the character recognition means performs character recognition on the title inclusion region acquired by the extraction condition acquisition means.

With this configuration, the character recognition range is limited, so both the accuracy and the processing speed of title extraction can be improved.

A title extraction device according to a third aspect of the present invention is the title extraction device according to the first aspect, wherein the keyword search means comprises: character string dividing means for dividing the keyword character string into a plurality of partial character strings; partial character string position detection means for searching the recognition result character string of the character recognition means for each partial character string obtained by the character string dividing means and detecting the position of each partial character string; and evaluation means for comparing the word order implied by the detected positions of the partial character strings with the correct word order, and evaluating whether the keyword character string position implied by those positions is appropriate.

With this configuration, the position of the keyword character string can be acquired accurately even when the recognition result of the character recognition means contains misrecognitions.

In this case, the character recognition means preferably acquires a plurality of candidate characters for each character of the recognition result character string, and the partial character string position detection means preferably detects the position of a partial character string by comparing each of its characters against those candidate characters. A partial character string then matches the recognition result character string even if the match is with a lower-ranked candidate character, which avoids needless failures of the partial character string search caused by misrecognition.
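As a minimal sketch of the candidate-character comparison described above, the following Python function treats a position as matching when the expected character appears anywhere in that position's candidate list. The function name `find_with_candidates` and the sample OCR data are hypothetical, and the representation of the recognition result as a list of ranked candidate lists is an assumption; the source does not specify a data format.

```python
from typing import List, Sequence


def find_with_candidates(part: str, candidates: Sequence[Sequence[str]]) -> List[int]:
    """Return every start index at which `part` matches the recognition result.

    candidates[i] holds the candidate characters for position i (best first).
    A position matches if the expected character is any of its candidates.
    """
    hits = []
    for start in range(len(candidates) - len(part) + 1):
        if all(ch in candidates[start + k] for k, ch in enumerate(part)):
            hits.append(start)
    return hits


# Hypothetical recognition result: the top candidates spell "Inuoice",
# so an exact search for "voice" on the top candidates alone would fail.
ocr = [["I", "l"], ["n", "h"], ["u", "v"], ["o", "0"],
       ["i", "l"], ["c", "e"], ["e", "c"]]
print(find_with_candidates("voice", ocr))  # [2]
```

Accepting lower-ranked candidates trades precision for recall, which is why the source follows the search with a separate evaluation of the detected positions.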

The character string dividing means preferably divides the keyword character string at boundaries where there is a relatively large blank between characters. The keyword character string is then divided at blank portions, where misrecognition due to noise (in particular, spurious character insertion) is likely to occur, so the influence of noise-induced misrecognition can be kept low.
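The division step can be illustrated with a short sketch. It assumes that the large blanks in the printed keyword correspond to whitespace in the keyword text entered by the user; the source describes the blanks in terms of the document layout, so this mapping is an assumption, and `split_keyword` is a hypothetical name.

```python
import re
from typing import List, Tuple


def split_keyword(keyword: str) -> List[Tuple[int, str]]:
    """Split at whitespace, keeping each part's character offset in the keyword."""
    return [(m.start(), m.group()) for m in re.finditer(r"\S+", keyword)]


print(split_keyword("Invoice No."))  # [(0, 'Invoice'), (8, 'No.')]
```

Keeping the offsets alongside the parts preserves the correct order and spacing, which a later evaluation step can compare against the detected positions.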

When the word-order evaluation leaves more than one suitable candidate for the position of the keyword character string, the evaluation means preferably obtains, for each candidate, the spacing between the partial character strings from their positions and selects the candidate whose spacing is closest to the correct spacing. The candidates can thereby be narrowed down to the single most appropriate one.
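The word-order check and the spacing tie-break might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: positions and spacing are measured in character indices rather than image coordinates, exact matching is used to find the parts, and the names `find_all` and `choose_keyword_position` are hypothetical.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple


def find_all(text: str, part: str) -> List[int]:
    """All start indices of `part` in `text` (exact match, for brevity)."""
    return [i for i in range(len(text) - len(part) + 1) if text.startswith(part, i)]


def choose_keyword_position(
    parts: Sequence[str],
    hits: Sequence[Sequence[int]],
    regular_offsets: Sequence[int],
) -> Optional[Tuple[int, ...]]:
    """Pick one detected position per part.

    Step 1 keeps only combinations whose parts appear in the correct order.
    Step 2 picks the combination whose spacing is closest to the regular spacing.
    """
    in_order = [
        combo for combo in product(*hits)
        if all(combo[i] + len(parts[i]) <= combo[i + 1] for i in range(len(combo) - 1))
    ]
    if not in_order:
        return None

    def spacing_error(combo: Tuple[int, ...]) -> int:
        return sum(
            abs((combo[i + 1] - combo[i]) - (regular_offsets[i + 1] - regular_offsets[i]))
            for i in range(len(combo) - 1)
        )

    return min(in_order, key=spacing_error)


parts = ["Invoice", "No."]
offsets = [0, 8]  # start offsets of the parts inside "Invoice No."
text = "No. of pages 2 Invoice No. IV01234567 Ref No. 5"
hits = [find_all(text, p) for p in parts]
print(choose_keyword_position(parts, hits, offsets))  # (15, 23)
```

In the example, "No." occurs three times. The occurrence before "Invoice" fails the order check, and of the two remaining candidates the one eight characters after the start of "Invoice" has the regular spacing.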

A title extraction device according to a fourth aspect of the present invention is the title extraction device according to the first aspect, wherein the keyword search means comprises first and second position detection means for detecting the position of the keyword character string in the recognition result character string based on the degree of difference between a comparison target character string within the recognition result character string and the keyword character string. The first position detection means detects the approximate position of the keyword character string in the recognition result character string based on the degree of difference obtained with the length of the comparison target character string set equal to that of the keyword character string. The second position detection means restricts the comparison targets to a comparison target range based on the approximate position acquired by the first position detection means, and detects the detailed position of the keyword character string in the recognition result character string based on the degree of difference obtained while increasing and decreasing the length of the comparison target character string within a required range.

With this configuration, the position of the keyword character string can be acquired accurately even when the recognition result of the character recognition means contains misrecognitions. In particular, with the method described above that divides the keyword character string into partial character strings, keyword detection becomes difficult when noise causes a spurious character to be inserted in the part of the recognition result character string that corresponds to a partial character string. The present method can detect the keyword without being affected by such character insertion.

In the second position detection means, increasing and decreasing the length of the comparison target character string within the required range increases the number of comparisons. However, processing can be kept fast by limiting the comparison target range, that is, the range of the recognition result character string from which comparison target character strings are taken for comparison with the keyword character string.

In this case, the degree of difference is preferably the Levenshtein distance (edit distance), which expresses numerically how different two character strings are according to the number of character substitutions, deletions, and insertions.
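For reference, the Levenshtein distance can be computed with the standard dynamic-programming recurrence. The sketch below is a generic textbook formulation with unit costs, not code taken from the source.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character substitutions, deletions, and
    insertions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or free if equal
            ))
        prev = curr
    return prev[-1]


print(levenshtein("invoice", "inv0ice"))  # 1 (one substitution)
print(levenshtein("invoice", "inv oice"))  # 1 (one insertion)
```

Because an inserted character costs only one edit, a keyword that gains a spurious character during recognition still scores close to the original.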

The character recognition means preferably acquires a plurality of ranked candidate characters for each character of the recognition result character string, and the position detection means preferably determines the degree of difference according to the rank of the matching candidate character. Here, the degree of difference for a match with a lower-ranked candidate character is set higher than for a match with the top-ranked candidate character and lower than for a character substitution, and it increases as the rank of the candidate character falls. This allows a judgment that reflects the recognition confidence of the candidate characters.
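One way to realize a rank-dependent degree of difference is to replace the 0-or-1 substitution cost of the edit distance with a fractional cost. The sketch below assumes a cost of `rank / number_of_candidates`; the source states only the ordering constraints (top-ranked match lowest, lower ranks progressively higher, all below a substitution), so this particular formula and the names `match_cost` and `weighted_distance` are hypothetical.

```python
from typing import Sequence


def match_cost(expected: str, ranked: Sequence[str]) -> float:
    """0.0 for the top candidate, rising with rank, 1.0 if no candidate matches."""
    if expected in ranked:
        return ranked.index(expected) / len(ranked)  # always below 1.0
    return 1.0  # full substitution cost


def weighted_distance(keyword: str, window: Sequence[Sequence[str]]) -> float:
    """Edit distance between `keyword` and a window of ranked candidate lists."""
    prev = [float(j) for j in range(len(window) + 1)]
    for i, ch in enumerate(keyword, start=1):
        curr = [float(i)]
        for j, ranked in enumerate(window, start=1):
            curr.append(min(
                prev[j] + 1.0,                        # keyword character missing
                curr[j - 1] + 1.0,                    # spurious recognized character
                prev[j - 1] + match_cost(ch, ranked),
            ))
        prev = curr
    return prev[-1]


# Hypothetical window: "o" is only the second of three candidates.
window = [["N", "M"], ["0", "o", "O"], [".", ","]]
print(round(weighted_distance("No.", window), 3))  # 0.333
```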

The comparison target range is preferably set by adding, before and after the approximate position of the keyword character string, an expansion width that depends on the Levenshtein distance acquired by the first position detection means. The comparison target range then becomes a necessary and sufficient range in which the keyword character string can be assumed to exist.

The minimum length of the comparison target character string is preferably obtained by subtracting the Levenshtein distance from the length of the keyword character string, and the maximum length by adding the Levenshtein distance to the length of the keyword character string. The length of the comparison target character string can then be increased and decreased within a necessary and sufficient range over which it can be assumed to vary because of character deletions and insertions caused by misrecognition.
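The two-stage search described above might look like the following sketch. It assumes a plain recognition result string (top candidates only), uses the coarse-stage distance `d` directly as the expansion width, and breaks ties by taking the first minimum; these choices and the function names are assumptions made for illustration.

```python
from typing import Tuple


def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def coarse_position(keyword: str, text: str) -> Tuple[int, int]:
    """Stage 1: slide a window of exactly len(keyword) over the text."""
    n = len(keyword)
    starts = range(max(1, len(text) - n + 1))
    best = min(starts, key=lambda s: levenshtein(keyword, text[s:s + n]))
    return best, levenshtein(keyword, text[best:best + n])


def fine_position(keyword: str, text: str) -> Tuple[int, int, int]:
    """Stage 2: search near the coarse position with lengths n - d .. n + d.

    Returns (start, length, distance) of the best comparison target string.
    """
    n = len(keyword)
    start0, d = coarse_position(keyword, text)
    lo = max(0, start0 - d)              # range widened by d before ...
    hi = min(len(text), start0 + n + d)  # ... and after the coarse position
    best = (d, start0, n)
    for length in range(max(1, n - d), n + d + 1):
        for s in range(lo, hi - length + 1):
            dist = levenshtein(keyword, text[s:s + length])
            if dist < best[0]:
                best = (dist, s, length)
    return best[1], best[2], best[0]


keyword = "Invoice No:"
text = "Date 2006 Inv oice No: IV01234567"  # a spurious space was inserted
start, length, dist = fine_position(keyword, text)
print(text[start:start + length], dist)  # Inv oice No: 1
```

With a fixed window length, the inserted space pushes part of the keyword out of the window, so the coarse stage reports a distance of 2. Allowing the length to grow lets the fine stage cover the whole keyword and reduces the distance to 1.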

A title extraction device according to a fifth aspect of the present invention is the title extraction device according to the first aspect, wherein the title output means outputs text data of the title character string within the recognition result character string acquired by the character recognition means.

With this configuration, document management is made easier by, for example, adding the acquired text data of the title character string as a title name to the document file that stores the document image, or registering it as a keyword in an electronic filing system.

A title extraction device according to a sixth aspect of the present invention is the title extraction device according to the first aspect, wherein the title output means outputs image data created by cropping the region of the title character string from the document image.

With this configuration, output of an incorrect title character string caused by misrecognition in character recognition can be avoided.

An image reading device according to a seventh aspect of the present invention includes the title extraction device according to any one of the first to sixth aspects.

Because the title of a read document can thereby be acquired, management of digitized documents becomes easier and convenience is improved.

In this case, the image reading device has, in addition to the title extraction device, image reading means for reading an image of a document, and control means for causing the title extraction device to perform title extraction processing on the document image acquired by the image reading means.

A title extraction method according to an eighth aspect of the present invention comprises: an extraction condition acquisition step of acquiring a keyword character string written near a title character string on a document image, together with information on the position of the title character string relative to the keyword character string; a character recognition step of performing character recognition on a region of the document image that includes at least the title character string and the keyword character string; a keyword search step of searching the recognition result of the character recognition step for the keyword character string acquired in the extraction condition acquisition step and acquiring its position; a title position acquisition step of acquiring the position of the title character string based on the position of the keyword character string acquired in the keyword search step and the relative position of the title character string acquired in the extraction condition acquisition step; and a title output step of outputting data of the title character string based on the position acquired in the title position acquisition step.

A title extraction program according to a ninth aspect of the present invention causes a computer to execute each step of the title extraction method according to the eighth aspect.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing a schematic configuration of an image processing device to which the present invention is applied. The image processing device has an image input unit 1 that takes image data of a document image out of a document file acquired from, for example, an external image reading device; an image memory 2 that stores the image data of the document image acquired by the image input unit 1; a title extraction unit (title extraction device) 3 that extracts a title from the document image; and a control unit 4 that controls the operation of these units.

This image processing device can be realized by installing a title extraction application program that runs on a given OS in an information processing device such as a PC. The title extraction application program is provided stored on a computer-readable storage medium such as a CD-ROM.

The title extracted by the title extraction unit 3 is sent, for example together with the image data of the document image, to an electronic filing system via a suitable interface 5 (such as a LAN interface), where it is registered as, for example, a search keyword.

FIG. 2 is a block diagram showing a schematic configuration of an image reading device to which the present invention is applied. The image reading device has an image reading unit 7 that optically reads an image of a document recorded on a recording medium such as paper and generates image data; an image memory 2 that stores the image data of the document image acquired by the image reading unit 7; a title extraction unit (title extraction device) 3 that extracts a title from the document image; and a control unit 8 that controls the operation of these units.

The title extracted by the title extraction unit 3 is added, as a title name or the like, to the document file that stores the image data of the document image. The document file with the title added is sent to a PC or similar device via a suitable interface 9 (such as a LAN interface) or via a recording medium.

FIG. 3 outlines the processing in the title extraction unit 3 shown in FIGS. 1 and 2. In many kinds of business, documents that are produced in large numbers are standardized, and in such a document format a specific character string describing the content is written near the character string that is appropriate as a title identifying the document. In the illustrated invoice example, the character string "IV01234567" indicating the invoice number is appropriate as the title identifying the document, and the character string "Invoice number:" written immediately to its left always appears in the invoice document format.

The title extraction unit 3 therefore uses the character string "Invoice number:" as the keyword character string, locates it, then determines the position of the title character string "IV01234567" from its positional relationship to the keyword character string and extracts that title character string.

In addition, a title inclusion region 21 that includes the title character string and the keyword character string in the document image is specified here, and character recognition is performed on this title inclusion region 21. In a document created according to a given document format, the region in which the title character string and the keyword character string appear is fixed, so limiting the character recognition range to that region improves both the accuracy and the processing speed of title extraction.

In this example, to specify the position of the title inclusion region 21 on the document image, the region start position, that is, the coordinates (x, y) of the upper-left corner of the title inclusion region 21, and the width W and height H of the region are specified.
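As a small illustration of how the four values (x, y, W, H) select the region to be recognized, the sketch below crops a rectangular region from an image held as a list of pixel rows. The image representation and the function name `crop` are assumptions; the source does not describe how the image data is stored.

```python
from typing import List


def crop(image: List[List[int]], x: int, y: int, w: int, h: int) -> List[List[int]]:
    """Return the w-by-h region whose upper-left corner is at column x, row y."""
    return [row[x:x + w] for row in image[y:y + h]]


# Hypothetical 4-row, 6-column image; each pixel value encodes (row, column).
image = [[10 * r + c for c in range(6)] for r in range(4)]
print(crop(image, x=2, y=1, w=3, h=2))  # [[12, 13, 14], [22, 23, 24]]
```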

The title inclusion region 21 only needs to include at least both the title character string and the keyword character string. Depending on, for example, the performance of the image reading device, the image may be stretched or shrunk, skewed, or shifted; in that case the title inclusion region 21 must be enlarged as appropriate so that the title character string and the keyword character string do not fall outside it. It is also possible to perform character recognition on the entire document image without setting a title inclusion region.

FIG. 4 shows examples of how to specify the position of the title character string relative to the keyword character string shown in FIG. 3. FIG. 4(A) is an example in which the relative position is specified by direction. Here, the title character string "IV01234567" lies immediately to the right of the keyword character string "Invoice number:", so the information representing the relative position of the title character string is "right".
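A direction-based specification such as "right" could be resolved on a recognized text line roughly as follows. This sketch assumes the keyword and title sit on the same recognized line, that the title is the first whitespace-delimited token after the keyword, and that the keyword is found by exact match; none of these details is stated in the source, and `title_right_of` is a hypothetical name.

```python
def title_right_of(line: str, keyword: str) -> str:
    """Return the first token to the right of `keyword` on the same line."""
    start = line.find(keyword)
    if start < 0:
        return ""  # keyword not found on this line
    rest = line[start + len(keyword):].split()
    return rest[0] if rest else ""


line = "Invoice number: IV01234567   Date: 2006-09-22"
print(title_right_of(line, "Invoice number:"))  # IV01234567
```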

FIG. 4(B) is an example in which the position of the title character string relative to the keyword character string is specified by coordinates. In this example, the center of the rectangular region 22 circumscribing the keyword character string "Invoice number:" is taken as the reference point. The relative position is specified by the coordinates of a representative point of the rectangular region 23 circumscribing the title character string "IV01234567", here the coordinates (x0, y0) of the upper-left vertex P of the rectangular region 23 with the reference point as the origin O (0, 0), together with the width w0 and height h0 of the rectangular region 23. In this case, the required values can be obtained by, for example, measuring the dimensions on the original from which the document image is made, and then entered using a keyboard or the like.
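The coordinate-based specification amounts to a translation from the keyword's reference point. The sketch below computes the title rectangle in image coordinates from the detected keyword rectangle and the values (x0, y0, w0, h0). It assumes image coordinates with y increasing downward and rectangles given by their upper-left corner; the `Rect` class and `title_rect` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    width: float
    height: float


def title_rect(keyword_box: Rect, x0: float, y0: float, w0: float, h0: float) -> Rect:
    """Place the title rectangle relative to the centre of the keyword rectangle.

    (x0, y0) is the upper-left vertex P of the title rectangle, measured from
    the keyword rectangle's centre taken as the origin O (0, 0).
    """
    cx = keyword_box.left + keyword_box.width / 2
    cy = keyword_box.top + keyword_box.height / 2
    return Rect(cx + x0, cy + y0, w0, h0)


# Hypothetical keyword rectangle detected at (100, 200), 80 wide and 20 high.
keyword_box = Rect(left=100, top=200, width=80, height=20)
print(title_rect(keyword_box, x0=50, y0=-10, w0=120, h0=20))
# Rect(left=190.0, top=200.0, width=120, height=20)
```

Because the offset is measured from the detected keyword rather than from the page corner, the title rectangle follows the keyword if the scanned image is shifted.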

図５は、図１・図２に示したタイトル抽出部３を示すブロック図である。タイトル抽出部３は、文書画像上でタイトル文字列の近傍に記載されるキーワード文字列、及びこのキーワード文字列に対するタイトル文字列の相対的な位置情報を取得する抽出条件取得部（抽出条件取得手段）３１と、この抽出条件取得部３１で取得した抽出条件を記憶する抽出条件記憶部３２と、所定の文字認識対象領域の画像を解析して文字情報を取得する文字認識部（文字認識手段）３３と、この文字認識部３３で取得した認識結果文字列から、抽出条件記憶部３２から取り出したキーワード文字列を検索してその位置を取得するキーワード検索部（キーワード検索手段）３４と、このキーワード検索部３４で取得したキーワード文字列の位置、及び抽出条件記憶部３２から取り出したキーワード文字列に対するタイトル文字列の相対的な位置に基づいて、タイトル文字列の位置を取得するタイトル位置取得部（タイトル位置取得手段）３５と、このタイトル位置取得部３５で取得したタイトル文字列の位置に基づいて、タイトル文字列のデータを出力するタイトル出力部（タイトル出力手段）３６とを有している。 FIG. 5 is a block diagram showing the title extraction unit 3 shown in FIGS. 1 and 2. The title extraction unit 3 includes: an extraction condition acquisition unit (extraction condition acquisition means) 31 that acquires a keyword character string described in the vicinity of the title character string on the document image and relative position information of the title character string with respect to the keyword character string; an extraction condition storage unit 32 that stores the extraction conditions acquired by the extraction condition acquisition unit 31; a character recognition unit (character recognition means) 33 that analyzes an image of a predetermined character recognition target area and acquires character information; a keyword search unit (keyword search means) 34 that searches the recognition result character string acquired by the character recognition unit 33 for the keyword character string retrieved from the extraction condition storage unit 32 and acquires its position; a title position acquisition unit (title position acquisition means) 35 that acquires the position of the title character string based on the position of the keyword character string acquired by the keyword search unit 34 and the relative position of the title character string with respect to the keyword character string retrieved from the extraction condition storage unit 32; and a title output unit (title output means) 36 that outputs the data of the title character string based on the position of the title character string acquired by the title position acquisition unit 35.

抽出条件取得部３１は、タイトル抽出処理に必要となる抽出条件を使用者の入力操作により取得するものであり、前記のキーワード文字列、及びキーワード文字列に対するタイトル文字列の相対的な位置情報の他に、文書画像内のタイトル文字列及びキーワード文字列を含むタイトル包含領域の情報が、使用者の入力操作により取得される。この使用者の入力操作は、図１に示したＰＣなどからなる画像処理装置であれば、キーボードなどを用いて行われ、図２に示した画像読取装置であれば、自装置の操作パネルを用いて行う他、ネットワークなどを介して接続されたＰＣのキーボードなどを用いて行うようにしても良い。 The extraction condition acquisition unit 31 acquires the extraction conditions necessary for the title extraction process through the user's input operation. In addition to the keyword character string and the relative position information of the title character string with respect to the keyword character string, information on the title inclusion area, which contains both the title character string and the keyword character string in the document image, is acquired through the user's input operation. In the image processing apparatus of FIG. 1, which consists of a PC or the like, this input operation is performed using a keyboard or the like. In the image reading apparatus of FIG. 2, it may be performed using the operation panel of the apparatus itself, or using the keyboard or the like of a PC connected via a network or the like.

文字認識部３３では、パターンマッチングなどの公知の解析手法で文字を特定し、その文字のテキストデータが認識結果として出力される。このテキストデータは、ASCII、Shift-JIS、UNICODEなどの一般的な文字コードを用いれば良い。 The character recognition unit 33 specifies a character by a known analysis method such as pattern matching and outputs text data of the character as a recognition result. For this text data, a general character code such as ASCII, Shift-JIS, UNICODE or the like may be used.

図６は、図５に示した文字認識部３３で行われる文字認識の概要を示している。文字認識部３３では、図６（Ａ）に示すように、文字ごとの文書画像上の位置、例えば各文字に外接する矩形領域の文書画像上の位置、すなわち矩形領域の左上端の開始座標（ｘ，ｙ）、矩形領域の幅ｗ及び高さｈに関する情報を取得する。 FIG. 6 shows an outline of character recognition performed by the character recognition unit 33 shown in FIG. In the character recognition unit 33, as shown in FIG. 6A, the position on the document image for each character, for example, the position on the document image of the rectangular area circumscribing each character, that is, the start coordinate ( x, y), information on the width w and height h of the rectangular area is acquired.

また文字認識部３３では、図６（Ｂ）に示すように、認識結果文字列の各文字ごとの候補文字に認識信頼度が付与され、この認識信頼度の高いものから順に候補文字が順位付けされ、所定の順位（例えば第３位）までの候補文字が出力される。 Further, in the character recognition unit 33, as shown in FIG. 6B, recognition reliability is given to the candidate characters for each character of the recognition result character string, and the candidate characters are ranked in descending order of the recognition reliability. Then, candidate characters up to a predetermined rank (for example, the third rank) are output.
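The per-character output described for FIGS. 6A and 6B, a circumscribing rectangle plus ranked candidate characters, might be represented as in the following sketch. The names `RecognizedChar` and `top_candidates` and all confidence values are hypothetical, and the default cutoff of three candidates simply follows the "third rank" example in the text.

```python
from dataclasses import dataclass, field


@dataclass
class RecognizedChar:
    """One recognized character position on the document image."""
    x: int  # left edge of the circumscribing rectangle (start coordinate x)
    y: int  # top edge of the circumscribing rectangle (start coordinate y)
    w: int  # rectangle width
    h: int  # rectangle height
    # (character, recognition reliability) pairs, highest reliability first
    candidates: list = field(default_factory=list)

    @property
    def best(self):
        """First-ranked candidate character."""
        return self.candidates[0][0]


def top_candidates(scored, limit=3):
    """Rank candidates by descending reliability and keep the first `limit`."""
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:limit]


# Hypothetical recognizer scores for a single character image.
ranked = top_candidates([("求", 0.8), ("球", 0.9), ("救", 0.4), ("氷", 0.2)])
char = RecognizedChar(x=10, y=20, w=12, h=14, candidates=ranked)
```

Keeping the lower-ranked candidates matters later: a keyword character can still be matched when the recognizer's first choice is wrong.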

さらに文字認識部３３では、図６（Ｃ）に示すように、認識結果が行単位で分割可能に出力される。画像読取時のスキューなどにより図３に示したタイトル包含領域２１に位置ずれが生じたことが原因で、タイトル文字列が存在する行を含む複数の行の画像が認識対象として取得された場合には、その中の全ての行に対して文字認識が行われ、行区切りを示す情報に基づいて認識結果文字列を行単位で取り出すようにする。この例では、キーワード文字列「請求書番号：」及びタイトル文字列「ＩＶ０１２３４５６７」が存在する１行目と共に２行目も文字認識され、各行ごとの認識結果文字列が出力される。 Furthermore, as shown in FIG. 6C, the character recognition unit 33 outputs the recognition result so that it can be divided line by line. When the title inclusion area 21 shown in FIG. 3 is displaced, for example by skew during image reading, and images of a plurality of lines including the line containing the title character string are acquired as the recognition target, character recognition is performed on all of those lines, and the recognition result character string is extracted line by line based on the information indicating line breaks. In this example, the second line is recognized along with the first line, which contains the keyword character string “invoice number:” and the title character string “IV01234567”, and a recognition result character string is output for each line.

次に、図５に示したキーワード検索部３４、タイトル位置取得部３５、及びタイトル出力部３６について説明する。まずキーワード検索部３４では、その詳細は後述するが、語順判定による方法及びレーベンシュタイン距離による方法の２つの方式のいずれか一方あるいは双方を利用してキーワード検索が行われる。またタイトル位置取得部３５では、図４に示したように、キーワード文字列との相対的な位置関係から、文字認識部３３で取得した認識結果文字列上のタイトル文字列の位置を特定する。 Next, the keyword search unit 34, title position acquisition unit 35, and title output unit 36 shown in FIG. 5 will be described. First, the keyword search unit 34 performs keyword search using one or both of the two methods of the word order determination method and the Levenshtein distance method, details of which will be described later. Further, as shown in FIG. 4, the title position acquisition unit 35 specifies the position of the title character string on the recognition result character string acquired by the character recognition unit 33 from the relative positional relationship with the keyword character string.

タイトル出力部３６では、タイトル位置取得部３５で取得したタイトル文字列の位置に基づいてタイトル文字列のテキストデータを認識結果文字列から取得して出力する。図６（Ｃ）の例では、キーワード文字列が存在する１行目の認識結果文字列におけるキーワード文字列の次の文字から行末までの文字列がタイトル文字列となる。なお、タイトル文字列の文字数や、書式、例えば図６（Ｃ）の例では最初に英字が２文字で、続いて数字が８文字といった表記規則との比較によりタイトル文字列を特定するようにしても良い。 The title output unit 36 acquires the text data of the title character string from the recognition result character string based on the position of the title character string acquired by the title position acquisition unit 35, and outputs it. In the example of FIG. 6C, the title character string is the character string running from the character following the keyword character string to the end of the line, in the recognition result character string of the first line where the keyword character string exists. The title character string may also be identified by comparison with the number of characters or the format of the title, for example a notation rule such as, in the example of FIG. 6C, two letters followed by eight digits.
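As a rough illustration of this output step, the sketch below takes everything after the keyword up to the end of the line and then checks it against the "two letters followed by eight digits" rule. The function names are hypothetical, and the NFKC normalization that folds full-width characters to ASCII before the check is an assumption of this sketch, not something the text prescribes.

```python
import re
import unicodedata

# Notation rule from the example: two letters, then eight digits.
TITLE_FORMAT = re.compile(r"[A-Z]{2}[0-9]{8}")


def extract_title(line_text, keyword_start, keyword_length):
    """Return the text from the character after the keyword to the line end."""
    return line_text[keyword_start + keyword_length:].strip()


def matches_title_format(title):
    """Check the title against the notation rule after folding full-width forms."""
    normalized = unicodedata.normalize("NFKC", title)
    return TITLE_FORMAT.fullmatch(normalized) is not None


line = "請求書番号：ＩＶ０１２３４５６７"
keyword = "請求書番号："
title = extract_title(line, keyword_start=0, keyword_length=len(keyword))
```

A format check of this kind can reject a wrongly located title, for instance when the keyword position was detected on the wrong line.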

またタイトル出力部３６では、タイトル文字列をイメージデータで出力するようにしても良い。この場合も、図４に示したように、キーワード文字列との相対的な位置関係から、文書画像上のタイトル文字列の領域の位置を特定して、文書画像のイメージデータにおけるタイトル文字列の領域を切り抜いてイメージデータを取得する。 The title output unit 36 may output the title character string as image data. Also in this case, as shown in FIG. 4, the position of the title character string area on the document image is specified from the relative positional relationship with the keyword character string, and the title character string in the image data of the document image is identified. Cut out the area and get the image data.

図７は、図１・図２に示した画像処理装置及び画像読取装置で行われる処理の手順を示すフロー図である。まず使用者の入力操作により抽出条件、すなわちキーワード文字列、タイトル文字列の相対位置情報、及びタイトル包含領域の位置情報がタイトル抽出部３の抽出条件取得部３１にて取得され、これらの抽出条件が抽出条件記憶部３２に記憶される（ステップ１０１）。 FIG. 7 is a flowchart showing the procedure of the processing performed in the image processing apparatus and the image reading apparatus shown in FIGS. 1 and 2. First, the extraction conditions, that is, the keyword character string, the relative position information of the title character string, and the position information of the title inclusion area, are acquired by the extraction condition acquisition unit 31 of the title extraction unit 3 through the user's input operation, and these extraction conditions are stored in the extraction condition storage unit 32 (step 101).

そして文書画像を取得する、すなわち図１に示した画像処理装置であれば、画像入力部１で文書ファイルから文書画像のイメージデータを取り出し、図２に示した画像読取装置であれば、画像読取部７で原稿の画像を読み取る処理が行われ（ステップ１０２）、文書画像のイメージデータが画像メモリ２に格納される。 Then, a document image is acquired, that is, if the image processing apparatus shown in FIG. 1 is used, the image input unit 1 extracts the image data of the document image from the document file. If the image reading apparatus shown in FIG. A process of reading an image of a document is performed by the unit 7 (step 102), and image data of a document image is stored in the image memory 2.

次に抽出条件記憶部３２から取り出したタイトル包含領域の位置情報に基づいて、画像メモリ２から読み出した文書画像からタイトル包含領域の画像を取り出して文字認識する処理が文字認識部３３で行われ（ステップ１０３）、ついで文字認識部３３で取得した認識結果文字列から、抽出条件記憶部３２から取り出したキーワード文字列を検索してその位置を取得する処理がキーワード検索部３４にて行われる（ステップ１０４）。 Next, based on the position information of the title inclusion area retrieved from the extraction condition storage unit 32, the character recognition unit 33 extracts the image of the title inclusion area from the document image read from the image memory 2 and performs character recognition on it (step 103). The keyword search unit 34 then searches the recognition result character string acquired by the character recognition unit 33 for the keyword character string retrieved from the extraction condition storage unit 32 and acquires its position (step 104).

そしてキーワード検索部３４でのキーワード検索によりキーワード文字列の位置を特定することができた場合には（ステップ１０５）、取得したキーワード文字列の位置に関する情報、及び抽出条件記憶部３２から取り出したタイトル文字列の相対位置情報に基づいて、タイトル文字列の位置を取得する処理がタイトル位置取得部３５にて行われ（ステップ１０６）、ついでここで取得したタイトル文字列の位置に基づくタイトル文字列のデータがタイトル出力部３６から出力される（ステップ１０７）。 Then, when the keyword character string position can be specified by the keyword search in the keyword search unit 34 (step 105), information on the acquired keyword character string position and the title extracted from the extraction condition storage unit 32 Based on the relative position information of the character string, processing for acquiring the position of the title character string is performed in the title position acquisition unit 35 (step 106), and then the title character string based on the position of the title character string acquired here is obtained. Data is output from the title output unit 36 (step 107).

他方、キーワード検索部３４でのキーワード検索によりキーワード文字列の位置を特定することができなかった場合には（ステップ１０５）、タイトル抽出が失敗したものとして（ステップ１０８）、終了する。 On the other hand, when the position of the keyword character string cannot be specified by the keyword search in the keyword search unit 34 (step 105), it is determined that the title extraction has failed (step 108), and the process ends.

図８は、図５に示した文字認識部３３での文字認識で生じる誤認識の例を示している。図８（Ａ）の例では、画像データ内のノイズにより認識結果文字列の３番目及び７番目に不正な文字「，」が挿入されている。図８（Ｂ）の例では、認識結果文字列の１番目及び２番目が不正な文字「だ」及び「晝」に置換されている。図８（Ｃ）の例では、図８（Ａ）の挿入と図８（Ｂ）の置換の各誤認識が複合して生じている。 FIG. 8 shows examples of misrecognition that occur in character recognition by the character recognition unit 33 shown in FIG. 5. In the example of FIG. 8A, a spurious character “,” has been inserted at the third and seventh positions of the recognition result character string due to noise in the image data. In the example of FIG. 8B, the first and second characters of the recognition result character string have been replaced with the incorrect characters “だ” and “晝”. In the example of FIG. 8C, the insertion of FIG. 8A and the replacement of FIG. 8B occur in combination.

図８（Ｄ）の例では、図８（Ａ）と同様に、認識結果文字列の２番目に不正な文字「，」が挿入されているが、特にここでは文字列「請求書」の中に不正な文字が挿入されている。 In the example of FIG. 8D, as in FIG. 8A, a spurious character “,” has been inserted, this time at the second position of the recognition result character string; in particular, the spurious character falls inside the character string “invoice” (請求書).

このような誤認識は、画像データに対するノイズ除去などの前処理を行うことである程度は低減できるものの、完全に解消することは不可能であり、このような誤認識が発生した場合でも認識結果文字列上のキーワード文字列の位置を正確に取得することができるように、以下に示す語順判定による方式及びレーベンシュタイン距離による方式の２つの方式のいずれか一方あるいは双方を利用してキーワード検索が行われる。 Such misrecognition can be reduced to some extent by preprocessing the image data, for example by noise removal, but it cannot be eliminated completely. So that the position of the keyword character string on the recognition result character string can be acquired accurately even when such misrecognition occurs, the keyword search is performed using one or both of the two methods described below: the method based on word order determination and the method based on the Levenshtein distance.

図９は、図５に示したキーワード検索部３４を示すブロック図である。キーワード検索部３４は、キーワード文字列を複数の部分文字列に分割する文字列分割部（文字列分割手段）４１と、この文字列分割部４１で得られた各部分文字列を、文字認識部３３で取得した認識結果文字列内で検索して、各部分文字列の位置を検出する部分文字列位置検出部（部分文字列位置検出手段）４２と、部分文字列の評価値を求める評価値算出部４３と、部分文字列位置検出部４２で取得した各部分文字列の位置及び評価値算出部４３で取得した評価値が登録される候補登録部４４と、部分文字列位置検出部４２で取得した部分文字列の位置に基づく語順と正規の語順とを比較して、部分文字列位置検出部４２で取得した部分文字列の位置に基づくキーワード文字列の位置が適切か否かを評価する候補評価部（評価手段）４５とを有している。 FIG. 9 is a block diagram showing the keyword search unit 34 shown in FIG. 5. The keyword search unit 34 includes: a character string dividing unit (character string dividing means) 41 that divides the keyword character string into a plurality of partial character strings; a partial character string position detection unit (partial character string position detecting means) 42 that searches the recognition result character string acquired by the character recognition unit 33 for each partial character string obtained by the character string dividing unit 41 and detects the position of each partial character string; an evaluation value calculation unit 43 that obtains an evaluation value for each partial character string; a candidate registration unit 44 in which the position of each partial character string acquired by the partial character string position detection unit 42 and the evaluation value acquired by the evaluation value calculation unit 43 are registered; and a candidate evaluation unit (evaluation means) 45 that compares the word order based on the positions of the partial character strings acquired by the partial character string position detection unit 42 with the regular word order, and evaluates whether the position of the keyword character string based on those positions is appropriate.

図１０は、図９に示した文字列分割部４１で行われる文字列分割の一例を示している。文字認識部３３での文字認識では、各文字の間に比較的大きな空白がある部分でノイズによる誤認識が発生し易く、この誤認識の影響が低くなるように、空白部分を境界にしてキーワード文字列が分割される。この例では、キーワード文字列「請求書番号：」が、空白部分を境界にして「請求書」、「番号」、及び「：」の各部分文字列に分割される。 FIG. 10 shows an example of the character string division performed by the character string dividing unit 41 shown in FIG. 9. In character recognition by the character recognition unit 33, misrecognition due to noise is likely to occur where there is a relatively large blank between characters, so the keyword character string is divided at the blank portions to reduce the influence of such misrecognition. In this example, the keyword character string “invoice number:” (請求書番号：) is divided at the blank portions into the partial character strings “invoice” (請求書), “number” (番号), and “:”.

図１１は、図９に示した部分文字列位置検出部４２で行われる部分文字列位置検出の要領を示している。部分文字列位置検出部４２では、タイトル包含領域を対象にした文字認識で取得した認識結果文字列と部分文字列とを比較し、このとき、認識結果文字列に対する部分文字列の比較位置を行頭から行末に向けて１文字ずつずらしながら、その比較位置に部分文字列が存在するか否かの判定を行って、認識結果文字列上での部分文字列の位置を特定する。 FIG. 11 shows the point of partial character string position detection performed by the partial character string position detection unit 42 shown in FIG. The partial character string position detection unit 42 compares the recognition result character string acquired by the character recognition for the title inclusion region with the partial character string, and at this time, the comparison position of the partial character string with respect to the recognition result character string is set to the beginning of the line. While shifting from character to character toward the end of the line, it is determined whether or not a partial character string exists at the comparison position, and the position of the partial character string on the recognition result character string is specified.

部分文字列の有無の判定は、部分文字列の最初の文字から順に１文字ずつ、対応する認識結果文字列内の候補文字と比較し、このとき、上位の候補文字から順に比較し、一致しなければ次順位の候補文字との比較に移る。候補文字の中に一致したものがあれば、部分文字列の次の文字と候補文字との比較に移り、この候補文字との比較を部分文字列内の全ての文字に対して行い、部分文字列内の全ての文字に対して候補文字の中に一致したものがあれば、その比較位置に部分文字列が存在するものと判定する。 The presence / absence of a partial character string is determined by comparing each character in order from the first character of the partial character string with the candidate characters in the corresponding recognition result character string. If not, the process moves to comparison with the next candidate character. If there is a match among the candidate characters, the process proceeds to the comparison of the next character of the partial character string with the candidate character, the comparison with the candidate character is performed for all characters in the partial character string, and the partial character If there is a match among all candidate characters for all characters in the column, it is determined that a partial character string exists at the comparison position.

他方、候補文字の中に一致したものがなければ、その比較位置に部分文字列が存在しないものと判定して、行末側に１文字ずらした比較位置で認識結果文字列と部分文字列との比較に移り、考えられる比較位置の全てで部分文字列が存在しないものと判定された場合には、認識結果文字列内に部分文字列が存在しないものとみなして、キーワード検索を終了する。 On the other hand, if there is no match among the candidate characters, it is determined that there is no partial character string at the comparison position, and the recognition result character string and the partial character string are compared at the comparison position shifted by one character toward the end of the line. When it is determined that the partial character string does not exist at all of the possible comparison positions, the keyword search is terminated assuming that the partial character string does not exist in the recognition result character string.

評価値算出部４３では、部分文字列内の全ての文字に対して候補文字の中に一致したものがあれば、その一致した各候補文字ごとに文字認識部３３にて付与された認識信頼度を加算してその部分文字列の評価値を求める処理が行われる。 In the evaluation value calculation unit 43, if there is a match among the candidate characters for all characters in the partial character string, the recognition reliability assigned by the character recognition unit 33 for each of the matched candidate characters Is added to obtain the evaluation value of the partial character string.

なおこの例では、候補文字を第３位まで設定したが、これ以外の態様も可能である。 In this example, the candidate characters are set up to the third place, but other modes are possible.

図１２は、図９に示した候補登録部４４に登録される情報を示している。候補登録部４４には、部分文字列位置検出部４２で取得した各部分文字列の位置（開始位置及び終了位置）と、評価値算出部４３で取得した各部分文字列ごとの評価値とが登録される。文字認識部３３でのタイトル包含領域を対象とした文字認識により複数の行の認識結果文字列を取得した場合には、各行を１つの候補として、図１２（Ａ）〜（Ｃ）に示す部分文字列の位置及び評価値のテーブルが行ごとに作成され、候補評価部４５での評価が行単位で行われる。 FIG. 12 shows the information registered in the candidate registration unit 44 shown in FIG. 9. The position (start position and end position) of each partial character string acquired by the partial character string position detection unit 42 and the evaluation value of each partial character string acquired by the evaluation value calculation unit 43 are registered in the candidate registration unit 44. When recognition result character strings for a plurality of lines are acquired by character recognition of the title inclusion area in the character recognition unit 33, each line is treated as one candidate, a table of partial character string positions and evaluation values as shown in FIGS. 12A to 12C is created for each line, and the evaluation by the candidate evaluation unit 45 is performed line by line.

図１２（Ａ）の例は、図１１に示した認識結果文字列を対象にした場合である。この場合、例えば部分文字列「請求書」では、最初の文字「請」と一致する第１候補の信頼度が１．０であり、２番目の文字「求」と一致する第２候補の信頼度が０．８であり、３番目の文字「書」と一致する第３候補の信頼度が０．７であり、部分文字列「請求書」の評価値は、各文字ごとの認識信頼度を加算して、２．５となる。 The example of FIG. 12A is the case where the recognition result character string shown in FIG. 11 is the target. In this case, for the partial character string “請求書” (invoice), for example, the first candidate, which matches the first character “請”, has a reliability of 1.0; the second candidate, which matches the second character “求”, has a reliability of 0.8; and the third candidate, which matches the third character “書”, has a reliability of 0.7. The evaluation value of the partial character string “請求書” is the sum of the recognition reliabilities of its characters: 1.0 + 0.8 + 0.7 = 2.5.
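The partial character string detection and the evaluation value can be sketched together as follows. The function `find_partial_string` and the data layout are hypothetical; the reliabilities 1.0, 0.8 and 0.7 for “請”, “求” and “書” follow the example above, while every other candidate character and score is invented for illustration.

```python
def find_partial_string(line, partial):
    """Locate `partial` in one recognized line, allowing lower-ranked candidates.

    `line` is a list with one entry per character position; each entry is a
    list of (candidate character, recognition reliability) pairs in rank order.
    Returns (start, end, evaluation value) for the first comparison position
    where every character of `partial` matches some candidate, else None.
    """
    for start in range(len(line) - len(partial) + 1):
        score = 0.0
        for offset, wanted in enumerate(partial):
            # Compare against the candidates from the highest rank downward.
            hit = next((conf for cand, conf in line[start + offset]
                        if cand == wanted), None)
            if hit is None:
                break  # no candidate matches: shift the comparison position
            score += hit
        else:
            return start, start + len(partial) - 1, score
    return None


recognized_line = [
    [("請", 1.0), ("語", 0.5), ("諸", 0.3)],  # "請" is the 1st candidate
    [("球", 0.9), ("求", 0.8), ("救", 0.4)],  # "求" is the 2nd candidate
    [("畫", 0.9), ("晝", 0.8), ("書", 0.7)],  # "書" is the 3rd candidate
    [("番", 0.9), ("審", 0.4), ("雷", 0.2)],
    [("号", 0.9), ("弓", 0.3), ("另", 0.1)],
]
result = find_partial_string(recognized_line, "請求書")
```

Note that a plain comparison against only the first-ranked characters (“請球畫”) would miss this keyword entirely; the lower evaluation value records that the match relied on weaker candidates.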

図１２（Ｂ）の例は、図６（Ｃ）に示したように、画像読取時のスキューなどによりタイトル包含領域に位置ずれが生じたことが原因で、タイトル文字列が存在する１行目と共に２行目も文字認識された場合に、その２行目の認識結果文字列を対象にしてキーワード検索が行われた場合であり、文字列「請求書」が認識結果文字列内に存在しない。 In the example of FIG. 12B, as shown in FIG. 6C, the first character line in which the title character string exists due to the positional displacement in the title inclusion area due to skew or the like at the time of image reading. In addition, when the second line is also recognized, a keyword search is performed on the recognition result character string of the second line, and the character string “invoice” does not exist in the recognition result character string. .

図１２（Ｃ）の例は、図８（Ａ）や図８（Ｃ）に示したように、不正な文字が挿入された誤認識の場合であり、２番目の文字列「番号」及び３番目の文字列「：」の位置がずれている。 The example of FIG. 12C is a case of misrecognition in which spurious characters have been inserted, as shown in FIGS. 8A and 8C; the positions of the second character string “number” (番号) and the third character string “:” are shifted.

次に、図９に示した候補評価部４５について説明する。この候補評価部４５では、候補登録部４４に登録された候補ごとに、各部分文字列の位置から語順を求め、この語順と正規の語順とを比較して、キーワード文字列の位置として適切か否かが判定される。図１１の例では、部分文字列「請求書」、「番号」、及び「：」がこの順番で並んでいるか否かが判定される。候補登録部４４に登録された候補の中に語順が正しいものがない場合はキーワード検索が失敗したものとする。 Next, the candidate evaluation unit 45 shown in FIG. 9 will be described. In this candidate evaluation unit 45, for each candidate registered in the candidate registration unit 44, the word order is obtained from the position of each partial character string, and this word order is compared with the normal word order to determine whether it is appropriate as the keyword character string position. It is determined whether or not. In the example of FIG. 11, it is determined whether or not the partial character strings “invoice”, “number”, and “:” are arranged in this order. If no candidate registered in the candidate registration unit 44 has the correct word order, the keyword search has failed.

さらに候補評価部４５では、候補登録部４４に登録された候補ごとに、各部分文字列の位置から部分文字列相互の間隔、すなわち部分文字列の間に存在する文字やスペースの数を取得し、この部分文字列相互の間隔が正規の間隔に最も近いものが最終候補に選択される。この部分文字列間隔に基づく選択は、前記の語順判定により複数の候補が適切と判定された場合に行われ、これにより最終候補を１つに絞り込むことができる。 Further, in the candidate evaluation unit 45, for each candidate registered in the candidate registration unit 44, the interval between the partial character strings, that is, the number of characters or spaces existing between the partial character strings is obtained from the position of each partial character string. The one whose interval between the partial character strings is closest to the regular interval is selected as the final candidate. The selection based on the partial character string interval is performed when a plurality of candidates are determined to be appropriate by the above-described word order determination, whereby the final candidates can be narrowed down to one.

図１１の例では、最初の部分文字列「請求書」の終了位置と２番目の部分文字列「番号」の開始位置と間の間隔（文字数）、及び２番目の部分文字列「番号」の終了位置と３番目の部分文字列「号」の開始位置と間の間隔（文字数）が算出され、これらの間隔（文字数）が正規の間隔に最も近いもの、すなわち部分文字列「請求書」及び「番号」相互の間隔では正規の間隔１との差、また部分文字列「番号」及び「：」相互の間隔では正規の間隔０との差がそれぞれ最小となるものが最終候補に選択される。 In the example of FIG. 11, the interval (number of characters) between the end position of the first partial character string “invoice” and the start position of the second partial character string “number”, and the interval (number of characters) between the end position of the second partial character string “number” and the start position of the third partial character string “:”, are calculated. The candidate whose intervals are closest to the regular intervals is selected as the final candidate, that is, the candidate that minimizes both the difference from the regular interval of 1 for the interval between “invoice” and “number” and the difference from the regular interval of 0 for the interval between “number” and “:”.
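The word order check and the interval-based narrowing might look like the following sketch. All function names are hypothetical, positions are zero-based with inclusive end positions, and combining the per-interval differences by summing them is an assumption of this sketch; the text only says that each difference should be smallest.

```python
def in_regular_order(spans):
    """True if each partial string ends before the next one starts.

    `spans` lists (start, end) positions in the keyword's regular word order.
    """
    return all(prev[1] < nxt[0] for prev, nxt in zip(spans, spans[1:]))


def intervals(spans):
    """Number of characters or spaces lying between adjacent partial strings."""
    return [nxt[0] - prev[1] - 1 for prev, nxt in zip(spans, spans[1:])]


def select_final_candidate(candidates, regular_intervals):
    """Drop candidates with a wrong word order, then pick the one whose
    intervals are closest to the regular intervals (assumed metric: sum of
    absolute differences). Returns None when no candidate is in order."""
    ordered = [c for c in candidates if in_regular_order(c)]
    if not ordered:
        return None  # keyword search failed
    return min(
        ordered,
        key=lambda c: sum(abs(got - want)
                          for got, want in zip(intervals(c), regular_intervals)),
    )


# Spans of "請求書", "番号", "：" on three hypothetical recognized lines.
candidates = [
    [(0, 2), (5, 6), (8, 8)],  # in order, but intervals are 2 and 1
    [(0, 2), (4, 5), (6, 6)],  # in order, intervals 1 and 0 (regular)
    [(4, 6), (0, 1), (7, 7)],  # "番号" precedes "請求書": wrong order
]
best = select_final_candidate(candidates, regular_intervals=[1, 0])
```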

また、評価値算出部４３で取得した評価値が異常に小さい場合や、部分文字列の間隔（文字数）が異常に大きい場合は、キーワード文字列の位置として不適切であり、候補評価部４５では、評価値や部分文字列間隔を所定の閾値と比較することにより、キーワード文字列の位置の妥当性を判定して、妥当でないと判定された場合には、キーワード検索が失敗したものとする。 When the evaluation value acquired by the evaluation value calculation unit 43 is abnormally small, or when the interval (number of characters) between partial character strings is abnormally large, the candidate is inappropriate as the position of the keyword character string. The candidate evaluation unit 45 therefore determines the validity of the keyword character string position by comparing the evaluation value and the partial character string intervals with predetermined thresholds, and if the position is determined not to be valid, the keyword search is regarded as having failed.

図１３は、図９に示したキーワード検索部３４で行われるキーワード検索の手順を示すフロー図である。これは、図７に示したキーワード検索（ステップ１０４）で実行されるものである。 FIG. 13 is a flowchart showing a keyword search procedure performed by the keyword search unit 34 shown in FIG. This is executed by the keyword search (step 104) shown in FIG.

ここではまず抽出条件記憶部３２からキーワード文字列を読み出して部分文字列に分割する処理が文字列分割部４１で行われる（ステップ２０１）。そしてここで取得した部分文字列の中から１つの部分文字列を取り出して（ステップ２０２）、図１１に示したように、文字認識部３３で取得したタイトル包含領域の認識結果文字列と部分文字列とを比較して各部分文字列の位置を検出する処理が部分文字列位置検出部４２で行われる（ステップ２０３）。ここで認識結果文字列内に部分文字列が検出されると（ステップ２０４）、その部分文字列の開始位置及び終了位置が候補登録部４４に登録される（ステップ２０５）。また評価値算出部４３で部分文字列の評価値が算出され、その評価値が候補登録部４４に登録される。 Here, the character string dividing unit 41 first reads the keyword character string from the extraction condition storage unit 32 and divides it into partial character strings (step 201). Then, one partial character string is extracted from the partial character strings acquired here (step 202). As shown in FIG. 11, the recognition result character string and partial characters of the title inclusion area acquired by the character recognition unit 33 are obtained. The partial character string position detection unit 42 performs processing for comparing the columns and detecting the position of each partial character string (step 203). When a partial character string is detected in the recognition result character string (step 204), the start position and end position of the partial character string are registered in the candidate registration unit 44 (step 205). The evaluation value calculation unit 43 calculates the evaluation value of the partial character string, and the evaluation value is registered in the candidate registration unit 44.

ついで全ての部分文字列の比較処理が終了した否かの判定が行われ（ステップ２０６）、終了していなければ、次の部分文字列の比較処理に進み、全ての部分文字列に対する比較処理が終了すれば、候補登録部４４に登録された候補の評価、すなわち語順に基づく適切な候補か否かの判定が候補評価部４５にて行われる（ステップ２０７）。ここで、複数行の認識結果文字列がある場合には、前記の処理（ステップ２０２〜２０６）が各行ごとに繰り返され、語順判定で適切と判定された候補が複数残った場合には、部分文字列間隔に基づく絞り込みにより最終候補を選択する処理が候補評価部４５にて行われる。 Next, it is determined whether or not all the partial character string comparison processes have been completed (step 206). If not, the process proceeds to the next partial character string comparison process, and the comparison process for all the partial character strings is performed. If completed, the candidate evaluation unit 45 determines whether the candidate is registered in the candidate registration unit 44, that is, whether the candidate is an appropriate candidate based on the word order (step 207). Here, when there are a plurality of lines of recognition result character strings, the above processing (steps 202 to 206) is repeated for each line, and when there are a plurality of candidates determined to be appropriate in the word order determination, The candidate evaluation unit 45 performs a process of selecting a final candidate by narrowing down based on the character string interval.

そしてキーワード位置が検出されたか否かの判定が行われ（ステップ２０８）、キーワード位置が検出された場合には、そのキーワード位置の情報が出力され（ステップ２０９）、他方、キーワード位置が検出されなかった場合には、キーワード検索が失敗したものとして（ステップ２１０）、終了する。 Then, a determination is made as to whether or not a keyword position has been detected (step 208). If a keyword position is detected, information on the keyword position is output (step 209), while no keyword position is detected. If it is found that the keyword search has failed (step 210), the process ends.

図１４は、図５に示したキーワード検索部３４の別の例を示すブロック図である。ここではキーワード検索部３４が、認識結果文字列内の比較対象文字列とキーワード文字列とのレーベンシュタイン距離（相違度）に基づいて、認識結果文字列上のキーワード文字列の概略位置を検出する概略位置検出部（第１の位置検出手段）５１及び詳細位置検出部（第２の位置検出手段）５２を有している。 FIG. 14 is a block diagram showing another example of the keyword search unit 34 shown in FIG. Here, the keyword search unit 34 detects the approximate position of the keyword character string on the recognition result character string based on the Levenshtein distance (difference) between the comparison target character string and the keyword character string in the recognition result character string. An approximate position detector (first position detector) 51 and a detailed position detector (second position detector) 52 are provided.

図１５は、図１４に示したレーベンシュタイン距離算出の要領を示している。レーベンシュタイン距離は、２つの文字列がどの程度異なっているかを数値化して表わすものであり、比較の基準となる文字列に対して、以下に示すように置換、消去、及び挿入の各条件が当てはまる場合に、レーベンシュタイン距離が１となり、２つの文字列が完全に一致する場合はレーベンシュタイン距離は０となる。 FIG. 15 shows a procedure for calculating the Levenshtein distance shown in FIG. The Levenshtein distance is a numerical representation of how different two character strings are. For a character string that serves as a reference for comparison, the conditions for replacement, deletion, and insertion are as follows: When this is the case, the Levenshtein distance is 1, and the Levenshtein distance is 0 if the two character strings completely match.

比較文字列１では、比較の基準となる正解文字列「ＡＢＣＤＥＦ」と完全に一致するため、レーベンシュタイン距離は０となる。比較文字列２では、「Ｃ」が「Ｘ」に置き換わっており、比較文字列３では、「Ｃ」が消去されており、比較文字列４では、「Ｃ」と「Ｄ」の間に「Ｘ」が挿入されており、この比較文字列２〜４では、１文字の置換、消去、及び挿入の各条件が当てはまるため、レーベンシュタイン距離は１となる。 In the comparison character string 1, the Levenshtein distance is 0 because it matches the correct character string “ABCDEF” that is a reference for comparison. In the comparison character string 2, “C” is replaced by “X”, in the comparison character string 3, “C” is deleted, and in the comparison character string 4, “C” and “D” are “ X "is inserted, and in the comparison character strings 2 to 4, the Levenshtein distance is 1 because the replacement, deletion, and insertion conditions of one character are applicable.

また、置換、消去、及び挿入の各条件が単独で複数回（複数文字）当てはまる場合や、置換、消去、及び挿入の各条件が複合して当てはまる場合には、その回数（文字数）に応じてレーベンシュタイン距離の値に１が加算される。比較文字列５では、「Ｂ」が消去され、またＥが「Ｙ」に置換され、また「Ｘ」が「Ｃ」と「Ｄ」の間に挿入されており、１文字の消去、１文字の置換、及び１文字の挿入が発生しているため、レーベンシュタイン距離は３となる。 When a single one of the replacement, deletion, and insertion conditions applies multiple times (to multiple characters), or when these conditions apply in combination, 1 is added to the Levenshtein distance for each occurrence (each character). In comparison character string 5, “B” is deleted, “E” is replaced with “Y”, and “X” is inserted between “C” and “D”. Since one deletion, one replacement, and one insertion have occurred, the Levenshtein distance is 3.
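The distances in these examples can be reproduced with the standard dynamic-programming computation of the Levenshtein distance, sketched below. The function name `levenshtein` is illustrative, and the comparison strings are written out from the description above (for instance, comparison string 5 is taken to be “ACXDYF”).

```python
def levenshtein(a, b):
    """Minimum number of single-character replacements, deletions and
    insertions needed to turn string `a` into string `b`."""
    # previous[j] is the distance between the processed prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # replace, or keep if equal
            ))
        previous = current
    return previous[-1]


correct = "ABCDEF"
distances = [levenshtein(s, correct)
             for s in ("ABCDEF", "ABXDEF", "ABDEF", "ABCXDEF", "ACXDYF")]
```

For the five comparison strings this yields 0, 1, 1, 1 and 3, matching the values given in the text.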

図１６は、図１４に示したキーワード検索部３４で行われるキーワード検索の手順を示すフロー図である。これは、図７に示したキーワード検索（ステップ１０４）で実行されるものである。 FIG. 16 is a flowchart showing a keyword search procedure performed by the keyword search unit 34 shown in FIG. This is executed by the keyword search (step 104) shown in FIG.

ここではまずキーワードの概略位置を検出する処理が概略位置検出部５１にて行われ（ステップ３０１）、ここで取得したキーワード概略位置の候補が候補登録部５６に登録される。ついで候補登録部５６に登録されたキーワード概略位置の候補を元にしてキーワードの詳細な位置を検出する処理が詳細位置検出部５２にて行われる（ステップ３０２）。 Here, the process of detecting the approximate position of the keyword is first performed by the approximate position detection unit 51 (step 301), and the keyword approximate position candidate obtained here is registered in the candidate registration unit 56. Next, a process for detecting the detailed position of the keyword based on the keyword approximate position candidates registered in the candidate registration unit 56 is performed by the detailed position detection unit 52 (step 302).

It is then determined whether a keyword position has been detected (step 303). If so, the keyword position information is output (step 304); if not, the keyword search is deemed to have failed (step 305) and the process ends.

Next, the approximate position detection unit 51 shown in FIG. 14 is described. The approximate position detection unit 51 includes: a distance calculation unit 53 that sets the length of the comparison target character string, which is taken from the recognition result character string and compared with the keyword character string, equal to the length of the keyword character string, and calculates the Levenshtein distance (degree of difference) between the comparison target character string and the keyword character string while shifting the comparison position; a provisional candidate registration unit 54 in which the Levenshtein distance obtained by the distance calculation unit 53 for each comparison position is registered as a provisional candidate; and a candidate evaluation unit 55 that selects, from the provisional candidates registered in the provisional candidate registration unit 54, the one with the smallest Levenshtein distance as a candidate for the approximate keyword position. The keyword position candidates obtained by the candidate evaluation unit 55 are registered in the candidate registration unit 56.

FIG. 17 shows how the approximate position detection unit 51 shown in FIG. 14 detects the approximate keyword position. The distance calculation unit 53 of the approximate position detection unit 51 compares the keyword character string with the recognition result character string obtained by character recognition of the title inclusion region. In doing so, it shifts the comparison position of the keyword character string against the recognition result character string one character at a time from the beginning of the line toward the end of the line, and at each comparison position obtains the Levenshtein distance between the comparison target character string in the recognition result character string and the keyword character string.

In this example, the keyword character string "請求書番号：" ("Invoice number:") is 7 characters long when a space is counted as one character. First, the 7-character string starting at the first character of the recognition result character string is compared with the keyword character string as the comparison target character string. Next, the 7-character string starting at the second character is compared in the same way, and the comparison is repeated until the comparison target character string reaches the end of the line of the recognition result character string.

Furthermore, when a character in the comparison target character string has been replaced by a different character, that is, when the first candidate does not match the corresponding character of the keyword character string, the lower-ranked candidates are also compared, and a value that depends on the rank of the matching candidate character is added to the Levenshtein distance. The value added for a match with a lower-ranked candidate character is set higher than the value for a match with the top-ranked candidate character (0) and lower than the value for a character substitution (1), and it increases as the rank of the candidate character falls.

Here, candidate characters down to the third rank are compared. A match with the second-ranked or third-ranked candidate character adds 0.1 or 0.2, respectively, to the Levenshtein distance. If none of the candidates down to the third matches, 1 is added to the Levenshtein distance as described above.

In the example of FIG. 17, when the 7-character string starting at the first character of the recognition result character string is compared with the keyword character string, "請" matches as the first candidate, whereas "求" matches as the second candidate and "書" matches as the third candidate, so the Levenshtein distance is 0.1 + 0.2 = 0.3.
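The rank-dependent cost described above can be sketched as a Levenshtein computation whose substitution cost is looked up from the OCR candidate list at each position. Everything named here is hypothetical: the function names, the representation of the recognition result as a list of ranked candidate lists, and the misrecognized candidates "水", "畫", and "晝", which merely stand in for whatever the character recognizer ranked above the correct characters.

```python
RANK_COST = (0.0, 0.1, 0.2)  # cost of a match with the 1st, 2nd, 3rd candidate
MISMATCH_COST = 1.0          # no match among the first three candidates


def match_cost(keyword_char: str, candidates: list[str]) -> float:
    """Cost of aligning one keyword character with one recognized position."""
    for rank, cand in enumerate(candidates[:len(RANK_COST)]):
        if cand == keyword_char:
            return RANK_COST[rank]
    return MISMATCH_COST


def weighted_distance(keyword: str, recognized: list[list[str]]) -> float:
    """Levenshtein distance with rank-dependent substitution cost."""
    prev = [float(j) for j in range(len(recognized) + 1)]
    for i, kc in enumerate(keyword, start=1):
        curr = [float(i)]
        for j, cands in enumerate(recognized, start=1):
            curr.append(min(
                prev[j] + 1.0,                       # keyword char missing
                curr[j - 1] + 1.0,                   # extra recognized char
                prev[j - 1] + match_cost(kc, cands)  # match or substitution
            ))
        prev = curr
    return prev[-1]


# Ranked candidates per position (hypothetical OCR output).
recognized = [["請"], ["水", "求"], ["畫", "晝", "書"]]
distance = weighted_distance("請求書", recognized)  # approximately 0.3
```

Because the costs are floating-point values, the result should be compared with a tolerance rather than tested for exact equality with 0.3.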

FIG. 18 shows an example of Levenshtein distance calculation results from the distance calculation unit 53 shown in FIG. 14. As in the examples of FIG. 17 and FIG. 8(B), characters have been replaced through misrecognition. The candidate evaluation unit 55 selects as a candidate the comparison position with the smallest Levenshtein distance; if several positions share the smallest distance, all of them are selected. In this example, as described above, the Levenshtein distance is smallest, at 0.3, at the comparison position whose start point is 0, so that position is selected as the candidate.

FIG. 19 shows another example of Levenshtein distance calculation results from the distance calculation unit 53 shown in FIG. 14. As in the example of FIG. 8(D), noise has caused a spurious character "，" to be inserted into the recognition result character string. In this example, the Levenshtein distance is 2 at the comparison position whose start point is 0 and also 2 at the comparison position whose start point is 1. Both share the smallest Levenshtein distance, so both are selected as candidates.

FIG. 20 is a flowchart showing the approximate keyword position detection procedure performed by the approximate position detection unit 51 shown in FIG. 14. It is executed in the approximate keyword position detection (step 301) shown in FIG. 16.

First, the length (number of characters) of the keyword character string is counted (step 401). Next, a comparison target character string whose length corresponds to that of the keyword character string is taken from the recognition result character string obtained by character recognition of the title inclusion region (step 402), the distance calculation unit 53 calculates the Levenshtein distance between the comparison target character string and the keyword character string (step 403), and the result is registered in the provisional candidate registration unit 54. The initial comparison position of the keyword character string against the recognition result character string is the beginning of the line.

Next, it is determined whether the Levenshtein distance has been calculated at every comparison position (step 404). If not, the comparison position is moved one character toward the end of the line, as shown in FIG. 17, and the Levenshtein distance is calculated there. Once the calculation has finished for all comparison positions, the candidate evaluation unit 55 selects, from the Levenshtein distances registered for each comparison position in the provisional candidate registration unit 54, the one with the smallest value as the approximate keyword position candidate (step 405), and that candidate is registered in the candidate registration unit 56.
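Steps 401 to 405 can be sketched as a fixed-length sliding window. The sketch below is an assumption-laden illustration: it uses the plain unit-cost Levenshtein distance (without the rank-dependent cost), hypothetical function names, and an ASCII example in which a spurious comma splits the keyword, loosely mirroring the situation of FIG. 19.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (rolling DP row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def approximate_positions(text: str, keyword: str) -> list[tuple[int, int]]:
    """Return every (start, distance) pair that shares the minimum distance."""
    kl = len(keyword)                                   # step 401
    provisional = [
        (start, levenshtein(text[start:start + kl], keyword))  # steps 402-403
        for start in range(len(text) - kl + 1)          # step 404: all positions
    ]
    if not provisional:
        return []
    best = min(d for _, d in provisional)               # step 405
    return [(s, d) for s, d in provisional if d == best]


candidates = approximate_positions("IN,VOICE NO 123", "INVOICE")
```

For this input, `candidates` contains two entries, start points 0 and 1, each with distance 2. Keeping all ties matters because the fixed window length cannot tell which of them lines up with the real keyword.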

Next, the detailed position detection unit 52 shown in FIG. 14 is described. The detailed position detection unit 52 includes: a comparison target range setting unit 57 that sets the comparison target range in the recognition result character string based on the approximate keyword position and the Levenshtein distance of a candidate registered in the candidate registration unit 56; a comparison target character string length setting unit 58 that sets, based on the Levenshtein distance of the candidate registered in the candidate registration unit 56, the range of lengths of the comparison target character string in the recognition result character string that is compared with the keyword character string; a distance calculation unit 59 that calculates the Levenshtein distance based on the range of comparison target character string lengths set by the comparison target character string length setting unit 58 and the comparison target range set by the comparison target range setting unit 57; a provisional candidate registration unit 60 in which the Levenshtein distance obtained by the distance calculation unit 59 for each comparison position and comparison target character string length is registered as a provisional candidate; and a candidate evaluation unit 61 that selects as a candidate the entry with the smallest Levenshtein distance among those registered in the provisional candidate registration unit 60.

FIG. 21 shows how the detailed position detection unit 52 shown in FIG. 14 detects the detailed keyword position. Insertion or deletion of characters through misrecognition can make the character string that corresponds to the keyword character string in the recognition result character string differ in length from the proper keyword character string. In this example, a spurious character "，" caused by noise has been inserted at the second position of the recognition result character string. The character string corresponding to the keyword character string in the recognition result character string is therefore 8 characters long, one more than the 7 characters of the proper keyword character string "請求書番号：", and the exact position of the keyword character string in the recognition result character string spans character positions 0 to 7.

When the character string corresponding to the keyword character string in the recognition result character string differs in length from the proper keyword in this way, the approximate position detection method described above, which detects the keyword position with the comparison target character string set to the same number of characters as the keyword character string, cannot obtain the exact position of the keyword character string in the recognition result character string.

The detailed position detection unit 52 therefore lengthens and shortens, within a required range, the comparison target character string in the recognition result character string that is compared with the keyword character string, obtains the Levenshtein distance for each length of the comparison target character string, and takes the one with the smallest Levenshtein distance as the exact position of the keyword character string.

First, the comparison target range setting unit 57 extends the approximate keyword position of the candidate obtained by the approximate position detection unit 51 forward and backward by a predetermined expansion width to set the comparison target range in the recognition result character string, that is, the range from which comparison target character strings are taken. The expansion width is the Levenshtein distance D of the approximate keyword position candidate. In this example, the approximate keyword position spans character positions 0 to 6; extending it by the Levenshtein distance D = 2 on each side gives a comparison target range in the recognition result character string spanning character positions -2 to 8.

If the Levenshtein distance D obtained by the approximate position detection unit 51 is not an integer, it is converted to an integer by an operation such as rounding up. It is also possible to use, in place of D, the value D2 obtained from the following equation using the misrecognition rate E (0 ≤ E ≤ 1.0) obtained by the character recognition unit 33.
D2 = D + D × E

Based on the Levenshtein distance D of the approximate keyword position candidate obtained by the approximate position detection unit 51, the comparison target character string length setting unit 58 sets the range of the length ML of the comparison target character string in the recognition result character string that is compared with the keyword character string, that is, its maximum value MLmax and minimum value MLmin. MLmin is obtained by subtracting the Levenshtein distance D from the keyword character string length KL, and MLmax is obtained by adding D to KL. In this example, KL = 7 and D = 2, so MLmin = KL - D = 7 - 2 = 5 and MLmax = KL + D = 7 + 2 = 9.
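The range settings described above reduce to a few lines of arithmetic. The following sketch uses a hypothetical function name and makes one assumption that the text leaves open: when the misrecognition rate E is supplied, D2 = D + D × E is computed first and the result is then rounded up to an integer.

```python
import math
from typing import Optional


def search_bounds(st: int, kl: int, d: float, e: Optional[float] = None):
    """Return ((range_start, range_end), (ml_min, ml_max)).

    st: start point ST of the approximate keyword position
    kl: keyword character string length KL
    d:  Levenshtein distance D of the approximate candidate
    e:  optional misrecognition rate E (0 <= E <= 1.0)
    """
    if e is not None:
        d = d + d * e              # D2 = D + D x E
    d_int = math.ceil(d)           # non-integer distances are rounded up
    compare_range = (st - d_int, st + kl - 1 + d_int)
    length_range = (kl - d_int, kl + d_int)
    return compare_range, length_range


# Values from the example: ST = 0, KL = 7, D = 2.
bounds = search_bounds(0, 7, 2)
```

With the values from the example, `bounds` is `((-2, 8), (5, 9))`, matching the comparison target range of -2 to 8 and the length range of 5 to 9 given in the text. The negative start shows that a practical implementation has to decide how to treat positions before the beginning of the line, which the text does not specify.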

The distance calculation unit 59 calculates the Levenshtein distance LD between the comparison target character string and the keyword character string while shifting the comparison position one character at a time within the comparison target range set by the comparison target range setting unit 57 and, at each comparison position, while changing the length of the comparison target character string one character at a time within the range of lengths set by the comparison target character string length setting unit 58.

The start point of the comparison target range is set as the initial value of the comparison position MS, that is, the comparison start position, and MS is shifted one character at a time from there toward the end point of the comparison target range. In this example, the initial value of MS is ST - D = -2, obtained by subtracting the Levenshtein distance D from the start point ST of the approximate keyword position. Likewise, the minimum value MLmin is set as the initial value of the comparison target character string length ML, and ML is increased one character at a time until it reaches the maximum value MLmax. In this example, the initial value of ML is MLmin = KL - D = 5.

FIG. 22 shows the provisional candidate list registered in the provisional candidate registration unit 60 shown in FIG. 14. In this list, the Levenshtein distance LD obtained by the distance calculation unit 59 for each combination of comparison position (start point) MS and comparison target character string length ML is registered, with a candidate management number, as a provisional candidate for the detailed keyword position.

Based on this provisional candidate list, the candidate evaluation unit 61 selects the candidate with the smallest Levenshtein distance LD as the final candidate. In this example, the candidate with candidate management number 13 is the final candidate. According to it, the detailed position of the keyword character string is 0, the value of the comparison position (start point) MS; the length of the keyword character string within the recognition result character string is 8, the value of the comparison target character string length ML; and the keyword character string occupies character positions 0 to 7 of the recognition result character string in the numbering of FIG. 21.

The candidate evaluation unit 61 further determines whether the final candidate is valid as the position of the keyword character string. An abnormally large Levenshtein distance LD indicates an unsuitable position, so the LD of the final candidate is compared with a predetermined threshold TH. If LD exceeds TH, the final candidate is judged unsuitable as the position of the keyword character string and the keyword search is deemed to have failed.

When the approximate position detection unit 51 obtains several approximate keyword position candidates, provisional candidates for the detailed keyword position obtained for different approximate candidates can share the same comparison conditions, that is, the same comparison position MS and comparison target character string length ML. Such duplicate candidates necessarily have the same Levenshtein distance LD, so the Levenshtein distance calculation is skipped for a duplicate and it is not registered in the provisional candidate list.

FIG. 23 is a flowchart showing the detailed keyword position detection procedure performed by the detailed position detection unit 52 shown in FIG. 14. It is executed in the detailed keyword position detection (step 302) of FIG. 16.

First, one approximate keyword position candidate that was obtained by the approximate position detection unit 51 and registered in the candidate registration unit 56 is taken out (step 501), and its Levenshtein distance D is taken out (step 502). Next, the keyword character string length KL is calculated (step 503). The comparison target range setting unit 57 then sets the comparison target range in the recognition result character string (step 504), and the start point of the comparison target range is set as the initial value of the comparison position MS. The comparison target character string length setting unit 58 sets the range of lengths of the comparison target character string in the recognition result character string, that is, the maximum value MLmax and the minimum value MLmin (step 505), and MLmin is set as the initial value of the comparison target character string length ML.

Next, the distance calculation unit 59 calculates the Levenshtein distance between the comparison target character string in the recognition result character string and the keyword character string under the conditions given by the current comparison position MS and comparison target character string length ML (step 506), and the Levenshtein distance obtained is registered in the provisional candidate list (see FIG. 22) of the provisional candidate registration unit 60 (step 507).

It is then determined whether the comparison target character string length ML has reached the maximum value MLmax (step 508). If not, ML is incremented by 1 (step 509) and the Levenshtein distance calculation and registration are performed with the comparison target character string lengthened by one character. This is repeated until ML reaches MLmax.

When ML has reached MLmax, it is next determined whether the comparison position MS has reached the value obtained by adding the Levenshtein distance D to the start point ST of the approximate keyword position, that is, whether MS has reached the final position within the comparison target range (step 510). If not, MS is incremented by 1 (step 511) and the Levenshtein distance calculation and registration are performed at the comparison position shifted by one character. This is repeated until MS reaches the final position.

When MS has reached the final position, it is determined whether all approximate keyword position candidates registered in the candidate registration unit 56 have been processed (step 512); if not, processing moves on to the next candidate. Once all approximate keyword position candidates have been processed, the candidate evaluation unit 61 refers to the provisional candidate list of the provisional candidate registration unit 60 and selects the entry with the smallest Levenshtein distance LD as the final candidate, then determines whether the final candidate is valid as the position of the keyword character string by comparing LD with the predetermined threshold TH (step 513). If it is judged valid, the keyword position of the final candidate is output.
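The loop of steps 501 to 513 can be sketched as follows. This is an illustration under stated assumptions rather than the patented implementation: the function names are hypothetical, the unit-cost Levenshtein distance is used, the threshold value is chosen arbitrarily by the caller, and windows that would extend outside the recognition result character string (for example a negative MS) are simply skipped, a detail the text does not specify. A dictionary keyed by (MS, ML) reproduces the rule that duplicate comparison conditions are neither recalculated nor registered twice.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (rolling DP row)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def detailed_position(text, keyword, candidates, threshold):
    """Return (MS, ML, LD) of the final candidate, or None on failure."""
    kl = len(keyword)                                 # step 503
    provisional = {}                                  # (MS, ML) -> LD
    for st, d in candidates:                          # steps 501-502, 512
        d = math.ceil(d)
        for ms in range(st - d, st + d + 1):          # steps 504, 510-511
            for ml in range(kl - d, kl + d + 1):      # steps 505, 508-509
                if ms < 0 or ml < 1 or ms + ml > len(text):
                    continue                          # assumption: skip out-of-bounds windows
                if (ms, ml) in provisional:
                    continue                          # duplicate comparison condition
                provisional[(ms, ml)] = levenshtein(  # steps 506-507
                    text[ms:ms + ml], keyword)
    if not provisional:
        return None
    (ms, ml), ld = min(provisional.items(), key=lambda item: item[1])
    return (ms, ml, ld) if ld <= threshold else None  # step 513


result = detailed_position("IN,VOICE NO 123", "INVOICE",
                           [(0, 2), (1, 2)], threshold=2)
```

For this input, `result` is `(0, 8, 1)`: the keyword is located at start point 0 with length 8, one character longer than the keyword because of the spurious comma, which parallels the MS = 0, ML = 8 outcome described for FIG. 22.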

FIG. 24 is a flowchart showing another example of the keyword search procedure performed by the keyword search unit 34 shown in FIG. 5. It combines two methods: the keyword search by word order determination shown in FIG. 13 and the keyword search by Levenshtein distance shown in FIG. 16. In this case, the keyword search unit 34 has both the units shown in FIG. 9 and the units shown in FIG. 14.

First, the keyword search unit 34 performs the keyword search by word order determination (step 601), and it is then determined whether a keyword position has been detected (step 602). If a keyword position has been detected, the keyword position information is output (step 603).

If, on the other hand, the keyword search by word order determination does not detect a keyword position, the keyword search unit 34 performs the keyword search by Levenshtein distance (step 604), and it is then determined whether a keyword position has been detected (step 605). If a keyword position has been detected, the keyword position information is output (step 606); if not, the keyword search is deemed to have failed (step 607) and the process ends.
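The two-stage flow of steps 601 to 607 is a plain fallback chain. In the sketch below both search methods are passed in as callables, and all names are hypothetical. The `exact_find` helper is only a stand-in for a fast first stage; it is not the word order determination method, whose details are described elsewhere in the document.

```python
from typing import Callable, Optional

Search = Callable[[str, str], Optional[int]]


def search_keyword(text: str, keyword: str,
                   word_order_search: Search,
                   levenshtein_search: Search) -> Optional[int]:
    """Try the fast search first; fall back to the slower, more robust one."""
    position = word_order_search(text, keyword)  # steps 601-602
    if position is not None:
        return position                          # step 603
    # Steps 604-607: a None result here means the keyword search failed.
    return levenshtein_search(text, keyword)


def exact_find(text: str, keyword: str) -> Optional[int]:
    """Stand-in for the first stage (NOT the word order determination)."""
    index = text.find(keyword)
    return index if index >= 0 else None


# The second stage is stubbed out with a lambda purely for illustration.
found_fast = search_keyword("NO INVOICE", "INVOICE", exact_find, lambda t, k: None)
found_slow = search_keyword("IN,VOICE", "INVOICE", exact_find, lambda t, k: 0)
```

Passing the two stages as parameters keeps the ordering decision (cheap method first, expensive method only on failure) separate from how each method is implemented.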

The keyword search by word order determination has a relatively simple procedure and detects the keyword position quickly, so it is performed before the keyword search by Levenshtein distance, whose procedure is relatively complex and time-consuming. The keyword search by word order determination cannot detect the keyword position when, as in the example of FIG. 8(D), misrecognition has inserted a spurious character into the part of the recognition result character string that corresponds to a partial character string. The keyword search by Levenshtein distance is less affected by such misrecognition and can detect the keyword position accurately. Performing it when the keyword search by word order determination fails to detect the keyword position allows the keyword position to be detected reliably.

The title extraction device, image reading device, title extraction method, and title extraction program according to the present invention can accurately extract as a title even a character string that has no distinguishing appearance on the document image, and are useful as a title extraction device, image reading device, title extraction method, title extraction program, and the like that extract a title from a document image.

FIG. 1 is a block diagram showing the schematic configuration of an image processing apparatus to which the present invention is applied.
FIG. 2 is a block diagram showing the schematic configuration of an image reading apparatus to which the present invention is applied.
FIG. 3 is a diagram showing an overview of the processing in the title extraction unit shown in FIG. 1 and FIG. 2.
FIG. 4 is a diagram showing examples of how the relative position of the title character string with respect to the keyword character string shown in FIG. 3 is specified.
FIG. 5 is a block diagram showing the title extraction unit shown in FIG. 1 and FIG. 2.
FIG. 6 is a diagram showing an overview of the character recognition performed by the character recognition unit shown in FIG. 5.
FIG. 7 is a flowchart showing the procedure of the processing performed by the image processing apparatus and image reading apparatus shown in FIG. 1 and FIG. 2.
FIG. 8 is a diagram showing examples of misrecognition that occurs in character recognition by the character recognition unit shown in FIG. 5.
FIG. 9 is a block diagram showing the keyword search unit shown in FIG. 5.
FIG. 10 is a diagram showing an example of the character string division performed by the character string division unit shown in FIG. 9.
FIG. 11 is a diagram showing how the partial character string position detection unit shown in FIG. 9 detects partial character string positions.
FIG. 12 is a diagram showing the information registered in the candidate registration unit shown in FIG. 9.
FIG. 13 is a flowchart showing the keyword search procedure performed by the keyword search unit shown in FIG. 9.
FIG. 14 is a block diagram showing another example of the keyword search unit shown in FIG. 5.
FIG. 15 is a diagram showing how the Levenshtein distance calculation of FIG. 14 is performed.
FIG. 16 is a flowchart showing the keyword search procedure performed by the keyword search unit shown in FIG. 14.
FIG. 17 is a diagram showing how the approximate position detection unit shown in FIG. 14 detects the approximate keyword position.
FIG. 18 is a diagram showing an example of Levenshtein distance calculation results from the distance calculation unit shown in FIG. 14.
FIG. 19 is a diagram showing another example of Levenshtein distance calculation results from the distance calculation unit shown in FIG. 14.
FIG. 20 is a flowchart showing the approximate keyword position detection procedure performed by the approximate position detection unit shown in FIG. 14.
FIG. 21 is a diagram showing how the detailed position detection unit shown in FIG. 14 detects the detailed keyword position.
FIG. 22 is a diagram showing the provisional candidate list registered in the provisional candidate registration unit shown in FIG. 14.
FIG. 23 is a flowchart showing the detailed keyword position detection procedure performed by the detailed position detection unit shown in FIG. 14.
FIG. 24 is a flowchart showing another example of the keyword search procedure performed by the keyword search unit shown in FIG. 5.
FIG. 25 is a diagram showing an example of a document image.

Explanation of symbols

3 Title extraction unit (title extraction device)
21 Title inclusion area
22, 23 Rectangular areas
31 Extraction condition acquisition unit (extraction condition acquisition means)
32 Extraction condition storage unit
33 Character recognition unit (character recognition means)
34 Keyword search unit (keyword search means)
35 Title position acquisition unit (title position acquisition means)
36 Title output unit (title output means)
41 Character string dividing unit (character string dividing means)
42 Partial character string position detection unit (partial character string position detecting means)
43 Evaluation value calculation unit
44 Candidate registration unit
45 Candidate evaluation unit (evaluation means)
51 Approximate position detection unit (first position detecting means)
52 Detailed position detection unit (second position detecting means)
53 Distance calculation unit
54 Provisional candidate registration unit
55 Candidate evaluation unit
56 Candidate registration unit
57 Comparison target range setting unit
58 Comparison target character string length setting unit
59 Distance calculation unit
60 Provisional candidate registration unit
61 Candidate evaluation unit

Claims

A title extraction device comprising:
extraction condition acquisition means for acquiring a keyword character string described in the vicinity of a title character string on a document image, and relative position information of the title character string with respect to the keyword character string;
character recognition means for performing character recognition on an area of the document image that includes at least the title character string and the keyword character string;
keyword search means for searching the recognition result of the character recognition means for the keyword character string acquired by the extraction condition acquisition means and acquiring its position;
title position acquisition means for acquiring the position of the title character string based on the position of the keyword character string acquired by the keyword search means and the relative position of the title character string with respect to the keyword character string acquired by the extraction condition acquisition means; and
title output means for outputting data of the title character string based on the position of the title character string acquired by the title position acquisition means.
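The flow recited in this claim (acquire a keyword and a relative position, search the recognition result for the keyword, derive the title position, output the title) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the recognition result is already available as a list of text lines, and the names `ExtractionCondition`, `find_keyword`, `extract_title` and the `"right"` / `"below"` encoding of the relative position are hypothetical choices made for this example only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ExtractionCondition:
    keyword: str   # keyword character string written near the title
    relation: str  # hypothetical encoding of the relative position: "right" or "below"


def find_keyword(lines: List[str], keyword: str) -> Optional[Tuple[int, int]]:
    """Return (row, column) of the first exact occurrence of keyword, or None."""
    for row, text in enumerate(lines):
        col = text.find(keyword)
        if col != -1:
            return row, col
    return None


def extract_title(lines: List[str], cond: ExtractionCondition) -> Optional[str]:
    """Locate the keyword, then read the title from its relative position."""
    pos = find_keyword(lines, cond.keyword)
    if pos is None:
        return None
    row, col = pos
    if cond.relation == "right":
        # Title follows the keyword on the same line; drop separator characters.
        title = lines[row][col + len(cond.keyword):].strip(" :")
    elif cond.relation == "below":
        # Title is the line directly under the keyword line.
        title = lines[row + 1].strip() if row + 1 < len(lines) else ""
    else:
        raise ValueError(f"unknown relation: {cond.relation}")
    return title or None


# Made-up recognition result used only to exercise the sketch.
recognized = ["ACME Corp", "Subject: Quarterly Report", "Date: 2006-09-22"]
print(extract_title(recognized, ExtractionCondition("Subject", "right")))
```

The exact-match search used here fails as soon as character recognition misreads a character of the keyword, which is the situation the dependent claims on keyword search address.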

The title extraction device according to claim 1, wherein the extraction condition acquisition means acquires, through a user input operation, position information of a title inclusion area including the title character string and the keyword character string, and
the character recognition means performs character recognition on the title inclusion area acquired by the extraction condition acquisition means.

The title extraction device according to claim 1, wherein the keyword search means comprises:
character string dividing means for dividing the keyword character string into a plurality of partial character strings;
partial character string position detecting means for searching the recognition result character string of the character recognition means for each partial character string obtained by the character string dividing means and detecting the position of each partial character string; and
evaluation means for evaluating whether the position of the keyword character string, derived from the positions of the partial character strings obtained by the partial character string position detecting means, is appropriate, by comparing the word order given by those positions with the normal word order.
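The partial-string search recited in this claim can be sketched as follows. This is a minimal illustration, not the patented procedure: the fixed chunk size of two characters, the use of the first occurrence of each chunk, and the function names are all assumptions made for this example. The idea is that a keyword with one misrecognized character still leaves most of its chunks intact, and the keyword position is accepted only if the chunks that were found appear in the same order as in the keyword.

```python
from typing import List, Optional, Tuple


def split_keyword(keyword: str, size: int = 2) -> List[Tuple[int, str]]:
    """Divide the keyword into chunks, keeping each chunk's offset in the keyword."""
    return [(i, keyword[i:i + size]) for i in range(0, len(keyword), size)]


def locate_parts(text: str, parts: List[Tuple[int, str]]) -> List[Tuple[int, int]]:
    """Return (offset_in_keyword, position_in_text) for every chunk found in text."""
    found = []
    for offset, part in parts:
        pos = text.find(part)  # first occurrence only (a simplification)
        if pos != -1:
            found.append((offset, pos))
    return found


def order_is_consistent(found: List[Tuple[int, int]]) -> bool:
    """Accept only if at least two chunks were found and they appear in keyword order."""
    positions = [pos for _, pos in found]
    return len(found) >= 2 and all(a < b for a, b in zip(positions, positions[1:]))


def estimate_keyword_start(text: str, keyword: str, size: int = 2) -> Optional[int]:
    """Estimate where the keyword starts in text, or None if the evidence is inconsistent."""
    found = locate_parts(text, split_keyword(keyword, size))
    if not order_is_consistent(found):
        return None
    offset, pos = found[0]
    return pos - offset


# "Subjcct" stands in for a misrecognized "Subject".
print(estimate_keyword_start("Ref No. Subjcct: Annual Plan", "Subject"))
```

Because only the first occurrence of each chunk is considered, short chunks that also appear elsewhere in the text can cause a valid keyword to be rejected; a fuller version would keep several candidates per chunk and score them.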

The title extraction device according to claim 1, wherein the keyword search means comprises first and second position detecting means for detecting the position of the keyword character string on the recognition result character string based on the degree of difference between a comparison target character string in the recognition result character string and the keyword character string,
the first position detecting means detects the approximate position of the keyword character string on the recognition result character string based on the degree of difference obtained with the length of the comparison target character string set equal to that of the keyword character string, and
the second position detecting means limits the comparison target to a comparison target range based on the approximate position of the keyword character string acquired by the first position detecting means, and detects the detailed position of the keyword character string on the recognition result character string based on the degree of difference obtained while increasing or decreasing the length of the comparison target character string within a required range.
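The two-stage search recited in this claim can be sketched as below, using the Levenshtein distance (named in the drawing descriptions) as the degree of difference. This is an illustrative sketch under assumptions, not the patented procedure: the `margin` parameter that bounds both the comparison target range and the length variation, the tie-breaking rule, and the function names are hypothetical. `detailed_position` assumes a non-empty text.

```python
from typing import Tuple


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def approximate_position(text: str, keyword: str) -> int:
    """Stage 1: slide a window of the keyword's length and keep the closest start."""
    n = len(keyword)
    starts = range(max(1, len(text) - n + 1))
    return min(starts, key=lambda s: levenshtein(text[s:s + n], keyword))


def detailed_position(text: str, keyword: str, approx: int, margin: int = 2) -> Tuple[int, int, int]:
    """Stage 2: near approx, vary start and length; return (start, length, distance)."""
    n = len(keyword)
    best = None
    for start in range(max(0, approx - margin), min(len(text) - 1, approx + margin) + 1):
        for length in range(max(1, n - margin), n + margin + 1):
            cand = text[start:start + length]
            # Assumed tie-break: prefer the length closest to the keyword's.
            key = (levenshtein(cand, keyword), abs(len(cand) - n))
            if best is None or key < best[0]:
                best = (key, start, len(cand))
    return best[1], best[2], best[0][0]


# "Subjeect" stands in for a recognition result with one spurious character.
text = "No. Subjeect: Plan"
approx = approximate_position(text, "Subject")
print(approx, detailed_position(text, "Subject", approx))
```

Fixing the window length in the first stage keeps the scan over the whole recognition result cheap, while the second stage can absorb inserted or dropped characters because it only examines a small neighborhood.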

The title extraction device according to claim 1, wherein the title output means outputs text data of the title character string in the recognition result character string acquired by the character recognition means.

The title extraction device according to claim 1, wherein the title output means outputs image data created by cutting out the area of the title character string from the document image.

An image reading device comprising the title extraction device according to claim 1.

A title extraction method comprising:
an extraction condition acquisition step of acquiring a keyword character string described in the vicinity of a title character string on a document image, and relative position information of the title character string with respect to the keyword character string;
a character recognition step of performing character recognition on an area of the document image that includes at least the title character string and the keyword character string;
a keyword search step of searching the recognition result of the character recognition step for the keyword character string acquired in the extraction condition acquisition step and acquiring its position;
a title position acquisition step of acquiring the position of the title character string based on the position of the keyword character string acquired in the keyword search step and the relative position of the title character string with respect to the keyword character string acquired in the extraction condition acquisition step; and
a title output step of outputting data of the title character string based on the position of the title character string acquired in the title position acquisition step.

A title extraction program for causing a computer to execute each step in the title extraction method according to claim 8.