JP7206605B2

JP7206605B2 - Information processing equipment

Info

Publication number: JP7206605B2
Application number: JP2018053024A
Authority: JP
Inventors: 一憲宋; 拓也桜井; 久美藤原; 俊一木村; 裕越
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2018-03-20
Filing date: 2018-03-20
Publication date: 2023-01-18
Anticipated expiration: 2038-03-20
Also published as: JP2019164687A

Description

The present invention relates to an information processing apparatus.

The method disclosed in Patent Document 1 performs character recognition on an image on an input form and obtains a degree of similarity as the character recognition result. It compares the obtained degree of similarity with a pre-registered degree of confidence required of that character recognition. Based on the result of this comparison, it produces one of three outputs: an output of the character recognition result that requires no manual verification; an output that presents candidate character recognition results and prompts manual verification; or an output that prompts manual entry by requesting new manual input and confirmation for the character recognition result.

In the method disclosed in Patent Document 2, when character recognition software is changed from an old version to a new version, both versions are used to recognize characters in the actual system during the transition period. Information on the recognition accuracy of both versions is thereby collected statistically, and the two accuracies are compared. If the accuracy of the new version is higher than that of the old version, the introduction of the new version is confirmed. If, on the other hand, the recognition accuracy of the old version is better, the system need not switch entirely to the new version; the two versions can instead be operated in parallel, using the strengths of each.

In the method disclosed in Patent Document 3, character information is read from an input document by OCR and recognized by a recognition processing unit. An operator keys in the character information on the input document from a keyboard, a CPU compares the keyed-in character data with the recognition data obtained by character recognition, and portions of the keyed-in data that may be erroneous are displayed as anomalies on a CRT 15, thereby implementing verify input. For example, character data for which the keyed-in data matches the input document and the recognition data is judged to be erroneous, and character data for which both the recognition data and the keyed-in data are judged to be erroneous, are displayed as anomalies by inversion (white-on-black display), so that input data with a high likelihood of input error can be detected automatically.

The apparatus disclosed in Patent Document 4 comprises: image reading means for reading a form on which data has been entered as an electronic image form; OCR recognition means for recognizing the read electronic image form with two (or more) OCR engines of different characteristics, that is, engines that share no or few misrecognitions; and database storage means that automatically stores in a database the characters for which the recognition results match, and stores in the database, after confirmation and correction, the characters for which the results do not match as well as the characters for which the results match but the recognition reliability of either OCR engine is low.

In the information processing apparatus disclosed in Patent Document 5, classification means classifies a character recognition target into one of three types. Extraction means extracts the character recognition result of the target when the classification means classifies it into the first type. First control means, when the target is classified into the second type, extracts the character recognition result of the target and performs control so that the target is also entered manually. Second control means, when the target is classified into the third type, performs control so that the target is entered manually by a plurality of persons.

Patent Documents 6 to 11 describe various methods for calculating the recognition accuracy of character recognition.

Patent Document 1: JP 2003-346080 A
Patent Document 2: JP 2004-171326 A
Patent Document 3: JP H05-274467 A
Patent Document 4: JP 2010-073201 A
Patent Document 5: JP 2016-212812 A
Patent Document 6: JP H05-40853 A
Patent Document 7: JP H05-20500 A
Patent Document 8: JP H05-290169 A
Patent Document 9: JP H08-101880 A
Patent Document 10: JP H09-134410 A
Patent Document 11: JP H09-259226 A

When inputs are judged by a judging means, one way to obtain the accuracy rate of that judging means is to check the judgment result for each input using a method with higher judgment accuracy (for example, a human check) and to compute the proportion of correct judgments among all of those inputs. However, judgment by the higher-accuracy method costs more than judgment by the judging means; otherwise, the higher-accuracy method would simply be used in place of the judging means from the start. Applying that method to every input therefore imposes a heavy cost burden.

An object of the present invention is to obtain the accuracy rate of a judging means at a lower cost than a scheme that obtains the accuracy rate by using a different method to judge, for every input, whether the judgment result of the judging means is correct.

The invention according to claim 1 is an information processing apparatus comprising: recognition means for executing character recognition on an input and outputting a recognition result of the character recognition and a recognition accuracy; confirmation means for confirming whether the recognition result is correct or erroneous, adopting the recognition result if it is correct, and, if it is erroneous, obtaining a correct recognition result for the input and adopting the obtained recognition result; output control means for performing control so that, for an input whose recognition accuracy is equal to or greater than a threshold, the recognition result of the recognition means is output without involving the confirmation means, and, for an input whose recognition accuracy is less than the threshold, the recognition result adopted by the confirmation means is output; accuracy rate calculation means for calculating, as the accuracy rate of the recognition means in a first range lying within the range below the threshold, the ratio of inputs confirmed as correct by the confirmation means among the inputs whose recognition accuracy falls within the first range; and estimation means for estimating, based on the accuracy rate in the first range, the accuracy rate of the recognition means in a second range lying within the range equal to or greater than the threshold.

The invention according to claim 2 is the information processing apparatus according to claim 1, wherein the first range is a range from a value greater than 0, determined according to a predetermined criterion, up to the threshold.

The invention according to claim 3 is the information processing apparatus according to claim 1 or 2, wherein the estimation means treats the accuracy rate calculated by the accuracy rate calculation means as corresponding to a first representative value of the recognition accuracy in the first range, and estimates the accuracy rate corresponding to a second representative value of the recognition accuracy in the second range by linear interpolation between the accuracy rate corresponding to the first representative value and a predetermined maximum accuracy rate at the maximum value that the recognition accuracy can take.

The invention according to claim 4 is the information processing apparatus according to claim 1 or 2, wherein the accuracy rate calculation means obtains the accuracy rate for each of a plurality of ranges in which the recognition accuracy is less than the threshold, and the estimation means estimates the accuracy rate in the second range based on the trend in how the accuracy rates of the plurality of ranges change with the recognition accuracy.

The invention according to claim 5 is the information processing apparatus according to claim 1 or 2, wherein the accuracy rate calculation means obtains the accuracy rate for each of a plurality of ranges in which the recognition accuracy is less than the threshold, and the estimation means estimates, from the relationship between the accuracy rate of each of the plurality of ranges and the recognition accuracy, a function that gives the accuracy rate corresponding to a recognition accuracy, and estimates the accuracy rate in the second range using the estimated function.

The invention according to claim 6 is the information processing apparatus according to claim 1, wherein the estimation means obtains a probability density function of the recognition accuracy from the distribution of occurrence frequencies of the recognition accuracy, and estimates the accuracy rate in the second range using the probability density function.

The invention according to claim 7 is a program for causing a computer to function as: recognition means for executing character recognition on an input and outputting a recognition result of the character recognition and a recognition accuracy; confirmation means for confirming whether the recognition result is correct or erroneous, adopting the recognition result if it is correct, and, if it is erroneous, obtaining a correct recognition result for the input and adopting the obtained recognition result; output control means for performing control so that, for an input whose recognition accuracy is equal to or greater than a threshold, the recognition result of the recognition means is output without involving the confirmation means, and, for an input whose recognition accuracy is less than the threshold, the recognition result adopted by the confirmation means is output; accuracy rate calculation means for calculating, as the accuracy rate of the recognition means in a first range lying within the range below the threshold, the ratio of inputs confirmed as correct by the confirmation means among the inputs whose recognition accuracy falls within the first range; and estimation means for estimating, based on the accuracy rate in the first range, the accuracy rate of the recognition means in a second range lying within the range equal to or greater than the threshold.

According to the invention of claim 1, 3, or 7, the accuracy rate of a judging means can be obtained at a lower cost than with a scheme that obtains the accuracy rate by using a different method to judge, for every input, whether the judgment result of the judging means is correct.

According to the invention of claim 2, a more valid accuracy rate can be estimated for the second range than when the accuracy rate is calculated over the entire range from 0 to the threshold and the accuracy rate of the second range is estimated from it.

According to the invention of claim 4, 5, or 6, a more valid accuracy rate can be estimated than when the accuracy rate of the second range is estimated by linear interpolation.

FIG. 1 is a diagram illustrating the functional configuration of the information processing apparatus of the embodiment.
FIG. 2 is a diagram for explaining an example of a method of estimating the accuracy rate in the region where the recognition accuracy is equal to or greater than the threshold.
FIG. 3 is a diagram for explaining how the probability density function of the recognition accuracy is calculated.
FIG. 4 is a diagram for explaining another example of a method of estimating the accuracy rate in the region where the recognition accuracy is equal to or greater than the threshold.
FIG. 5 is a diagram for explaining still another example of a method of estimating the accuracy rate in the region where the recognition accuracy is equal to or greater than the threshold.
FIG. 6 is a diagram illustrating the internal configuration of the confirmation processing unit.

FIG. 1 shows an example embodiment of an information processing apparatus according to the present invention.

This information processing apparatus determines the character string contained in input image data by means of an OCR 10 and a confirmation processing unit 18.

The OCR 10 includes a recognition processing unit 12 and a recognition accuracy calculation unit 14. The recognition processing unit 12 recognizes the character string contained in the input image data by applying known OCR (optical character recognition) processing to that data, and outputs a text code representing the recognized character string. The recognition accuracy calculation unit 14 calculates the recognition accuracy of the text code recognized from the input image data. The recognition accuracy is a measure of how likely it is that the text code of the recognition result correctly represents the character string (which may be handwritten) contained in the input image data. The higher the recognition accuracy, the more likely it is that the text code of the recognition result is correct (that is, that it correctly represents the character string in the input image data). The probability that a recognition result is correct is hereinafter called the recognition rate or the accuracy rate. The OCR 10 may output a plurality of different recognition results for the input image data, each associated with its recognition accuracy, in descending order of recognition accuracy. The unit on which the OCR 10 performs character recognition (that is, the unit for which it outputs a recognition result) is not particularly limited; it may be, for example, a character, a row or column (for horizontal or vertical writing), a form field, a page, or a document.

The character recognition technique and the recognition accuracy calculation method used by the OCR 10 are not particularly limited; any conventional technique, including those exemplified in Patent Documents 6 to 11, or any technique developed in the future may be used.

The selection unit 16 controls the output of the character recognition result (text code) of the recognition processing unit 12 based on the recognition accuracy that the recognition accuracy calculation unit 14 calculated for that result. Specifically, if the recognition accuracy is equal to or greater than a certain threshold, the selection unit 16 outputs the character recognition result as the final character recognition result of the information processing apparatus itself. In other words, when the recognition accuracy is at or above the threshold, the recognition by the recognition processing unit 12 is trusted to be correct.

If, on the other hand, the recognition accuracy is less than the threshold, the selection unit 16 passes the character recognition result and the corresponding input image data to the confirmation processing unit 18, which executes processing to confirm whether the character recognition result is correct.

In one example, the confirmation processing unit 18 presents the input image data and the character recognition result to a human confirmer and has the confirmer check whether the character recognition result is correct for the character string in the input image data. The confirmer may be operating a terminal connected to the information processing apparatus via a network such as the Internet. In that case, the confirmation processing unit 18 sends screen information (for example, a web page) displaying the input image data and the character recognition result to the terminal and accepts the confirmer's input on that screen information. If the confirmer judges the result to be correct, the confirmer makes an input to that effect to the confirmation processing unit 18, and in response the confirmation processing unit 18 outputs the character recognition result received from the selection unit 16 as the final character recognition result of the information processing apparatus itself. At this time, the confirmation processing unit 18 also stores, in an accumulation unit 20, confirmation result information indicating that the character recognition result of the recognition processing unit 12 was correct.

If the confirmer judges that the character recognition result received from the selection unit 16 is not correct for the character string in the input image data, the confirmer makes an input to the confirmation processing unit 18 to correct the result. In response, the confirmation processing unit 18 outputs the corrected character recognition result as the final character recognition result of the information processing apparatus itself. At this time, the confirmation processing unit 18 also stores, in the accumulation unit 20, confirmation result information indicating that the character recognition result of the recognition processing unit 12 was incorrect.

The above describes the case in which a human confirms the character recognition result of the OCR 10. Alternatively, the confirmation may be performed using another OCR that is more accurate than the OCR 10 but costs more per recognition (for example, a paid high-accuracy OCR service on the Internet operated by an entity other than the user of the information processing apparatus). In this case, the confirmation processing unit 18 has the other OCR recognize the input image data, receives its recognition result, and outputs the received result as the final character recognition result of the information processing apparatus itself. The confirmation processing unit 18 also compares the character recognition result of the recognition processing unit 12 received from the selection unit 16 with the recognition result received from the other OCR. If the two match, it stores in the accumulation unit 20 confirmation result information indicating that the character recognition result of the recognition processing unit 12 was correct; if they do not match, it stores confirmation result information indicating that the result was incorrect.

In this way, the confirmation processing unit 18 stores in the accumulation unit 20 confirmation result information indicating whether each character recognition result of the recognition processing unit 12 was correct or incorrect. The confirmation processing unit 18 judges a character recognition result of the recognition processing unit 12 only when the recognition accuracy of that result is less than the threshold. The confirmation result information stored in the accumulation unit 20 is therefore the set of correct/incorrect judgments for character recognition results whose recognition accuracy is less than the threshold.
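The routing performed by the selection unit 16 and the bookkeeping performed by the confirmation processing unit 18 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `ConfirmationRecord`, `select_output`, `confirm`, and `store` are hypothetical, `confidence` stands for the recognition accuracy, and the confirmer is modeled as a callable that receives only the OCR text (the description also passes the input image data).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ConfirmationRecord:
    """One entry of the accumulation unit (hypothetical structure)."""
    confidence: float   # recognition accuracy reported by the OCR
    was_correct: bool   # True if the confirmer kept the OCR text unchanged


def select_output(text: str, confidence: float, threshold: float,
                  confirm: Callable[[str], str],
                  store: List[ConfirmationRecord]) -> str:
    """Return the final recognition result for one input."""
    if confidence >= threshold:
        # At or above the threshold: trust the OCR, skip confirmation.
        return text
    # Below the threshold: the confirmer returns the text to adopt.
    confirmed = confirm(text)
    # Record whether the OCR result was correct for this confidence value.
    store.append(ConfirmationRecord(confidence, confirmed == text))
    return confirmed
```

Only inputs below the threshold ever reach `store`, which is why the accumulated records say nothing directly about the region at or above the threshold.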

The low-accuracy region accuracy rate calculation unit 22 calculates the accuracy rate of the recognition processing unit 12 for the low-accuracy region, that is, the range of recognition accuracy below the threshold. It does so based on the confirmation result information stored in the accumulation unit 20, that is, the correct/incorrect information for character recognition results whose recognition accuracy is less than the threshold. For example, the accuracy rate may be calculated by dividing the number of pieces of confirmation result information that indicate a correct result by the total number of pieces of confirmation result information used for the calculation.
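The division described above can be written as a short sketch. The function name and the `(confidence, was_correct)` tuple layout are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def low_region_accuracy_rate(records: Iterable[Tuple[float, bool]]) -> float:
    """Fraction of confirmed records whose OCR result was correct.

    Each record is (confidence, was_correct); all records are assumed to
    come from inputs whose recognition accuracy was below the threshold.
    """
    records = list(records)
    if not records:
        raise ValueError("no confirmation results have been accumulated")
    correct = sum(1 for _, ok in records if ok)
    return correct / len(records)
```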

The high-accuracy region accuracy rate estimation unit 24 estimates the accuracy rate of the recognition processing unit 12 for the high-accuracy region, that is, the range of recognition accuracy at or above the threshold, based on the accuracy rate of the low-accuracy region calculated by the low-accuracy region accuracy rate calculation unit 22. Examples of the estimation performed by the high-accuracy region accuracy rate estimation unit 24 are described below.

A first example will be described with reference to FIG. 2.

Let the recognition accuracy be a real number from 0 to 1, let U be the representative value of the low-accuracy region, and let V be the representative value of the high-accuracy region. If the midpoint of each region is used as its representative value and T is the threshold used by the selection unit 16, then U = T/2 and V = (T + 1)/2. In the example of FIG. 2, the accuracy rate (recognition rate) at a recognition accuracy of 1 is taken to be 1, and the accuracy rate α of the low-accuracy region calculated by the low-accuracy region accuracy rate calculation unit 22 is taken to be the accuracy rate at the representative value U of that region. The accuracy rate δ at the representative value V of the high-accuracy region is then estimated by linear interpolation. That is, the high-accuracy region accuracy rate estimation unit 24 obtains the accuracy rate δ using the following equation (1).
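Equation (1) is not reproduced in this text. Under the stated assumptions (the accuracy rate is α at recognition accuracy U and 1 at recognition accuracy 1), a straight line through those two points gives δ = α + (1 − α)(V − U)/(1 − U). The following sketch computes that value; the function names are hypothetical and the formula is a reconstruction from the description, not a quotation of the patent's equation.

```python
def interpolate_accuracy_rate(alpha: float, u: float, v: float) -> float:
    """Linear interpolation between (u, alpha) and (1, 1), evaluated at v."""
    if not 0.0 <= u < 1.0:
        raise ValueError("u must satisfy 0 <= u < 1")
    return alpha + (1.0 - alpha) * (v - u) / (1.0 - u)


def estimate_high_region_rate(alpha: float, threshold: float) -> float:
    """Use the midpoints U = T/2 and V = (T + 1)/2 as representative values."""
    u = threshold / 2.0
    v = (threshold + 1.0) / 2.0
    return interpolate_accuracy_rate(alpha, u, v)
```

For instance, with T = 0.8 and α = 0.6, the representative values are U = 0.4 and V = 0.9, and the interpolated estimate is roughly 0.933.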

In the above, the midpoints of the low-accuracy and high-accuracy regions themselves are used as the representative values U and V, but this is only one example. Alternatively, a representative value of the frequency distribution of the recognition accuracy in each region (or of a probability density function derived from it) may be used as U and V. That is, the recognition accuracies obtained by the recognition accuracy calculation unit 14 for the individual pieces of input image data are accumulated, and this accumulated information is used to obtain, for each interval of recognition accuracy, the count (occurrence frequency) of recognition accuracies falling in that interval. The representative values of the high-accuracy and low-accuracy regions are then obtained from the resulting frequency distribution (histogram). Because the accumulation unit 20 stores information only for the low-accuracy region, the output of the recognition accuracy calculation unit 14 is accumulated separately in order to obtain the distribution of recognition accuracy over the entire range. The mean, median, or mode, for example, may be used as the representative value of the frequency distribution.

The representative values U and V may also be obtained as mean values from the probability density function p(x) of the recognition accuracy, using the following equation (2).

Here, the probability density function p(x) may be obtained as follows.

As shown in FIG. 3, the range of the recognition accuracy x is first divided into a plurality of intervals. Let Z be the number of intervals and W the width of each interval. Let k be the index of an interval, where k is an integer from 1 to Z. The central value of interval k (that is, the sum of its lower and upper limits divided by 2) is taken as the interval representative value x_k. The recognition accuracies obtained by the recognition accuracy calculation unit 14 for the individual pieces of input image data are accumulated, and from this accumulated information the occurrence frequency (count) Y_k of recognition accuracies falling in each interval k is obtained. If the number of pieces of input image data (that is, the number of recognition accuracies) is N, the probability density value p(x_k) at the interval representative value is given by the following equation.
p(x_k) = Y_k / (N W)

This is a discrete probability density function. It may be interpolated by a known interpolation method into a continuous function, which is then used as the probability density function p(x).
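The histogram-based density p(x_k) = Y_k / (N W) can be sketched as below, together with one way to derive mean-type representative values from it. Equation (2) is not reproduced in this text, so `region_mean` is an assumption: it takes the density-weighted mean of the interval centers inside a region, which is a discrete stand-in for a mean computed from p(x). Function names and the choice of equal-width intervals over [0, 1] are also illustrative.

```python
from typing import List, Sequence, Tuple


def discrete_pdf(values: Sequence[float],
                 num_bins: int) -> Tuple[List[float], List[float]]:
    """Return (interval centers x_k, densities p(x_k)) over [0, 1]."""
    n = len(values)
    if n == 0 or num_bins <= 0:
        raise ValueError("need at least one value and one interval")
    width = 1.0 / num_bins                      # W
    counts = [0] * num_bins                     # Y_k
    for x in values:
        k = min(int(x / width), num_bins - 1)   # x == 1.0 goes to last bin
        counts[k] += 1
    centers = [(k + 0.5) * width for k in range(num_bins)]
    density = [c / (n * width) for c in counts]  # Y_k / (N W)
    return centers, density


def region_mean(centers: Sequence[float], density: Sequence[float],
                lower: float, upper: float) -> float:
    """Density-weighted mean of interval centers with lower <= x_k < upper."""
    pairs = [(x, p) for x, p in zip(centers, density) if lower <= x < upper]
    total = sum(p for _, p in pairs)
    if total == 0:
        raise ValueError("no probability mass in the requested region")
    return sum(x * p for x, p in pairs) / total
```

With a threshold T, `region_mean(centers, density, 0.0, T)` would play the role of U and `region_mean(centers, density, T, 1.0)` the role of V, provided T falls on an interval boundary.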

An improved version of the estimation method of the high-accuracy region accuracy rate estimation unit 24 described with reference to FIG. 2 is described next with reference to FIG. 4.

In the example of FIG. 2, the accuracy rate of the high-accuracy region is calculated from the accuracy rate over the entire low-accuracy region. However, the accuracy rate in the part of the range where the recognition accuracy is very low has little bearing on the accuracy rate in the high-accuracy region. In this improved method, therefore, the accuracy rate of the high-accuracy region is estimated from the accuracy rate of only the part of the low-accuracy region that is close to the threshold T, rather than from the entire low-accuracy region.

Specifically, a region lower limit S satisfying 0 < S < T is determined in advance, and the low-accuracy region accuracy rate calculation unit 22 calculates the accuracy rate α only from the confirmation result information in the accumulation unit 20 whose recognition accuracy x satisfies S ≤ x ≤ T. The way S is determined is not particularly limited. For example, S may be set to a fixed fraction (less than 1) of the threshold T. Alternatively, the data (confirmation result information) in the accumulation unit 20 may be selected in order of decreasing recognition accuracy x starting from the threshold T, and the recognition accuracy x at which the number of selected items reaches a predetermined proportion of the total number of items at or below the threshold T may be used as the lower limit S.
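The data-driven choice of S and the restricted accuracy rate can be sketched as follows. The function names, the `(confidence, was_correct)` tuple layout, and the rounding rule (taking the ceiling of the requested proportion, with at least one item) are assumptions; the description only says that S is the recognition accuracy reached when the selected items amount to a predetermined proportion.

```python
import math
from typing import Sequence, Tuple

Record = Tuple[float, bool]  # (confidence, was_correct), hypothetical layout


def lower_limit_by_proportion(records: Sequence[Record], threshold: float,
                              proportion: float) -> float:
    """Walk down from the threshold until `proportion` of items is covered."""
    below = sorted((x for x, _ in records if x <= threshold), reverse=True)
    if not below:
        raise ValueError("no records at or below the threshold")
    count = max(1, math.ceil(proportion * len(below)))
    return below[count - 1]


def accuracy_rate_in_range(records: Sequence[Record],
                           lower: float, upper: float) -> float:
    """Accuracy rate using only records with lower <= confidence <= upper."""
    selected = [ok for x, ok in records if lower <= x <= upper]
    if not selected:
        raise ValueError("no records in the requested range")
    return sum(selected) / len(selected)
```

The resulting α is then paired with a representative value U taken from the range S to T rather than from the whole low-accuracy region.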

The high-accuracy region accuracy rate estimation unit 24 obtains a representative value U of the recognition accuracy in the region from S to T by the same method as in the above embodiment. Then, taking the accuracy rate α of that region as the value at the representative value U, it calculates the accuracy rate δ of the high-accuracy region using equation (1) above.

Because this improved method estimates the accuracy rate of the high-accuracy region from the accuracy rate of the part of the low-accuracy region that is close to the high-accuracy region, the accuracy rate of the high-accuracy region can be estimated more accurately than when it is estimated from the accuracy rate of the entire low-accuracy region.
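Equation (1) is defined earlier in the description and is not reproduced in this section. As a hedged illustration only, the sketch below assumes the linear interpolation recited in claim 3: a straight line from the point (U, α) to a predetermined maximum accuracy rate at the maximum recognition accuracy. The function name, the default maximum values of 1.0, and the sample numbers are assumptions.

```python
def interpolate_accuracy_rate(u, alpha, v, x_max=1.0, rate_max=1.0):
    """Accuracy rate at recognition accuracy v on the straight line through
    (u, alpha) and (x_max, rate_max).

    x_max and rate_max default to 1.0 here; both defaults are assumptions.
    """
    if not u < x_max:
        raise ValueError("u must be below the maximum recognition accuracy")
    slope = (rate_max - alpha) / (x_max - u)
    return alpha + slope * (v - u)


# U: representative accuracy of the region S..T, V: representative accuracy
# of the high-accuracy region T..1 (sample values, not from the patent)
U, alpha = 0.75, 0.90
V = 0.90
delta = interpolate_accuracy_rate(U, alpha, V)
```

Narrowing the region to S..T moves U closer to T, so the interpolation starts from a point that better reflects behaviour near the high-accuracy region.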

A further modification will be described with reference to FIG. 5.

In this modification, as shown in FIG. 5, the low-accuracy region accuracy rate calculation unit 22 divides the low-accuracy region into N small regions (N is an integer of 2 or more) and, for each small region, calculates an accuracy rate from the confirmation result information accumulated in the accumulation unit 20 whose recognition accuracy belongs to that small region. In the example of FIG. 5 the low-accuracy region is divided into four small regions, but this is only an example. The low-accuracy region accuracy rate calculation unit 22 then treats the accuracy rate α of each small region as the accuracy rate at a representative value x of that small region (for example, the recognition accuracy midway between the upper and lower limits of the small region); these points are marked with X in FIG. 5.
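The per-region calculation can be sketched as below. The sketch assumes equal-width small regions starting at 0 and uses the midpoint of each region as its representative value; the description does not fix either choice, and the function name and sample records are hypothetical.

```python
def subregion_accuracy_rates(records, threshold, n):
    """Split [0, threshold) into n equal-width small regions and return a
    (midpoint, accuracy rate) pair for every region that contains data."""
    if n < 2:
        raise ValueError("n must be 2 or more")
    width = threshold / n
    totals = [0] * n
    correct = [0] * n
    for x, ok in records:
        if 0.0 <= x < threshold:
            k = min(int(x / width), n - 1)   # index of the small region
            totals[k] += 1
            correct[k] += 1 if ok else 0
    points = []
    for k in range(n):
        if totals[k]:                        # skip empty regions
            points.append(((k + 0.5) * width, correct[k] / totals[k]))
    return points


# (recognition accuracy x, confirmed correct?) pairs; made-up values
records = [(0.10, False), (0.15, True),
           (0.25, True), (0.30, False), (0.35, True),
           (0.42, True), (0.45, False), (0.50, True), (0.55, True),
           (0.70, True), (0.75, True)]
points = subregion_accuracy_rates(records, threshold=0.8, n=4)
```

Each returned pair corresponds to one of the X marks in FIG. 5 and serves as an input point for estimating α(x).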

The high-accuracy region accuracy rate estimation unit 24 estimates the function α(x) by a known technique such as polynomial approximation or curve fitting, on the assumption that the accuracy rate α is a function α(x) of the recognition accuracy x. Then, using this function α(x), it estimates the accuracy rate δ of the high-accuracy region by the following equation (3).
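As one hedged example of such a fit, the sketch below estimates α(x) as a least-squares straight line (a degree-1 polynomial) through the per-region points. The description allows any known fitting technique, so the choice of a line, the clamping of the result to the interval 0 to 1, and the names used are assumptions made for illustration.

```python
def fit_line(points):
    """Least-squares straight line through (x, accuracy rate) points.

    Returns a function alpha(x). The result is clamped to [0, 1] because an
    accuracy rate is a proportion; the clamp is an addition of this sketch.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are needed")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0.0:
        raise ValueError("points must not all share the same x")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def alpha(x):
        return min(1.0, max(0.0, intercept + slope * x))

    return alpha


# Representative values and accuracy rates of four small regions (made up)
points = [(0.1, 0.5), (0.3, 0.6), (0.5, 0.7), (0.7, 0.8)]
alpha = fit_line(points)
estimate_at_09 = alpha(0.9)   # extrapolated into the high-accuracy region
```

A higher-degree polynomial or another curve could be substituted where the per-region points show clear curvature.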

The high-accuracy region accuracy rate estimation unit 24 may also estimate the accuracy rate δ of the high-accuracy region using the following equation (4) instead of equation (3).

In equation (4), p(x) is the probability density function p(x) described above. Put the other way around, equation (3) is the form that equation (4) takes when the probability density function p(x) is assumed to be a uniform distribution.
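The equation images for (3) and (4) are not reproduced in this text. One form consistent with the surrounding description, offered as an assumption rather than as the published equations, is the mean of α(x) over the high-accuracy region T ≤ x ≤ 1, unweighted for (3) and weighted by p(x) for (4):

```latex
\delta = \frac{1}{1 - T} \int_{T}^{1} \alpha(x)\, dx
\qquad \text{(3, assumed form)}

\delta = \frac{\displaystyle\int_{T}^{1} \alpha(x)\, p(x)\, dx}
              {\displaystyle\int_{T}^{1} p(x)\, dx}
\qquad \text{(4, assumed form)}
```

When p(x) is constant it cancels between numerator and denominator, the denominator integrates to a multiple of (1 − T), and the second form reduces to the first, which agrees with the statement that (3) is the uniform-distribution case of (4).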

Equations (3) and (4) give the accuracy rate for the high-accuracy region as a whole, that is, for the entire range of recognition accuracy x from the threshold T to 1. Generalizing this, the high-accuracy region accuracy rate estimation unit 24 may estimate the accuracy rate for a range T₁ ≤ x ≤ T₂ within the high-accuracy region (where T ≤ T₁ < T₂) by the following equation (5).
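The equation image for (5) is likewise not reproduced. The sketch below assumes that it is the p(x)-weighted mean of α(x) over T₁ ≤ x ≤ T₂ and evaluates that mean numerically with the midpoint rule. The function name, the step count, and the sample α(x) and p(x) are assumptions for illustration.

```python
def range_accuracy_rate(alpha, p, t1, t2, steps=1000):
    """Mean of alpha(x) over [t1, t2], weighted by the density p(x),
    approximated with the midpoint rule."""
    if not t1 < t2:
        raise ValueError("t1 must be less than t2")
    h = (t2 - t1) / steps
    numerator = 0.0
    denominator = 0.0
    for k in range(steps):
        x = t1 + (k + 0.5) * h      # midpoint of the k-th slice
        weight = p(x)
        numerator += alpha(x) * weight
        denominator += weight
    if denominator == 0.0:
        raise ValueError("p(x) is zero over the whole range")
    return numerator / denominator


alpha = lambda x: 0.45 + 0.5 * x     # sample fitted accuracy function
uniform = lambda x: 1.0              # uniform density
rising = lambda x: x                 # density that favours high accuracy

delta_uniform = range_accuracy_rate(alpha, uniform, 0.8, 1.0)
delta_weighted = range_accuracy_rate(alpha, rising, 0.8, 1.0)
delta_partial = range_accuracy_rate(alpha, uniform, 0.9, 1.0)
```

Setting t1 to T and t2 to 1 covers the whole high-accuracy region, while a narrower range gives the estimate for just that band of recognition accuracy.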

A further modification will be described with reference to FIG. 6.

FIG. 6 shows, for the information processing apparatus of this modification, an example of the internal configuration of the confirmation processing unit 18 together with the accumulation unit 20, the low-accuracy region accuracy rate calculation unit 22, and the high-accuracy region accuracy rate estimation unit 24. The information processing apparatus of this modification further includes an OCR 10 and a selection unit 16 similar to those shown in FIG. 1.

When the recognition accuracy calculated by the recognition accuracy calculation unit 14 for input image data is less than the threshold, the selection unit 16 instructs the confirmation processing unit 18 to execute its processing. At this time, the selection unit 16 inputs the input image data, and the character recognition result of the recognition processing unit 12 for that input image data, to the confirmation processing unit 18. The character recognition result is passed to the matching unit 184, and the input image data is passed to the manual input unit 182.

The manual input unit 182 presents the image represented by the received input image data to a human inputter and accepts input of the character string that the inputter reads from the image. The manual input unit 182 can be regarded as a character recognition unit that uses a human as its character recognition engine. The inputter who performs the character recognition may be at a location remote from the information processing apparatus, connected via a network such as the Internet. In that case, the manual input unit 182 provides the image represented by the input image data, for example in the form of a web page, over the network to a terminal operated by the inputter, and receives over the network the character string that the inputter enters as the recognition result. The character string that the manual input unit 182 receives from the inputter is input to the matching unit 184.

The matching unit (X) 184 compares the character recognition result of the recognition processing unit 12 of the OCR 10 with the character string that the manual input unit 182 received from the inputter, and determines whether the two match or do not match. If the two match, the matching unit 184 outputs the matched recognition result as the final character recognition result of the information processing apparatus. If the two do not match, the matching unit 184 causes the manual input unit 186 to execute its processing. The matching unit 184 also accumulates the result of the comparison, matching result X (a value indicating "match" or "non-match"), in the accumulation unit 20. The value of matching result X is a binary value indicating match or non-match. In the following, as an example and for convenience of calculation, matching result X is "1" for a match and "0" for a non-match (the same applies to the matching units 188A and 188B described later). Each matching result X accumulated in the accumulation unit 20 is associated with identification information i of the input image data (for example, a serial number assigned to each item of input data in order), so that it is possible to identify which input image data the matching result corresponds to.

On receiving the non-match trigger from the matching unit 184, the manual input unit 186 presents the image represented by the input image data to a second inputter, who is a different person from the inputter of the manual input unit 182, and accepts input of the character string that the second inputter reads from the image. The character string that the manual input unit 186 accepts from the second inputter is output as the final character recognition result of the information processing apparatus for that input image data.

The manual input unit 186 may always accept a character string from the second inputter for the same input image data, in parallel with the OCR 10 and the manual input unit 182. Alternatively, it may do so only when the determination result of the matching unit 184 is a non-match, which reduces the cost of the processing of the manual input unit 186 (for example, the cost of the second inputter).

The OCR 10, the manual input unit 182, the matching unit 184, and the manual input unit 186 together form the recognition mechanism responsible for character recognition of input image data in the low-accuracy region, that is, the region where the recognition accuracy is less than the threshold.
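The flow through this recognition mechanism can be sketched as follows, for the variant in which the second inputter is consulted only on a non-match. The two inputters are modelled as plain callables; the function name, parameter names, and sample strings are hypothetical stand-ins for the manual input units 182 and 186.

```python
def confirm_low_accuracy(ocr_text, image, first_entry, second_entry):
    """Return (final_text, x), where x is matching result X: 1 if the OCR
    result and the first inputter's entry match, otherwise 0.

    first_entry and second_entry are callables standing in for manual
    input units 182 and 186; each takes the image and returns a string.
    """
    first_text = first_entry(image)
    if first_text == ocr_text:
        # Match: the agreed string is the final recognition result.
        return ocr_text, 1
    # Non-match: a different person reads the image and that reading is final.
    return second_entry(image), 0


image = b"..."                         # placeholder for input image data
first = lambda img: "12345"            # first inputter's reading
second = lambda img: "12345"           # second inputter's reading

agreed_text, agreed_x = confirm_low_accuracy("12345", image, first, second)
fixed_text, fixed_x = confirm_low_accuracy("12845", image, first, second)
```

In the second call the OCR result differs from the first inputter's entry, so the second inputter's reading becomes the output and X is recorded as 0.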

Meanwhile, the matching units 188A and 188B described below, the accumulation unit 20, and the low-accuracy region accuracy rate calculation unit 22 accumulate a large number of results of the determinations made by the recognition mechanism and, based on the accumulated information, calculate the accuracy rates of the OCR 10 and of the manual input unit 182 in the low-accuracy region. The accuracy rate of the recognition mechanism as a whole in the low-accuracy region may also be calculated.

Specifically, the matching unit 188A compares the character recognition result of the OCR 10 with the character string accepted by the manual input unit 186, and accumulates the result of the comparison (matching result A) in the accumulation unit 20 in association with the identification information i of the input image data. The matching unit 188B compares the character string accepted by the manual input unit 182 with the character string accepted by the manual input unit 186, and accumulates the result of the comparison (matching result B) in the accumulation unit 20 in association with the identification information i of the input image data.

For each item of input data i, the accumulation unit 20 accumulates the three matching results X_i, A_i, and B_i produced by the matching units 184, 188A, and 188B.

The low-accuracy region accuracy rate calculation unit 22 uses the matching results X_i, A_i, and B_i accumulated in the accumulation unit 20 to calculate the accuracy rates of the OCR 10, the manual input unit 182, and the recognition mechanism in the low-accuracy region.

The method by which the low-accuracy region accuracy rate calculation unit 22 calculates the accuracy rates will now be described. First, the method of calculating the accuracy rate α of the OCR 10 and the accuracy rate β of the manual input unit 182 is described.

This calculation method calculates the accuracy rates α and β on the basis of the following three premises (a), (b), and (c).
(a) If the matching result X of the matching unit 184 is "match", the recognition results of the OCR 10 and of the manual input unit 182 are both correct.
(b) If the matching result A of the matching unit 188A is "match", the recognition result of the OCR 10 is correct.
(c) If the matching result B of the matching unit 188B is "match", the inputter's entry accepted by the manual input unit 182 is correct.

In other words, the recognition result of the OCR 10 is regarded as correct when it matches the character string entered at the manual input unit 182 or at the manual input unit 186, and the character string entered at the manual input unit 182 is regarded as correct when it matches the recognition result of the OCR 10 or the character string entered at the manual input unit 186; the accuracy rates α and β are obtained on that basis. Based on these premises, the low-accuracy region accuracy rate calculation unit 22 calculates the accuracy rates α and β according to the following equation (6).

Here, i is the serial number serving as the identification information of the input image data, and N is the total number of items of input data. "P|Q" is an operation whose value is 1 if P or Q is 1, and 0 if both P and Q are 0.
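The equation image for (6) is not reproduced in this text. From premises (a) to (c) and the definition of "P|Q", a consistent reading is α = (1/N) Σ (X_i | A_i) and β = (1/N) Σ (X_i | B_i); the sketch below computes the rates under that assumption. The function name and the sample records are hypothetical.

```python
def low_region_accuracy_rates(results):
    """Compute (alpha, beta) from accumulated matching results.

    results: list of (X_i, A_i, B_i) tuples, each value 0 or 1.
    alpha counts inputs where the OCR result matched either inputter;
    beta counts inputs where the first inputter matched the OCR result
    or the second inputter.
    """
    n = len(results)
    if n == 0:
        raise ValueError("no matching results accumulated")
    alpha = sum(x | a for x, a, _ in results) / n
    beta = sum(x | b for x, _, b in results) / n
    return alpha, beta


# One (X_i, A_i, B_i) tuple per input; made-up values
results = [
    (1, 1, 1),   # OCR, first and second inputter all agree
    (1, 1, 1),
    (0, 0, 1),   # first and second inputter agree, OCR differs
    (0, 0, 1),
    (0, 1, 0),   # OCR and second inputter agree, first inputter differs
]
alpha, beta = low_region_accuracy_rates(results)
```

Python's `|` on the integers 0 and 1 behaves exactly as the "P|Q" operation defined above.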

Note that when the matching result of the matching unit 184 is "match", the manual input unit 186 may be left out of the process, so that no second entry is made. In that case no result is obtained from the manual input unit 186, so the matching results of the matching units 188A and 188B, which rely on it, may both be set to "0". When this is done, the low-accuracy region accuracy rate calculation unit 22 may calculate the accuracy rates by the following equation (7) instead of equation (6) above.
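The equation image for (7) is not reproduced in this text. Under the arrangement just described, A_i and B_i are forced to 0 whenever X_i is 1, so X_i and A_i (and likewise X_i and B_i) are never both 1, and the "P|Q" operation gives the same value as an ordinary sum. One form that equation (7) may therefore take, stated here as an assumption rather than as the published equation, is:

```latex
\alpha = \frac{1}{N} \sum_{i=1}^{N} \left( X_i + A_i \right),
\qquad
\beta = \frac{1}{N} \sum_{i=1}^{N} \left( X_i + B_i \right)
```

Here i and N have the same meaning as in equation (6).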

Next, the processing for obtaining the accuracy rate γ of the recognition mechanism of this information processing apparatus for the low-accuracy region (that is, the part made up of the OCR 10, the manual input unit 182, the matching unit 184, and the manual input unit 186) will be described. Here, the manual input unit 182 and the manual input unit 186 are assumed to have the same characteristics; that is, they are regarded as having statistically equal accuracy rates.

Assume that the accuracy rates α and β of the OCR 10 and the manual input unit 182 in the low-accuracy region have already been calculated by the method described above. In this example, as noted above, the manual input unit 186 can be regarded as having the same accuracy rate β as the manual input unit 182 when the number of items of input data is sufficiently large. The low-accuracy region accuracy rate calculation unit 22 can therefore calculate the accuracy rate γ by the following equation.
γ = αβ + (1 − αβ)β

In more detail, the recognition mechanism as a whole gives a correct result in two cases: (a) the recognition result of the OCR 10 is correct and the entry accepted by the manual input unit 182 is also correct, and (b) case (a) does not hold and the entry at the manual input unit 186 is correct. Case (a) occurs with probability αβ. Case (b) occurs with probability (1 − αβ)β, the product of the probability (1 − αβ) that (a) does not hold and the probability β that the manual input unit 186 is correct. The sum of the probabilities of (a) and (b) is the final accuracy rate γ.
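The two-case argument can be checked numerically with a short sketch. It takes the accuracy rate of the manual input unit 186 to be equal to that of the manual input unit 182, which the description denotes β, in line with the stated assumption that the two manual input units have the same characteristics. The function name and sample values are hypothetical.

```python
def mechanism_accuracy_rate(alpha, beta):
    """Accuracy rate gamma of the low-accuracy recognition mechanism.

    alpha: accuracy rate of the OCR; beta: accuracy rate of the first
    manual input unit, also used for the second one (assumed equal).
    """
    both_correct = alpha * beta                 # case (a)
    fallback = (1.0 - both_correct) * beta      # case (b)
    return both_correct + fallback


gamma = mechanism_accuracy_rate(alpha=0.90, beta=0.95)
```

Because case (b) adds a non-negative term to αβ, γ is never lower than the probability that the OCR and the first inputter are both correct.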

The high-accuracy region accuracy rate estimation unit 24 uses the accuracy rate α of the OCR 10 in the low-accuracy region, calculated by the low-accuracy region accuracy rate calculation unit 22, to estimate the accuracy rate of the OCR 10 in the high-accuracy region (that is, where the recognition accuracy is at or above the threshold) by the method of the above embodiment or of any of the modifications. When the accuracy rate of the system as a whole is to be estimated, γ above may be used as the accuracy rate of the low-accuracy region, and the accuracy rate of the whole system in the high-accuracy region may be estimated from γ by the same methods.

Compared with a scheme in which a single person checks the character recognition result of the OCR 10 (that is, a scheme in which that one person's reading is always taken as correct), the confirmation processing unit 18 illustrated in FIG. 6 can make the character recognition result in the low-accuracy region (that is, the output of the confirmation processing unit 18) more accurate, and can in turn make the calculated accuracy rate of the OCR 10 in the low-accuracy region more accurate.

In the example of FIG. 6 the character recognition result of the OCR 10 is checked by people, but it may be checked by means other than people. For example, a character recognition system expected to have a higher accuracy rate than the OCR 10 may be used as the checking means. This arrangement is useful where that character recognition system is costly to use: whenever the OCR 10 can be expected to give a sufficient accuracy rate, the costlier system is not used, which reduces cost.

The embodiment and modifications described above all recognize character strings in input image data, but their techniques are not limited to character recognition and are applicable to information processing apparatuses in general that determine the content of input data and output the determination result. Consider a system in which, when the accuracy of the determination made by a determination means (of which the OCR 10 is one example), that is, the degree of likelihood that the determination result is correct, is at or above a threshold, the determination result is output as it is, and when it is below the threshold, the determination result is checked by other means and corrected if wrong. The methods of the embodiment and modifications can be applied in such a system to obtain the accuracy rate of the determination means in the range where the accuracy is at or above the threshold.
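The general scheme described above can be sketched independently of character recognition. In this minimal illustration the determination means and the checking means are passed in as callables, and each checked item is logged so that a low-accuracy accuracy rate can be computed later. All names, the log layout, and the sample data are assumptions.

```python
def route(item, judge, check, threshold, log):
    """Output the judged result directly when its accuracy is at or above
    the threshold; otherwise have it checked, log the outcome, and output
    the checked result.

    judge(item) -> (result, accuracy); check(item, result) -> final result.
    log receives (accuracy, was_correct) pairs for checked items only.
    """
    result, accuracy = judge(item)
    if accuracy >= threshold:
        return result
    final = check(item, result)
    log.append((accuracy, final == result))
    return final


# Hypothetical judged results and ground truth for three inputs
guesses = {"a": ("A", 0.95), "b": ("8", 0.40), "c": ("C", 0.55)}
truth = {"a": "A", "b": "B", "c": "C"}

log = []
outputs = [
    route(key, guesses.__getitem__, lambda item, result: truth[item], 0.8, log)
    for key in "abc"
]
```

The log plays the role of the accumulated confirmation result information: only items below the threshold are checked, so only they contribute observed correctness values.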

In one example, the information processing apparatus described above can be configured as hardware logic circuits. In another example, it may be realized by causing a built-in computer to execute a program describing the functions of the functional modules of the apparatus. The computer has, as hardware, a circuit configuration in which, for example, a processor such as a CPU, memory (primary storage) such as random access memory (RAM) and read-only memory (ROM), an HDD controller that controls a hard disk drive (HDD), various I/O (input/output) interfaces, and a network interface that controls connection to a network such as a local area network are connected via a bus. A disk drive for reading from and/or writing to portable disk recording media such as CDs and DVDs, a memory reader/writer for reading from and/or writing to portable nonvolatile recording media of various standards such as flash memory, and the like may also be connected to the bus, for example via an I/O interface. A program describing the processing of the functional modules exemplified above is stored in a fixed storage device such as a hard disk drive, via a recording medium such as a CD or DVD or via communication means such as a network, and is installed on the computer. The functional modules exemplified above are realized when the program stored in the fixed storage device is read into the RAM and executed by a processor such as a CPU. The information processing apparatus may also be configured as a combination of software and hardware.

10 OCR, 12 recognition processing unit, 14 recognition accuracy calculation unit, 16 selection unit, 18 confirmation processing unit, 20 accumulation unit, 22 low-accuracy region accuracy rate calculation unit, 24 high-accuracy region accuracy rate estimation unit, 182, 186 manual input units, 184, 188A, 188B matching units.

Claims

1. An information processing apparatus comprising:
recognition means for performing character recognition on an input and outputting a recognition result and a recognition accuracy of the character recognition;
confirmation means for confirming whether the recognition result is correct or incorrect, adopting the recognition result if it is correct and, if it is incorrect, obtaining a correct recognition result for the input and adopting the obtained recognition result;
output control means for outputting the recognition result of the recognition means, without involving the confirmation means, for an input whose recognition accuracy is equal to or higher than a threshold, and outputting the recognition result adopted by the confirmation means for an input whose recognition accuracy is less than the threshold;
accuracy rate calculation means for calculating, as the accuracy rate of the recognition means in a first range within the range below the threshold, the proportion of inputs confirmed as correct by the confirmation means among the inputs whose recognition accuracy falls within the first range; and
estimation means for estimating, based on the accuracy rate in the first range, the accuracy rate of the recognition means in a second range within the range equal to or higher than the threshold.

2. The information processing apparatus according to claim 1, wherein the first range is a range from a value greater than 0, determined according to a predetermined criterion, to the threshold.

3. The information processing apparatus according to claim 1 or 2, wherein the estimation means treats the accuracy rate calculated by the accuracy rate calculation means as corresponding to a first representative value of the recognition accuracy in the first range, and estimates the accuracy rate corresponding to a second representative value of the recognition accuracy in the second range by linear interpolation between the accuracy rate corresponding to the first representative value and a predetermined maximum accuracy rate at the maximum value that the recognition accuracy can take.

4. The information processing apparatus according to claim 1, wherein the accuracy rate calculation means obtains the accuracy rate for each of a plurality of ranges in which the recognition accuracy is less than the threshold, and the estimation means estimates the accuracy rate in the second range based on the tendency of the accuracy rate in each of the plurality of ranges to change with the recognition accuracy.

5. The information processing apparatus according to claim 1, wherein the accuracy rate calculation means obtains the accuracy rate for each of a plurality of ranges in which the recognition accuracy is less than the threshold, and the estimation means estimates, from the relationship between the accuracy rate and the recognition accuracy for each of the plurality of ranges, a function that gives the accuracy rate corresponding to a recognition accuracy, and uses the estimated function to estimate the accuracy rate in the second range.

6. The information processing apparatus according to claim 1, wherein the estimation means obtains a probability density function of the recognition accuracy from the distribution of occurrence frequencies of the recognition accuracy, and estimates the accuracy rate in the second range using the probability density function.

7. A program for causing a computer to function as:
recognition means for performing character recognition on an input and outputting a recognition result and a recognition accuracy of the character recognition;
confirmation means for confirming whether the recognition result is correct or incorrect, adopting the recognition result if it is correct and, if it is incorrect, obtaining a correct recognition result for the input and adopting the obtained recognition result;
output control means for outputting the recognition result of the recognition means, without involving the confirmation means, for an input whose recognition accuracy is equal to or higher than a threshold, and outputting the recognition result adopted by the confirmation means for an input whose recognition accuracy is less than the threshold;
accuracy rate calculation means for calculating, as the accuracy rate of the recognition means in a first range within the range below the threshold, the proportion of inputs confirmed as correct by the confirmation means among the inputs whose recognition accuracy falls within the first range; and
estimation means for estimating, based on the accuracy rate in the first range, the accuracy rate of the recognition means in a second range within the range equal to or higher than the threshold.