JP2019159374A

JP2019159374A - Information processing apparatus and program

Info

Publication number: JP2019159374A
Application number: JP2018040657A
Authority: JP
Inventors: Shunichi Kimura (木村 俊一)
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2018-03-07
Filing date: 2018-03-07
Publication date: 2019-09-19

Abstract

PROBLEM TO BE SOLVED: In a system in which the determination result produced by determination means for an input is processed by one post-stage process selected from a plurality of post-stage processes, to determine thresholds that reflect accumulated information on past determination results, the thresholds dividing the range of determination accuracy into the sections used to select the post-stage process.

SOLUTION: N post-stage processes are each associated with a different section of the recognition accuracy (corresponding to the determination accuracy) of a character recognition unit (an example of the determination means) in the preceding stage. A large number of pairs, each consisting of a recognition accuracy obtained in a past character recognition and correct/incorrect information (information indicating whether that recognition was correct), are received as learning data. In order from the section with the highest recognition accuracy, the threshold defining each section is determined so that the correct answer rate obtained from the correct/incorrect information of the learning data belonging to that section satisfies the target correct answer rate of the determination means for that section.

SELECTED DRAWING: Figure 5

Description

The present invention relates to an information processing apparatus and a program.

The method disclosed in Patent Document 1 performs character recognition on an image of an input form and obtains a similarity as the character recognition result. It compares the obtained similarity with a pre-registered certainty required of that character recognition and, depending on the result of the comparison, does one of three things: it outputs the character recognition result in a way that requires no manual verification; it presents character recognition candidates as choices and prompts manual verification; or it presents the character recognition result for new manual input and confirmation and prompts manual entry.

The character recognition device disclosed in Patent Document 2 has: character recognition means that recognizes the coordinate point sequence of a handwritten character and outputs a group of recognition candidate characters; feature extraction means that calculates the average writing speed of the coordinate point sequence of the handwritten character as a feature quantity for calculating the reliability of the candidate character group under evaluation output by the character recognition means; reliability calculation means that calculates the reliability of that candidate character group based on the feature quantity from the feature extraction means and the statistical tendency of sample data; and post-processing control means that controls post-processing of that candidate character group based on the reliability from the reliability calculation means.

The method disclosed in Patent Document 3 extracts logical elements from an input document image, identifies whether each extracted logical element is a character string region, and performs character recognition on the identified character string regions. It displays the result as text when the certainty of the recognition result is at or above a threshold, and as a partial image when the certainty is below the threshold.

In the information processing apparatus disclosed in Patent Document 4, classification means classifies a character recognition target into one of three types. When the target is classified into the first type, extraction means extracts the character recognition result of the target. When it is classified into the second type, first control means extracts the character recognition result of the target and performs control so that the target is also entered manually. When it is classified into the third type, second control means performs control so that the target is entered manually by a plurality of people.

Patent Documents 5 to 10 describe various methods for calculating the recognition accuracy of character recognition.

Patent Document 1: JP 2003-346080 A
Patent Document 2: JP 2003-296661 A
Patent Document 3: JP 2000-259847 A
Patent Document 4: JP 2016-212812 A
Patent Document 5: JP H05-40853 A
Patent Document 6: JP H05-20500 A
Patent Document 7: JP H05-290169 A
Patent Document 8: JP H08-101880 A
Patent Document 9: JP H09-134410 A
Patent Document 10: JP H09-259226 A

In a system that makes a determination on an input and has the determination result processed by whichever of a plurality of post-stage processes corresponds to the section containing the determination accuracy of that result, thresholds that divide those sections must be set. It is desirable that those thresholds reflect accumulated information on past determination results. However, no device or method for determining such thresholds has been proposed so far.

The invention according to claim 1 is an information processing apparatus that determines thresholds for a determination system. The determination system includes: determination means that makes a determination on an input; calculation means that calculates the determination accuracy of the determination means for the input; a plurality of post-processing means, each capable of generating an output for the input by performing post-stage processing on the determination result of the determination means, which differ from one another in the degree to which generation of the output depends on that determination result and which are respectively associated with the sections obtained by dividing the possible range of the determination accuracy by one or more thresholds; and control means that performs control so that the post-processing means corresponding to the section containing the determination accuracy calculated by the calculation means generates the output for the input. The information processing apparatus determines the thresholds that divide the sections of the determination accuracy, and includes: acquisition means that acquires a set of pairs, one for each past input to the determination means, each pair consisting of the determination accuracy for that input and correct/incorrect information indicating whether the determination result of the determination means for that input was correct or incorrect; and determining means that uses the set acquired by the acquisition means to determine, in order from the section with the highest determination accuracy, the threshold defining each section so that the correct answer rate of the determination means obtained from the pairs belonging to that section satisfies the target correct answer rate of the determination means for that section.
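The determining means of claim 1 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure of the embodiment: the function name `determine_thresholds`, the convention that a section contains accuracies at or above its threshold, the choice of the widest run of samples that still meets the target, and the handling of an empty section are all assumptions introduced here.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, bool]  # (determination accuracy, was the result correct?)


def determine_thresholds(samples: Sequence[Sample],
                         targets: Sequence[float]) -> List[float]:
    """Return one lower threshold per section, highest section first.

    targets[k] is the target correct answer rate of the k-th section counted
    from the top. A section is assumed to hold the samples whose accuracy is
    at or above its threshold and below the threshold of the section above.
    """
    ordered = sorted(samples, key=lambda s: s[0], reverse=True)
    thresholds: List[float] = []
    start = 0  # index of the first sample not yet assigned to a section
    for target in targets:
        correct = 0
        best_end = start  # exclusive end of the widest run meeting the target
        for i in range(start, len(ordered)):
            correct += int(ordered[i][1])
            # Only cut between distinct accuracy values so ties stay together.
            is_cut = i + 1 == len(ordered) or ordered[i + 1][0] != ordered[i][0]
            if is_cut and correct / (i + 1 - start) >= target:
                best_end = i + 1
        if best_end == start:
            # No run meets the target: treat the section as empty (assumption).
            thresholds.append(thresholds[-1] if thresholds else float("inf"))
        else:
            thresholds.append(ordered[best_end - 1][0])
        start = best_end
    return thresholds
```

Taking the widest qualifying run sends as many inputs as possible to the section being processed, which matches the cost ordering of claim 3 only if higher sections are cheaper; a different selection rule would be equally consistent with the claim wording.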

The invention according to claim 2 is the information processing apparatus according to claim 1, wherein the target recognition rate of the determination means for each section is higher for sections of higher determination accuracy, and the determining means determines the threshold defining each section in order from the section with the highest target recognition rate.

The invention according to claim 3 is the information processing apparatus according to claim 1 or 2, wherein the post-processing means corresponding to a section of higher determination accuracy uses a lower-cost method of generating the output from the determination result, and the determining means determines the threshold defining each section in order from the section with the lowest cost.

The invention according to claim 4 is the information processing apparatus according to any one of claims 1 to 3, wherein the post-processing means corresponding to the section with the highest determination accuracy outputs the determination result of the determination means as is, and the target correct answer rate set for the determination system is used as the target correct answer rate for that post-processing means.

The invention according to claim 5 is the information processing apparatus according to any one of claims 1 to 4, wherein the plurality of post-processing means includes a second type of post-processing means that generates the output for the input without using the determination result of the determination means, and the second type of post-processing means is associated with the section with the lowest determination accuracy.

The invention according to claim 6 is the information processing apparatus according to claim 5, wherein the target correct answer rate corresponding to the second type of post-processing means is 0.

The invention according to claim 7 is the information processing apparatus according to any one of claims 1 to 3, wherein the plurality of post-processing means consists of: first post-processing means that outputs the determination result of the determination means as is; second post-processing means that generates the output based on a determination made by a person on the input, without using the determination result of the determination means; and third post-processing means that generates the output by checking the determination result of the determination means against a determination made by a person on the input. The target correct answer rate set for the determination system is used as the target correct answer rate for the first post-processing means, 0 is used as the target correct answer rate for the second post-processing means, and the target correct answer rate for the third post-processing means is obtained from the correct answer rate of the person and the correct answer rate of the determination means.

The invention according to claim 8 is the information processing apparatus according to any one of claims 1 to 7, wherein, instead of pairs of determination accuracy and correct/incorrect information, the acquisition means acquires pairs each consisting of a determination accuracy and information on the cumulative result of the correct/incorrect information for the determination accuracies in the range from the maximum determination accuracy down to that determination accuracy, and the determining means uses the cumulative-result information for each determination accuracy to obtain the correct answer rate of the section whose threshold is being determined.
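How cumulative results can yield the correct answer rate of a section is sketched below in Python. The row layout (accuracy, cumulative correct count, cumulative sample count), the helper `build_cumulative`, and the function `section_rate` are hypothetical names chosen for illustration; the claim itself only states that cumulative-result information is acquired together with each determination accuracy.

```python
from typing import List, Sequence, Tuple

# (accuracy, cumulative correct count, cumulative sample count), where both
# counts run from the maximum accuracy down to this row. Assumed layout.
Row = Tuple[float, int, int]


def build_cumulative(samples: Sequence[Tuple[float, bool]]) -> List[Row]:
    """Build the cumulative table once, in descending order of accuracy."""
    rows: List[Row] = []
    correct = 0
    ordered = sorted(samples, key=lambda s: s[0], reverse=True)
    for total, (accuracy, is_correct) in enumerate(ordered, start=1):
        correct += int(is_correct)
        rows.append((accuracy, correct, total))
    return rows


def section_rate(rows: Sequence[Row], above: int, last: int) -> float:
    """Correct answer rate of rows above+1 .. last (inclusive).

    `above` is the index of the last row of the section above, or -1 when the
    section starts at the maximum accuracy. Requires last > above.
    """
    prev_correct, prev_total = (rows[above][1], rows[above][2]) if above >= 0 else (0, 0)
    return (rows[last][1] - prev_correct) / (rows[last][2] - prev_total)
```

With such a table, the rate of any candidate section is a difference of two stored counts, so the threshold search does not need to re-sum the correct/incorrect flags.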

The invention according to claim 9 is a program for causing a computer to function as an information processing apparatus that determines thresholds for a determination system. The determination system includes: determination means that makes a determination on an input; calculation means that calculates the determination accuracy of the determination means for the input; a plurality of post-processing means, each capable of generating an output for the input by performing post-stage processing on the determination result of the determination means, which differ from one another in the degree to which generation of the output depends on that determination result and which are respectively associated with the sections obtained by dividing the possible range of the determination accuracy by one or more thresholds; and control means that performs control so that the post-processing means corresponding to the section containing the determination accuracy calculated by the calculation means generates the output for the input. The information processing apparatus determines the thresholds that divide the sections of the determination accuracy. The program causes the computer to function as: acquisition means that acquires a set of pairs, one for each past input to the determination means, each pair consisting of the determination accuracy for that input and correct/incorrect information indicating whether the determination result of the determination means for that input was correct or incorrect; and determining means that uses the set acquired by the acquisition means to determine, in order from the section with the highest determination accuracy, the threshold defining each section so that the correct answer rate of the determination means obtained from the pairs belonging to that section satisfies the target correct answer rate of the determination means for that section.

The invention according to claim 10 is an information processing apparatus including: determination means that makes a determination on an input; calculation means that calculates the determination accuracy of the determination means for the input; a plurality of post-processing means, each capable of generating an output for the input by performing post-stage processing on the determination result of the determination means, which differ from one another in the degree to which generation of the output depends on that determination result and which are respectively associated with the sections obtained by dividing the possible range of the determination accuracy by one or more thresholds; control means that performs control so that the post-processing means corresponding to the section containing the determination accuracy calculated by the calculation means generates the output for the input; acquisition means that acquires a set of pairs, one for each past input to the determination means, each pair consisting of the determination accuracy for that input and correct/incorrect information indicating whether the determination result of the determination means for that input was correct or incorrect; and determining means that uses the set acquired by the acquisition means to determine, in order from the section with the highest determination accuracy, the threshold defining each section so that the correct answer rate of the determination means obtained from the pairs belonging to that section satisfies the target correct answer rate of the determination means for that section.

The invention according to claim 11 is a program for causing a computer to function as: determination means that makes a determination on an input; calculation means that calculates the determination accuracy of the determination means for the input; a plurality of post-processing means, each capable of generating an output for the input by performing post-stage processing on the determination result of the determination means, which differ from one another in the degree to which generation of the output depends on that determination result and which are respectively associated with the sections obtained by dividing the possible range of the determination accuracy by one or more thresholds; control means that performs control so that the post-processing means corresponding to the section containing the determination accuracy calculated by the calculation means generates the output for the input; acquisition means that acquires a set of pairs, one for each past input to the determination means, each pair consisting of the determination accuracy for that input and correct/incorrect information indicating whether the determination result of the determination means for that input was correct or incorrect; and determining means that uses the set acquired by the acquisition means to determine, in order from the section with the highest determination accuracy, the threshold defining each section so that the correct answer rate of the determination means obtained from the pairs belonging to that section satisfies the target correct answer rate of the determination means for that section.

According to the inventions of claims 1, 2, 4, 5, 6, 9, 10, and 11, thresholds that reflect accumulated information on past determination results can be determined as the thresholds that divide the plurality of sections of recognition accuracy corresponding to the plurality of post-stage processes.

According to the invention of claim 3, the cost of determination for the determination system as a whole can be reduced compared with the case where this invention is not used.

According to the invention of claim 7, the target correct answer rate of each section can be determined automatically.

According to the invention of claim 8, the cumulative result of the correct/incorrect information need not be calculated when the thresholds are determined.

FIG. 1 is a diagram showing an example of a determination system to which the threshold setting processing apparatus of the embodiment is applied.
FIG. 2 is a diagram for explaining the learning data input to the threshold setting processing apparatus.
FIG. 3 is a diagram illustrating the functional configuration of the threshold setting processing apparatus.
FIG. 4 is a diagram illustrating the processing procedure of the threshold calculation unit.
FIG. 5 is a diagram for explaining the processing performed by the threshold calculation unit.
FIG. 6 is a diagram for explaining how the processing of the threshold calculation unit proceeds.
FIG. 7 is a diagram illustrating the detailed procedure of the threshold determination processing in the threshold calculation unit.
FIG. 8 is a diagram illustrating the main part of a specific example of the determination system.

FIG. 1 shows an example of a threshold setting processing apparatus 20, which is one embodiment of the information processing apparatus according to the present invention, and of a determination system that uses it.

This determination system determines the character string contained in input image data using an OCR 10 and N post-processing units 18-1, 18-2, ..., 18-N (N is an integer of 2 or more; hereinafter collectively called the post-processing units 18 when they need not be distinguished from one another).

The OCR 10 recognizes the character string contained in the input image data by performing known OCR (optical character recognition) processing on that data. The OCR 10 outputs a pair consisting of a text code representing the recognized character string and a recognition accuracy. The recognition accuracy is a measure of how likely it is that the text code of the recognition result correctly represents the character string (which may be handwritten) contained in the input image data. The higher the recognition accuracy, the more likely it is that the text code of the recognition result is correct (that is, that it correctly represents the character string in the input image data). The probability that a recognition result is correct is hereinafter called the recognition rate or the correct answer rate. The OCR 10 may output a plurality of different recognition results for the input image data, each associated with its recognition accuracy, in descending order of recognition accuracy. The unit in which the OCR 10 performs character recognition (that is, the unit for which it outputs a recognition result) is not particularly limited and may be, for example, a character, a row or column (for horizontal or vertical writing), a page, or a document.

The character recognition method and the recognition accuracy calculation method used by the OCR 10 are not particularly limited; any conventional method, including those exemplified in Patent Documents 5 to 10, or any method developed in the future may be used.

In principle, each of the N post-processing units 18 receives the text code of the OCR 10 recognition result and determines its own final character recognition result using that text code together with the recognition results for the character string in the input image data obtained by zero or more other means. For example, it selects one recognition result from among the OCR 10 recognition result and the recognition results of the "other means" according to a predetermined rule and outputs it as the final character recognition result. What is used as the "other means" (in some cases none is used) and the rule for selecting the recognition result to output are defined for each post-processing unit 18. The "other means" that a post-processing unit 18 uses for character recognition include people and character recognition services external to this system. As the external character recognition service, one is used whose average recognition rate (correct answer rate) is expected to be higher than that of the OCR 10 but which either incurs a usage cost (when the usage cost of the OCR 10 can be regarded as 0) or has a higher usage cost than the OCR 10. The N post-processing units 18 may include one that does not use the OCR 10 recognition result at all.

The N post-processing units 18 are ordered 1, 2, 3, ..., N, and the larger the order number, the higher the dependence on the OCR 10. More precisely, the dependence on the OCR 10 increases monotonically with the order number. In addition, the larger the order number of a post-processing unit 18, the lower the cost (meaning expense that is ultimately converted into a monetary amount) required for its processing. More precisely, the processing cost decreases monotonically as the order number increases.
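Given this ordering, selecting a post-processing unit from a recognition accuracy amounts to finding the section that contains it. The following Python sketch uses the standard `bisect` module; the function name `select_post_process` and the rule that an accuracy equal to a threshold belongs to the higher section are assumptions made for illustration.

```python
import bisect
from typing import Sequence


def select_post_process(accuracy: float, thresholds: Sequence[float]) -> int:
    """Return the order number (1..N) of the post-processing unit to use.

    `thresholds` holds the N-1 section boundaries in ascending order. An
    accuracy equal to a boundary is assumed to fall in the higher section.
    """
    return bisect.bisect_right(thresholds, accuracy) + 1
```

For N = 3 with boundaries 0.6 and 0.8, an accuracy below 0.6 maps to unit 1 (least dependent on the OCR 10) and an accuracy of 0.8 or more maps to unit 3 (most dependent on the OCR 10).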

For example, the last post-processing unit in the order, 18-N (hereinafter, for brevity, post-processing unit 18-K is also called "post-stage process K", where K is an integer from 1 to N), may be one that outputs the text code of the OCR 10 character recognition result as is as the final character recognition result, like "post-stage process 3" in FIG. 8 described later. Post-stage process N in this example uses only the OCR 10 recognition result to determine the final character recognition result and uses no "other means", so its dependence on the OCR 10 is, so to speak, 100%. Also, because it uses no means of character recognition other than the OCR 10, its additional cost for character recognition beyond the OCR 10 is 0.

The first post-processing unit in the order, 18-1 ("post-stage process 1"), may be one that determines the final recognition result solely from the character recognition results of one or more "other means", without using the OCR 10 recognition result. The dependence of post-stage process 1 in this example on the OCR 10 is, so to speak, 0%. As post-stage process 1, one may use a process like "post-stage process 1" in FIG. 8 described later, in which two people each view the input image data and enter the character string they recognize; if the two entered strings match, that string becomes the final recognition result, and if they do not match, the result of character recognition by another person becomes the final recognition result. Because this requires at least two and at most three people, the cost of the processing is high.
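The two-person entry with a third-person fallback can be sketched in Python as follows. The callables standing in for the people, the `bytes` image type, and the returned head count are hypothetical and serve only to make the cost range of two to three people visible.

```python
from typing import Callable, Tuple

# A "reader" stands in for a person who views the image and types a string.
Reader = Callable[[bytes], str]


def post_process_1(image: bytes, first: Reader, second: Reader,
                   third: Reader) -> Tuple[str, int]:
    """Return (final recognition result, number of people consulted)."""
    a = first(image)
    b = second(image)
    if a == b:
        return a, 2          # the two entries agree: adopt them
    return third(image), 3   # disagreement: a third person decides
```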

Alternatively, as in "post-stage process 2" illustrated in FIG. 8 described later, a post-processing unit 18 (called post-stage process A) may be used that adopts the OCR 10 recognition result as the final recognition result when it matches the character string entered by a first person viewing the same input image data, and otherwise adopts the recognition result entered by a second person viewing that same input image data. The dependence on the OCR 10 and the cost of post-stage process A fall between those of the post-stage process N exemplified earlier, which adopts the OCR 10 recognition result as is, and those of the post-stage process 1 described above, which does not use the OCR 10 at all.
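Post-stage process A differs from the two-person scheme only in that the OCR 10 result takes the place of one entry. A minimal Python sketch, with hypothetical callables standing in for the first and second person:

```python
from typing import Callable

Reader = Callable[[bytes], str]  # a person who views the image and types a string


def post_process_a(image: bytes, ocr_text: str,
                   first: Reader, second: Reader) -> str:
    """Adopt the OCR result if the first person agrees, else ask a second person."""
    if first(image) == ocr_text:
        return ocr_text
    return second(image)
```

Replacing `first` with a call to an external character recognition system would give the variant that the text calls post-stage process B.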

As one of the post-processing units 18, a variant of post-stage process A may also be used in which an external, more advanced (and more costly) character recognition system takes the place of the first person (called post-stage process B). This post-processing unit 18 adopts the recognition result as the final recognition result when the OCR 10 recognition result and the external character recognition system's result match, and otherwise adopts the recognition result entered by a person viewing the same input image data. Because post-stage process B uses the OCR 10 recognition result in the same way as post-stage process A, its dependence on the OCR 10 may be regarded as equivalent to that of post-stage process A. In general, the cost of a person is higher than the cost of a computer-based character recognition system, so post-stage process B costs less than post-stage process A. Post-stage process B is therefore placed after post-stage process A in the order (it has a larger number).

Also, for example, a post-processing unit 18 may be used that presents the input image data and the recognition result of the OCR 10 to a person, accepts a simple input to that effect (for example, pressing a "correct" button) when the person judges the recognition result of the OCR 10 to be correct, and accepts the character string entered by the person as the final recognition result when the person judges it to be wrong (this is called post-process C). Post-process C costs more than the example of post-process N described above, which uses no "other means" at all, but less than post-processes A and B described above (because its only "other means" is a single person). In addition, fewer means other than the OCR 10 take part in determining the final recognition result than in post-processes A and B, and in that sense its degree of dependence on the OCR 10 is higher than theirs. The rank of post-process C therefore lies between the example of post-process N, which uses no "other means" at all, and post-process A.

As a variation of post-process C, a post-process D may be used in which several recognition result candidates that the OCR 10 obtained from the same input image data are presented to a person in descending order of recognition accuracy; if the correct answer is among the candidates, the person simply selects it, and if not, the person enters the character string he or she recognized. Because post-process D reduces the person's input effort, more items can be processed per unit time, so its cost per unit time is expected to be lower than that of post-process C. Post-process D is therefore ranked after post-process C.

In the present embodiment, the range of the recognition accuracy computed by the OCR 10 is divided into N sections, and the sections are associated one by one with the N post-processing units 18 in rank order. That is, a post-processing unit 18 with a higher rank number is associated with a section of higher recognition accuracy. To determine the final character recognition result for input image data, the determination system selects and operates, from among these N ranked post-processing units 18, the one associated with the section that contains the recognition accuracy the OCR 10 output for that input image data. Post-processing units 18 that are not selected are not operated.

The threshold DB 14 shown in FIG. 1 holds the (N-1) thresholds that delimit these N sections. The threshold comparison processing unit 12 compares the recognition accuracy that the OCR 10 computed for the text code of the recognition result it outputs against those (N-1) thresholds, and determines which of the N sections the recognition result belongs to. The result of this determination is a number from 1 to N indicating the section, and this number serves as information identifying the post-processing unit 18 that corresponds to the section. The separation processing unit 16 receives the section number output by the threshold comparison processing unit 12 and selectively enables, among the N post-processing units 18, the one corresponding to that section number. The enabled post-processing unit 18 uses the information supplied to it (the recognition result of the OCR 10, the input image data, and so on) to determine and output the final character recognition result for the input image data. The other post-processing units 18 do not operate. The integration processing unit 19 outputs, as the character recognition result of the determination system for the input image data, the output of the post-processing unit 18 corresponding to the section number obtained from the threshold comparison processing unit 12. The integration processing unit 19 discards any output from the other post-processing units 18.
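The routing performed by the threshold comparison processing unit 12 and the separation processing unit 16 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names `select_section` and `route` and the list of callables `post_processes` are assumptions made for the sketch. It treats section K as covering T_{K-1} <= X < T_K, which is how the description defines the sections.

```python
from bisect import bisect_right


def select_section(accuracy, thresholds):
    """Return the section number (1..N) that contains `accuracy`.

    `thresholds` is the ascending list [T_1, ..., T_{N-1}]; section K is
    taken to cover T_{K-1} <= accuracy < T_K.
    """
    # bisect_right counts the thresholds that are <= accuracy.
    return bisect_right(thresholds, accuracy) + 1


def route(image, ocr_text, accuracy, thresholds, post_processes):
    """Run only the post-process of the selected section.

    `post_processes[k - 1]` is a hypothetical callable standing in for
    post-processing unit K; the other callables are never invoked.
    """
    section = select_section(accuracy, thresholds)
    return section, post_processes[section - 1](image, ocr_text)
```

With thresholds `[0.6, 0.9]`, for instance, an accuracy of 0.95 selects section 3, and an accuracy of exactly 0.6 selects section 2 because the lower limit of a section is inclusive.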

The thresholds held in the threshold DB 14 are set by the threshold setting processing device 20. The threshold setting processing device 20 determines the (N-1) thresholds by processing a large amount of learning data. As illustrated in FIG. 2, the learning data consists of pairs, one for each of M character recognitions (M being very large) that the OCR 10 performed in the past, of the recognition accuracy value and correctness information indicating whether the result of that character recognition was correct or wrong. The recognition accuracy value is simply the recorded value that the OCR 10 output. The correctness information is a binary value indicating whether the result of the character recognition is correct or wrong; in the following description it is "1" for a correct result and "0" for a wrong one. As just one example, a person may check whether each recognition result of the OCR 10 is correct and enter the correctness information.

The key points of the threshold setting process performed by the threshold setting processing device 20 are as follows.
(1) A substantial number of pairs of recognition accuracy and correctness information are input as learning data.
(2) For each of post-processes 1 to N, a target recognition rate is set that is required of the text code of the recognition result used by the corresponding post-processing unit 18. The larger the number of the post-process, the higher the target recognition rate is set.
(3) Working in order from post-process N toward post-process 1, a threshold that achieves the target recognition rate of each post-process K (1 ≤ K ≤ N) is calculated.

The threshold setting processing device 20 includes a learning data input unit 22, a cumulative data calculation unit 24, a target recognition rate setting unit 26, and a threshold calculation unit 28.

The learning data input unit 22 receives M items of learning data (pairs of recognition accuracy and correctness information) and sorts them by recognition accuracy.

The cumulative data calculation unit 24 uses the sorted learning data to calculate, for each position in the sorted order, the cumulative number of correct results from the first item of learning data up to the item at that position. Details are described later.

The target recognition rate setting unit 26 sets a target recognition rate for each of post-processes 1 to N. The target recognition rate of post-process K is the recognition rate that post-process K must satisfy. In one example, the user makes this setting. In the example of FIG. 8 described later, the target recognition rates can be set automatically.

Based on the cumulative numbers of correct results calculated by the cumulative data calculation unit 24 and the target recognition rate of each post-process set by the target recognition rate setting unit 26, the threshold calculation unit 28 calculates the (N-1) thresholds that delimit the recognition accuracy sections corresponding to the post-processes.

The threshold calculation method performed by the threshold calculation unit 28 is described with reference to FIGS. 4 to 7.

Dividing the range of possible values of the recognition accuracy X into N sections requires determining (N-1) thresholds. Let these (N-1) thresholds be T_1, T_2, ..., T_{N-1}. No generality is lost by taking the recognition accuracy X to be a real number from 0 to 1, so the following example assumes this range. The number of a section (and of its post-process) is denoted K.

The threshold calculation unit 28 executes the threshold calculation procedure illustrated in FIG. 4. In this procedure, the two end values of the threshold sequence are first set to T_0 = 0 and T_N = 1, and the initial value J_0 of the threshold index is set to 0 (S10). In this example, the index j for which T_K = X_j is written j = J_{N-K}. In other words, the procedure described below can be viewed as an algorithm that obtains the threshold index J_{N-K} in order to obtain the threshold T_K. Here X_j is the recognition accuracy of the j-th item of learning data, where 1 ≤ j ≤ M. As described later, the data is assumed to be sorted so that X_i ≤ X_j whenever i > j.

Next, a target recognition rate Y_K is set for each section K (S12). The target recognition rate Y_K is the recognition rate (that is, the rate of correct character recognition) that the OCR 10 should achieve for the post-process K corresponding to section K. Here, Y_K is set so that it increases as the section number K increases. The reason is as follows.

The determination system illustrated in FIG. 1 selects one of post-processes 1 to N for the input image data, and the selected post-process K outputs the final recognition result of the determination system for the character string contained in that input image data. Because the determination system as a whole must satisfy a certain required recognition rate (that is, achieve at least that recognition rate on average), the selected post-process K must satisfy that recognition rate. Post-process K obtains its recognition result by combining the recognition result of the OCR 10 with recognition results from other means. As described above, the larger the number K, the more post-process K depends on the recognition result of the OCR 10. Therefore, for the recognition rate of post-process K to satisfy the recognition rate required of the determination system, the recognition rate of the OCR 10 must be higher the larger K is. For this reason, the target recognition rate Y_K of the OCR 10 is set so that it increases with K.

The target recognition rates Y_K in S12 are set by the user, for example. An example of calculating Y_K automatically is described later.

Next, the threshold calculation unit 28 sorts the learning data in descending order of recognition accuracy (S14). As described above, each item of learning data is a pair of a recognition accuracy X and correctness information F. In the learning data sorted in descending order of X, the relation X_i ≤ X_j holds whenever i > j; that is, the recognition accuracy X_j in learning data item j is monotonically non-increasing as the index j increases. Sorting the learning data by recognition accuracy in advance speeds up the calculation of the cumulative number of correct results S(i) described later. Calculating S(i) in advance in turn speeds up the threshold calculation, because the learning data falling into a candidate section does not have to be summed again each time.

FIG. 5 schematically shows the relationship, after this sort, among the recognition accuracy X_j in each item of learning data j, the threshold indices J_K, the thresholds T_K, and the sections (post-processes) K. As shown in FIG. 5, section K, to which post-process K is applied, is the section in which the recognition accuracy X is at least T_{K-1} and less than T_K. The target recognition rate set for section K is Y_K. The recognition accuracy X_j of learning data item j decreases as the index j increases. If M is the number of items of learning data, X_M is the minimum recognition accuracy in the set of learning data. By definition, the recognition accuracy X_j for j = J_{N-K} is the threshold T_K.

In the procedure of FIG. 4, as shown in FIG. 5, the thresholds T_K are determined starting from K = N and proceeding toward smaller K. Put another way, the threshold indices J_m are determined in order of increasing m, starting from m = 0.

Returning to the procedure of FIG. 4: after S14, the threshold calculation unit 28 calculates the cumulative number of correct results S(i) for each index i using equation (1) shown in the figure (S16). That is, S(i) is the sum of the correctness information F_j in learning data item j (1 if correct, 0 if wrong) for j from 1 to i.
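Steps S14 and S16 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `sort_and_accumulate`, the representation of the learning data as `(X, F)` tuples, and the extra entry S(0) = 0 at the front of the cumulative list are choices made for the sketch and do not come from the embodiment.

```python
from itertools import accumulate


def sort_and_accumulate(samples):
    """Sort learning data by accuracy (descending) and build S(i).

    `samples` is an iterable of (X, F) pairs, where X is the recognition
    accuracy and F is 1 for a correct result and 0 for a wrong one.
    Returns (xs, cumulative) with xs[j - 1] == X_j and cumulative[i] == S(i).
    """
    ordered = sorted(samples, key=lambda pair: pair[0], reverse=True)  # S14
    xs = [x for x, _ in ordered]
    # S16: prefix sums of F; cumulative[0] = 0 is an added convenience entry.
    cumulative = [0] + list(accumulate(f for _, f in ordered))
    return xs, cumulative
```

With the cumulative list in hand, the number of correct results among items a+1 through b of the sorted data is `cumulative[b] - cumulative[a]`, with no further pass over the data.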

Next, the threshold calculation unit 28 initializes the index K of the threshold T_K to N, the total number of post-processes (S18).

Next, the threshold calculation unit 28 executes the process of determining the threshold T_{K-1} (S20). For section N, which is processed in the first pass of the loop, the upper threshold T_N is fixed at 1, and S20 determines the lower threshold T_{N-1} of section N. A detailed example of S20 is described later with reference to FIG. 7.

When the threshold T_{K-1} has been determined, the threshold calculation unit 28 decrements the index K by 1 (S22) and determines whether K has reached 1 (S24). If K is not 1 (that is, K is 2 or more), not all thresholds have been determined yet, so the procedure returns to S20 and determines the threshold T_{K-1}. If K has reached 1, all the thresholds T_1 to T_{N-1} to be obtained have been determined, and the process of FIG. 4 ends.
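The outer loop of FIG. 4 (S10 and S18 to S24) can be sketched as follows. This is an illustration only, not the embodiment's implementation. The step S20 is passed in as a callable `determine_lower`, because its details are described separately with reference to FIG. 7; its signature, the dictionary keyed by threshold number, and the rule that a disabled section reuses its upper threshold are all assumptions made for the sketch. The `while k > 1` test is equivalent to the flowchart for N of 2 or more.

```python
def compute_thresholds(n_sections, target_rates, determine_lower):
    """Determine T_{N-1}, ..., T_1 in that order.

    `target_rates[k]` is Y_k. `determine_lower(upper_index, target_rate)`
    stands in for S20 and is assumed to return (T_{k-1}, J_{N-k+1}), or
    None when section k has to be disabled.
    """
    thresholds = {0: 0.0, n_sections: 1.0}    # S10: T_0 = 0, T_N = 1
    upper_index = 0                            # S10: J_0 = 0
    disabled = []
    k = n_sections                             # S18
    while k > 1:                               # S24: stop once K reaches 1
        found = determine_lower(upper_index, target_rates[k])  # S20
        if found is None:
            # Assumption: an empty section reuses its upper threshold.
            thresholds[k - 1] = thresholds[k]
            disabled.append(k)
        else:
            thresholds[k - 1], upper_index = found
        k -= 1                                 # S22
    return thresholds, disabled
```

Passing S20 in as a parameter keeps the sketch self-contained and makes it clear that each pass only needs the upper index left by the previous pass and the target rate of the current section.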

Next, a detailed procedure of the process of determining the threshold T_{K-1} (S20) is illustrated with reference to FIG. 7.

When this procedure starts, the thresholds T_N, T_{N-1}, T_{N-2}, ..., T_K have already been determined.

In this procedure, the threshold calculation unit 28 first initializes the index j of the cumulative number of correct results S(j) to M, the total number of items of learning data (S202).

Next, the threshold calculation unit 28 determines whether equation (2) shown in the figure holds (S204). Referring to FIG. 5, S(J_{N-K}) is the sum of the correctness information from F_1, in the item of learning data containing the maximum recognition accuracy X_1, through F_{j(K)}, in the item containing the recognition accuracy X_{j(K)}, where j(K) = J_{N-K} is the threshold index of the already determined threshold T_K. S(j), on the other hand, is the sum from F_1 through F_j for an index j larger than the threshold index J_{N-K}. The difference S(j) - S(J_{N-K}) is the total number of correct results in the span from J_{N-K} to j (items J_{N-K}+1 through j), and dividing it by (j - J_{N-K}) gives the rate at which the OCR 10 was correct in that span.

If this rate is at least the target recognition rate Y_K of section K, the section currently being processed, then the span from J_{N-K} to j satisfies the target recognition rate condition for section K (the result of S204 is Yes). In this case, the threshold calculation unit 28 adopts the recognition accuracy X_j corresponding to the current j as the threshold T_{K-1} that defines the lower limit of the recognition accuracy of section K (S206). It also stores that j as the threshold index J_{N-K+1} corresponding to the threshold T_{K-1}.
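Equation (2) itself appears only in FIG. 7 and is not reproduced in the text. Based on the description above, it presumably has the following form. This is a reconstruction from the prose, shown together with the definition of S(i) described for equation (1); the summation index m is introduced here only for the reconstruction.

```latex
\frac{S(j) - S(J_{N-K})}{j - J_{N-K}} \ge Y_K,
\qquad
S(i) = \sum_{m=1}^{i} F_m
```

The left-hand side is the fraction of correct results among the sorted items J_{N-K}+1 through j, which is the quantity compared with the target recognition rate in S204.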

If the result of S204 is No, the threshold calculation unit 28 decrements the index j by 1 (S208) and determines whether the new index j has reached the threshold index J_{N-K} corresponding to the upper limit of section K (S210). If the result of this determination is No, the threshold calculation unit 28 returns to S204 and evaluates equation (2) for the new index j.

Equation (2) is evaluated repeatedly while the index j is decremented by 1 at a time from the maximum value M (S208). If j reaches J_{N-K} (the result of S210 is Yes), no recognition accuracy X can fall within section K. In this case, the threshold calculation unit 28 disables the post-process K corresponding to section K (S212). That is, post-process K is not used in the determination process that the determination system performs on actual input image data using the thresholds set by this threshold setting process.
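The procedure of FIG. 7 (S202 to S212) can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function name, the argument layout (`xs[j - 1]` holding X_j in descending order and `cumulative[j]` holding S(j) with `cumulative[0] = 0`), and the use of `None` to signal a disabled section are assumptions. The sketch also tests `j > upper_index` before evaluating the ratio, so that it never divides by zero when no learning data remains below the upper limit.

```python
def determine_lower_threshold(xs, cumulative, upper_index, target_rate):
    """Find the lower threshold T_{K-1} of the section being processed.

    `upper_index` is J_{N-K}, the index of the already determined upper
    threshold, and `target_rate` is Y_K. Returns (T_{K-1}, J_{N-K+1}),
    or None when the section has to be disabled.
    """
    j = len(xs)                                    # S202: start from j = M
    while j > upper_index:                         # S210: stop at J_{N-K}
        correct = cumulative[j] - cumulative[upper_index]
        # S204: accuracy rate of items upper_index+1 .. j against Y_K
        if correct / (j - upper_index) >= target_rate:
            return xs[j - 1], j                    # S206: T_{K-1} = X_j
        j -= 1                                     # S208
    return None                                    # S212: section disabled
```

For five sorted items whose correctness information is 1, 1, 1, 0, 0, a target rate of 0.75 with `upper_index = 0` stops at j = 4: three of the first four results are correct, while only three of all five are. Because the search starts from j = M, the first j that passes gives the widest span that meets the target.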

Because the procedure of FIG. 7 evaluates the index j in decreasing order starting from the maximum value M, the section K it determines is the widest section that satisfies the target recognition rate Y_K. The procedure of FIG. 4 determines the boundary of each section K (its lower threshold T_{K-1}) in order from the section with the highest recognition accuracy X (or, from another viewpoint, the highest target recognition rate Y_K), so the widest section satisfying Y_K is secured for each section in order from the highest recognition accuracy. In the example of FIG. 6, the lower threshold T_{N-1} of section N is determined first so that section N, which corresponds to post-process N and has the highest recognition accuracy, is as wide as possible while satisfying the target recognition rate Y_N; next, the threshold T_{N-2} is determined so that section (N-1) is as wide as possible while satisfying Y_{N-1}. This determination is repeated until T_1, the upper threshold of section 1 with the lowest recognition accuracy (that is, the lower threshold of section 2), has been determined.

A post-process K corresponding to a section K with higher recognition accuracy (or a higher target recognition rate Y_K) depends more on the OCR 10, that is, depends less on the one or more "other means" that cost more than the OCR 10. The procedure of FIG. 4 therefore secures the widest section satisfying the target recognition rate Y_K in order from the lowest-cost post-process K. As a result, low-cost post-processes K are more likely to be selected when input image data is processed, and the processing cost of the determination system as a whole is reduced (in theory, the cost is minimized for the given learning data).

The configuration and operation of the threshold setting processing device 20 of the present embodiment have been described above. Next, with reference to FIG. 8, an example of automatically determining the target recognition rate of the section corresponding to each post-processing unit 18 is described, using a specific example of a determination system that has three concrete post-processing units 18 (post-processes 1 to 3).

FIG. 8 shows, of the determination system of this specific example, the post-processing units 18-1, 18-2, and 18-3, the OCR 10 that supplies recognition results to them, and the separation processing unit 16. FIG. 8 also shows manual input devices 30-1, 30-2, and 30-3 (collectively referred to as manual input devices 30 when they need not be distinguished), which supply the determination system with what human operators enter for the input image data.

A manual input device 30 displays the input image data to be recognized on a screen, accepts from a human operator the input of the recognition result for the character string contained in that input image data, and sends the accepted character string to the post-processing units 18-1, 18-2, and 18-3. The manual input device 30 is, for example, application software on each operator's personal computer, connected to the determination system via the Internet.

The post-processing unit 18-3 (post-process 3) corresponds to the section with the highest recognition accuracy (or, from another viewpoint, the highest target recognition rate) among the three post-processing units 18. In this example it receives the recognition result of the OCR 10 and outputs that result unchanged as its own recognition result.

The post-processing unit 18-2 (post-process 2) corresponds to the middle recognition accuracy section among the three post-processing units 18. In addition to the recognition result of the OCR 10, it receives the character strings of the operators' recognition results from the manual input devices 30-1 and 30-2. When the separation processing unit 16 selects the post-processing unit 18-2 according to the recognition accuracy computed by the OCR 10, the unit supplies the input image data to the manual input device 30-1 and acquires the character string (text code) that the operator of the manual input device 30-1 entered as the recognition result for that input image data. It then compares the recognition result obtained from the OCR 10 with the one obtained from the manual input device 30-1, and if the two match, outputs the matching result as the recognition result of the post-processing unit 18-2. If the two do not match, the post-processing unit 18-2 supplies the input image data to the other manual input device 30-2, acquires the character string that the operator of the manual input device 30-2 entered as the recognition result, and outputs that character string as the recognition result of the post-processing unit 18-2. In this case, the operator of the manual input device 30-2 may be a person expected to recognize character strings in input image data more accurately than the operator of the manual input device 30-1 (for example, a person with a good past record).
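The behavior of the post-processing unit 18-2 can be sketched as follows. This is an illustration only: `ask_operator_1` and `ask_operator_2` are hypothetical callables standing in for the round trips to the manual input devices 30-1 and 30-2, and exact string equality is assumed as the matching rule, which the text does not spell out.

```python
def post_process_2(image, ocr_text, ask_operator_1, ask_operator_2):
    """Cross-check the OCR result against one operator, with a fallback.

    The second operator is consulted only when the OCR result and the
    first operator's entry disagree.
    """
    first_entry = ask_operator_1(image)
    if first_entry == ocr_text:
        # OCR and the first operator agree: adopt the shared result.
        return ocr_text
    # Disagreement: the second operator's entry becomes the final result.
    return ask_operator_2(image)
```

In this arrangement the second operator is asked only on a mismatch, so the human effort spent per item grows only when the OCR result and the first entry disagree.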

The post-processing unit 18-1 (post-process 1) corresponds to the section with the lowest recognition accuracy among the three post-processing units 18. It does not use the recognition result of the OCR 10; instead it performs processing similar to that of the post-processing unit 18-2 using the recognition results that the operators enter on the manual input devices 30-1, 30-2, and 30-3. That is, when the separation processing unit 16 selects the post-processing unit 18-1 according to the recognition accuracy computed by the OCR 10, the unit supplies the input image data to the manual input devices 30-1 and 30-3 and acquires the character strings that their respective operators entered as the recognition results. It then compares the recognition results obtained from the two manual input devices 30-1 and 30-3, and if they match, outputs the matching result as the recognition result of the post-processing unit 18-1. If they do not match, the post-processing unit 18-1 supplies the input image data to the other manual input device 30-2, acquires the character string that the operator of the manual input device 30-2 entered as the recognition result, and outputs that character string as the recognition result of the post-processing unit 18-1. In this case, the operator of the manual input device 30-2 may be a person expected to recognize character strings in input image data more accurately than the operators of the manual input devices 30-1 and 30-3.

The three recognition accuracy sections 1, 2, and 3 associated with these three post-processing units 18-1, 18-2, and 18-3 are delimited by two thresholds T_1 and T_2 (T_1 < T_2). The separation processing unit 16 selects the post-processing unit 18-1 if the recognition accuracy X output by the OCR 10 is less than T_1, the post-processing unit 18-2 if X is at least T_1 and less than T_2, and the post-processing unit 18-3 if X is at least T_2.

The threshold setting processing device 20 calculates the target recognition rates Y₁, Y₂, and Y₃ of the OCR 10 for section 1 (X < T₁), section 2 (T₁ ≤ X < T₂), and section 3 (X ≥ T₂), which correspond to the three post-processing units 18-1, 18-2, and 18-3, as follows.

First, let R be the target recognition rate of the determination system (that is, the target value of the correct answer rate of the final output of the determination system). Because the post-processing unit 18-1 does not use the recognition result of the OCR 10 at all, the recognition rate of the OCR 10 may be 0 in this section. The threshold setting processing device 20 therefore sets the target recognition rate Y₁ = 0. The post-processing unit 18-1 itself satisfies the target recognition rate R of the determination system through appropriate selection of the operators of the manual input devices 30-1, 30-2, and 30-3.

On the other hand, the post-processing unit 18-3 uses the recognition result of the OCR 10 as its own output without modification, so its target recognition rate is Y₃ = R.

For the remaining post-processing unit 18-2, the target recognition rate Y₂ of the OCR 10 is calculated as follows.

First, let λ be the error rate when a person performs data entry (that is, recognizes the character string contained in the input image data and enters it into a manual input device 30). In other words, the correct answer rate (recognition rate) of human data entry is (1 − λ). Likewise, let ω be the error rate of the recognition result of the OCR 10, so that the correct answer rate (recognition rate) of the OCR 10 is (1 − ω). The error rate of the process in which the post-processing unit 18-2 matches the recognition result of the OCR 10 against that of the person is then approximately λω. Since the post-processing unit 18-2 must satisfy the target recognition rate R of the determination system,
λω = 1 − R
holds. The target recognition rate Y₂ of the OCR 10 when the post-processing unit 18-2 is selected can be regarded as equal to (1 − ω), so Y₂ is finally calculated from the known human error rate λ and the target recognition rate R of the determination system as follows.
Y₂ = 1 − ω = 1 − (1 − R)/λ

The threshold setting processing device 20 may calculate the target recognition rate Y₂ according to this equation.
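As a worked illustration of the three target rates (Y₁ = 0, Y₂ = 1 − (1 − R)/λ, Y₃ = R), the following sketch computes them from R and λ. The function name `target_rates` and the clamping of Y₂ to 0 when λ is smaller than 1 − R are assumptions added here for illustration; the source does not discuss that case.

```python
from typing import Tuple


def target_rates(system_target: float, human_error: float) -> Tuple[float, float, float]:
    """Return (Y1, Y2, Y3) for system target rate R and human error rate lambda."""
    if not 0.0 < human_error <= 1.0:
        raise ValueError("human_error must be in (0, 1]")
    y1 = 0.0                      # OCR result unused in the lowest section
    y2 = 1.0 - (1.0 - system_target) / human_error
    y2 = max(0.0, y2)             # assumption: a negative value means no OCR requirement
    y3 = system_target            # OCR result output as-is in the highest section
    return y1, y2, y3
```

For instance, with R = 0.999 and λ = 0.01 the formula gives (1 − R)/λ = 0.1, so Y₂ is approximately 0.9: the OCR needs only about 90% accuracy in section 2 for the matched output to reach 99.9%.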

The embodiment described above is merely one example of how the present invention may be embodied.

In the above example, the cumulative number of correct answers S(j) is used to determine the thresholds, but the cumulative number of errors may be used instead. The number of errors is the number of learning data items with F_j = 0. Similarly, instead of the cumulative number of correct answers S(j) (or the number of errors), the cumulative correct answer rate (recognition rate), obtained by dividing the cumulative number of correct answers by the number of samples, may be used with the same processing as for the cumulative number of correct answers. Whether each section K includes its upper or lower threshold T_K or T_(K-1) is not limited to what is exemplified above and may be decided as appropriate.

In the above embodiment, learning data consisting of pairs of recognition accuracy X and correct/incorrect information F is input to the threshold setting processing device 20, but this is merely one example. Alternatively, cumulative result information, obtained by accumulating the correct/incorrect information F in ascending or descending order of the corresponding recognition accuracy X up to the recognition accuracy X_j of interest, may be calculated in advance, and pairs of that information and the recognition accuracy X_j may be input to the threshold setting processing device 20 as learning data. The cumulative result information may be any of the cumulative number of correct answers, cumulative number of errors, cumulative correct answer rate, cumulative error rate, and the like described above.
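The precomputed cumulative form of the learning data, and one way a threshold for the highest-accuracy section might be read off from it, can be sketched as below. This is an illustrative reading, not the embodiment's exact procedure: the function names are hypothetical, accumulation is done in descending order of X, and choosing the lowest cut that still meets the target rate is an assumption.

```python
from typing import List, Optional, Sequence, Tuple


def cumulative_correct(samples: Sequence[Tuple[float, int]]) -> List[Tuple[float, int]]:
    """Sort (X, F) pairs by X descending and return (X_j, S(j)) pairs,
    where S(j) counts correct results (F = 1) among the top j samples."""
    ordered = sorted(samples, key=lambda s: s[0], reverse=True)
    out: List[Tuple[float, int]] = []
    running = 0
    for x, f in ordered:
        running += f
        out.append((x, running))
    return out


def top_threshold(cumulative: Sequence[Tuple[float, int]],
                  target_rate: float) -> Optional[float]:
    """Lowest X_j such that the samples with X >= X_j reach target_rate.

    Returns None when no cut point satisfies the target.
    """
    best = None
    for j, (x, s) in enumerate(cumulative, start=1):
        # Only cut between distinct accuracy values so tied samples stay together.
        is_boundary = j == len(cumulative) or cumulative[j][0] != x
        if is_boundary and s / j >= target_rate:
            best = x
    return best
```

Using cumulative error counts or cumulative rates instead would change only the quantity stored in each pair and the comparison in `top_threshold`.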

In the above embodiment, the determination system recognizes character strings in input image data, but the technique of the embodiment is applicable beyond character recognition to determination systems in general that determine the content of input data and output the determination result. That is, a determination system to which the present invention applies may include a primary determination means (corresponding to the OCR 10) that determines the content of input data, and a plurality of post-processing units that each obtain a determination result for the content of the data by combining the determination result of the primary determination means with the determination results of zero or more other determination means (for example, a human, or a determination means that is more accurate but more costly than the primary determination means). This determination system has a means for obtaining the determination accuracy of the determination result of the primary determination means (corresponding to the recognition accuracy in the case of character recognition) and decides which of the plurality of post-processing units to use according to this determination accuracy. That is, each post-processing unit is associated with a different section of determination accuracy, and the post-processing unit corresponding to the section to which the determination accuracy of the primary determination means belongs operates selectively.

The determination system and the threshold setting processing device 20 exemplified above can, in one example, be configured as hardware logic circuits. As another example, the determination system and the threshold setting processing device 20 may be realized by causing a built-in computer to execute a program representing the functions of the functional modules in the system or device. Here, the computer has, as hardware, a circuit configuration in which, for example, a processor such as a CPU, memory (primary storage) such as random access memory (RAM) and read-only memory (ROM), an HDD controller that controls a hard disk drive (HDD), various I/O (input/output) interfaces, and a network interface that controls connection to a network such as a local area network are connected via, for example, a bus. A disk drive for reading from and/or writing to portable disc recording media such as CDs and DVDs, a memory reader/writer for reading from and/or writing to portable nonvolatile recording media of various standards such as flash memory, and the like may also be connected to the bus via, for example, an I/O interface. A program describing the processing of each functional module exemplified above is saved to a fixed storage device such as a hard disk drive via a recording medium such as a CD or DVD, or via a communication means such as a network, and is installed on the computer. The functional modules exemplified above are realized when the program stored in the fixed storage device is read into the RAM and executed by a processor such as a CPU. The determination system and the threshold setting processing device 20 may also be configured as a combination of software and hardware.

10 OCR, 12 threshold comparison processing unit, 14 threshold DB, 16 separation processing unit, 18 post-processing unit, 19 integration processing unit, 20 threshold setting processing device, 22 learning data input unit, 24 cumulative data calculation unit, 26 target recognition rate setting unit, 28 threshold calculation unit, 30 manual input device.

Claims

A determination means for determining the input;
Calculation means for calculating the determination accuracy of the determination means for the input;
A plurality of post-processing units that can generate an output for the input by performing post-processing on the determination result of the determination unit, and have different degrees of dependence on the determination result of the determination unit in the generation of the output. A plurality of subsequent processing means respectively associated with each section obtained by dividing the range that the determination accuracy can take with one or more threshold values;
Control means for controlling the subsequent processing means corresponding to the section to which the determination accuracy calculated by the calculation means belongs to generate an output for the input;
An information processing apparatus that determines a threshold value that divides the section for the determination accuracy, for a determination system including:
For each past input to the determination means, an acquisition means for acquiring a set of the determination accuracy for the input and correct / incorrect information indicating whether the determination result of the determination means for the input is correct or incorrect;
Using the set acquired by the acquisition means, the correct answer rate of the determination means obtained from the set of sets belonging to the section in order from the section with the highest determination accuracy is the target correct answer of the determination means corresponding to the section. Determining means for determining the threshold value defining the section so as to satisfy the rate;
An information processing apparatus including:

The target recognition rate of the determination means corresponding to each section is a higher value as the section has a higher determination accuracy.
The information processing apparatus according to claim 1, wherein the determination unit determines the threshold value that defines each section in order from the section having the highest target recognition rate.

The later stage processing means corresponding to the section with higher determination accuracy uses a method with a lower cost as a method for generating the output using the determination result,
The information processing apparatus according to claim 1, wherein the determination unit determines the threshold value that defines each section in order from the section having the lowest cost.

The post-processing unit corresponding to the section with the highest determination accuracy is the output of the determination result of the determination unit as it is,
The target accuracy rate set for the determination system is used as the target accuracy rate for the post-processing unit corresponding to the section with the highest determination accuracy. Information processing device.

The plurality of post-stage processing means includes second type post-stage processing means for generating an output for the input without using the determination result of the determination means, and the second type of post-stage processing means includes the section of the section. The information processing apparatus according to claim 1, wherein the information processing apparatus is associated with a section having the lowest determination accuracy.

The information processing apparatus according to claim 5, wherein the target accuracy rate corresponding to the second-stage post-processing unit is zero.

The plurality of post-processing means include a first post-processing means that uses the determination result of the determination means as the output without modification, a second post-processing means that generates the output based on a determination result by a person for the input without using the determination result of the determination means, and a third post-processing means that generates the output by matching the determination result of the determination means against the determination result by the person for the input,
The information processing apparatus according to any one of claims 1 to 3, wherein the target correct answer rate set for the determination system is used as the target correct answer rate for the first post-processing means, 0 is used as the target correct answer rate for the second post-processing means, and the target correct answer rate for the third post-processing means is obtained from the correct answer rate of the person and the correct answer rate of the determination means.

The acquisition means, instead of a set of determination accuracy and correct / incorrect information, includes determination accuracy, and information on the accumulation result of the correct / incorrect information corresponding to each determination accuracy within a range from the maximum value of the determination accuracy to the determination accuracy. Get a pair of,
8. The information processing according to claim 1, wherein the determining unit obtains a correct answer rate of a section in which the threshold is to be determined using information on the cumulative result corresponding to each determination accuracy. apparatus.

A determination means for determining the input;
Calculation means for calculating the determination accuracy of the determination means for the input;
A plurality of post-processing units that can generate an output for the input by performing post-processing on the determination result of the determination unit, and have different degrees of dependence on the determination result of the determination unit in the generation of the output. A plurality of subsequent processing means respectively associated with each section obtained by dividing the range that the determination accuracy can take with one or more threshold values;
Control means for controlling the subsequent processing means corresponding to the section to which the determination accuracy calculated by the calculation means belongs to generate an output for the input;
For a determination system including:
For each past input to the determination means, an acquisition means for acquiring a set of the determination accuracy for the input and correct / incorrect information indicating whether the determination result of the determination means for the input is correct or incorrect;
Using the set acquired by the acquisition means, the correct answer rate of the determination means obtained from the set of sets belonging to the section in order from the section with the highest determination accuracy is the target correct answer of the determination means corresponding to the section. Determining means for determining the threshold value defining the section so as to satisfy the rate;
Program to function as.

A determination means for determining the input;
Calculation means for calculating the determination accuracy of the determination means for the input;
A plurality of post-processing units that can generate an output for the input by performing post-processing on the determination result of the determination unit, and have different degrees of dependence on the determination result of the determination unit in the generation of the output. A plurality of subsequent processing means respectively associated with each section obtained by dividing the range that the determination accuracy can take with one or more threshold values;
Control means for controlling the subsequent processing means corresponding to the section to which the determination accuracy calculated by the calculation means belongs to generate an output for the input;
For each past input to the determination means, an acquisition means for acquiring a set of the determination accuracy for the input and correct / incorrect information indicating whether the determination result of the determination means for the input is correct or incorrect;
Using the set acquired by the acquisition means, the correct answer rate of the determination means obtained from the set of sets belonging to the section in order from the section with the highest determination accuracy is the target correct answer of the determination means corresponding to the section. Determining means for determining the threshold value defining the section so as to satisfy the rate;
An information processing apparatus including:

Computer
A determination means for determining the input;
Calculation means for calculating the determination accuracy of the determination means for the input;
A plurality of post-processing units that can generate an output for the input by performing post-processing on the determination result of the determination unit, and have different degrees of dependence on the determination result of the determination unit in the generation of the output. A plurality of subsequent processing means associated with each section obtained by dividing the range that the determination accuracy can take by one or more threshold values,
Control means for controlling the post-processing means corresponding to the section to which the determination accuracy calculated by the calculation means belongs to generate an output for the input;
For each past input to the determination means, an acquisition means for acquiring a set of the determination accuracy for the input and correct / incorrect information indicating whether the determination result of the determination means for the input is correct or incorrect;
Using the set acquired by the acquisition means, the correct answer rate of the determination means obtained from the set of sets belonging to the section in order from the section with the highest determination accuracy is the target correct answer of the determination means corresponding to the section. Determining means for determining the threshold value defining the section so as to satisfy the rate;
Program to function as.