JP5636766B2 - Image processing apparatus and image processing program

Info

Publication number: JP5636766B2
Application number: JP2010146014A
Authority: JP
Inventors: 木村　俊一; 田中　瑛一; 関野　雅則; 久保田　聡; 越　裕
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2010-06-28
Filing date: 2010-06-28
Publication date: 2014-12-10
Anticipated expiration: 2030-06-28
Also published as: JP2012008909A

Description

The present invention relates to an image processing apparatus and an image processing program.

There is a technique for cutting out character images from an image.
As a related technique, Patent Document 1, for example, aims to achieve accurate character recognition even when the size, shape, pitch, and the like of characters differ from document to document. It discloses that a confirmed character detection unit detects, as confirmed characters, the characters in portions judged to be linguistically correct, and a recognition result output unit outputs them as recognition results, while a character rectangle information detection unit detects the rectangle information of the confirmed characters and optimizes a rectangle evaluation function. A rectangle division and integration unit then obtains new basic rectangles by dividing and merging the basic rectangles of the portions not yet confirmed, based on the optimized rectangle evaluation function, and the processing from the candidate character selection unit onward is performed again. In addition, a writer estimation unit uses information obtained from the confirmed characters to optimize the dictionary used by the candidate character selection unit.

Patent Document 2, for example, aims to cut out characters from a document image quickly and accurately, including in documents where alphanumeric characters and symbols are mixed with kanji, hiragana, and the like. It discloses estimating cut-out candidates from the shape information of each circumscribed rectangle and performing character recognition on the estimated candidates. A candidate that the recognition result shows can be confirmed is fixed as a cut-out result. For a candidate that cannot be confirmed, a plurality of cut-out candidates formed from combinations of the circumscribed rectangles are estimated, a recognition evaluation value is obtained for each individual rectangle of each candidate, and the candidate with the best combined evaluation value, computed from those per-rectangle values, is fixed as the cut-out result. It also discloses performing recognition that targets only alphanumeric characters and symbols, so that these are confirmed first, and then cutting out the remaining characters.

Patent Document 3, for example, addresses the problem of cutting out characters and recognizing character strings with high accuracy even when character identification or character string matching cannot settle how the characters should be cut out. In a multiple-hypothesis-testing character cut-out process, several candidate cut-outs judged more likely to be correct are first selected from all possible cut-outs by a conventional method. Then, by the method of that invention, an evaluation value (outline penalty) for assessing the validity of each cut-out hypothesis is obtained from the size of each character pattern and its positional relationship to the preceding and following patterns. This assessment is made with a linear discriminant function obtained by learning from samples collected and registered in advance. It is disclosed that determining the correct cut-out in this way makes information on size and positional relationships easy to handle.

Patent Document 1: JP 05-174187 A
Patent Document 2: JP 08-161432 A
Patent Document 3: JP 09-185681 A

An object of the present invention is to provide an image processing apparatus and an image processing program that, when determining the positions at which to cut out character images present in an image, prevent a cut-out position from being determined solely by an anomalous evaluation value when the evaluation value of a candidate cut-out position takes such an anomalous value.

The gist of the present invention for achieving this object lies in the inventions set out in the following claims.
The invention of claim 1 is an image processing apparatus comprising: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.
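As a rough illustration of the first and second calculation means of claim 1, the sketch below computes a weighted linear sum of feature quantities and passes it through a saturating function. The logistic sigmoid is an assumption: it is only one function that meets the claimed conditions (monotonic, converging to fixed values at the extremes, slope shrinking away from the center), and the claim does not name it. The function names, feature values, and weights are likewise hypothetical.

```python
import math


def weighted_linear_sum(features, weights):
    """First calculation means (sketch): weighted linear sum of feature quantities."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sum(w * x for w, x in zip(weights, features))


def sigmoid(u):
    """Assumed nonlinear monotonic function: tends to 0 and 1 at the extremes."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)  # avoids overflow for large negative u
    return e / (1.0 + e)


def evaluation_value(features, weights):
    """Second calculation means (sketch): squash the weighted sum into (0, 1)."""
    return sigmoid(weighted_linear_sum(features, weights))
```

Because the output is confined to the interval from 0 to 1, even an extreme weighted sum can move the evaluation value only by a bounded amount.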

The invention of claim 2 is an image processing apparatus comprising: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; and cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; the apparatus having a plurality of pairs of the first calculation means and the second calculation means, and further comprising: third calculation means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculation means; and fourth calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the third calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; wherein the cut-out position determining means determines, based on the evaluation value calculated by the fourth calculation means, the position at which to cut out a character image present in the image; the apparatus further comprising: receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means or the third calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.
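The two-stage arrangement of claim 2 can be sketched as follows: several pairs of first and second calculation means each produce an evaluation value, and the third and fourth calculation means apply a second weighted sum and saturating function to those values. This is a minimal sketch under assumptions: the sigmoid, the function names, and the choice to feed every pair the same feature vector are illustrative and not taken from the claims.

```python
import math


def sigmoid(u):
    """Assumed saturating monotonic function (logistic sigmoid)."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def squashed_weighted_sum(inputs, weights):
    """One weighted linear sum followed by the saturating function."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))


def two_stage_evaluation(features, first_stage_weights, third_stage_weights):
    """first_stage_weights: one weight vector per (first, second) pair.
    third_stage_weights: one weight per pair, used by the third calculation means.
    """
    # Each pair yields one intermediate evaluation value in (0, 1).
    intermediate = [squashed_weighted_sum(features, w) for w in first_stage_weights]
    # Third and fourth calculation means: combine the intermediate values.
    return squashed_weighted_sum(intermediate, third_stage_weights)
```

Structurally this resembles a small feed-forward network with one hidden layer, which is offered here only as an analogy.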

The invention of claim 3 is an image processing apparatus comprising: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; and cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; the apparatus having a plurality of pairs of the first calculation means and the second calculation means, and further comprising fifth calculation means for calculating the sum of the evaluation values calculated by the plurality of second calculation means; wherein the cut-out position determining means determines, based on the sum of the evaluation values calculated by the fifth calculation means, the position at which to cut out a character image present in the image; the apparatus further comprising: receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.
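The summation performed by the fifth calculation means can be illustrated with a short sketch that also shows why saturation helps with anomalous values. It assumes a logistic sigmoid and treats each pair as a hypothetical `(features, weights)` tuple; `raw_sum` is included only for contrast and is not part of the claimed arrangement.

```python
import math


def sigmoid(u):
    """Assumed saturating monotonic function (logistic sigmoid)."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def dot(weights, features):
    """Weighted linear sum of one pair."""
    return sum(w * x for w, x in zip(weights, features))


def summed_evaluation(pairs):
    """Fifth calculation means (sketch): sum of the saturated evaluation values."""
    return sum(sigmoid(dot(weights, features)) for features, weights in pairs)


def raw_sum(pairs):
    """For contrast only: the same sum without the saturating function."""
    return sum(dot(weights, features) for features, weights in pairs)
```

With `pairs = [([1.0], [1.0]), ([0.0], [1.0]), ([1.0], [1000.0])]`, the third pair contributes 1000 to `raw_sum` but at most 1 to `summed_evaluation`, so a single anomalous value cannot decide the result on its own.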

The invention of claim 4 is an image processing apparatus comprising: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; and cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; the apparatus having a plurality of pairs of the first calculation means and the second calculation means, and further comprising: third calculation means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculation means; and fourth calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the third calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; wherein the cut-out position determining means determines, based on the evaluation value calculated by the fourth calculation means, the position at which to cut out a character image present in the image; the apparatus having a plurality of sets, each made up of the plurality of pairs of the first calculation means and the second calculation means together with the third calculation means and the fourth calculation means, and further comprising sixth calculation means for calculating the sum of the evaluation values calculated by the plurality of fourth calculation means; wherein the cut-out position determining means determines, based on the sum of the evaluation values calculated by the sixth calculation means, the position at which to cut out a character image present in the image; the apparatus further comprising: receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means or the third calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.

The invention of claim 5 is an image processing program that causes a computer to function as: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.
The invention of claim 6 is an image processing program that causes a computer to function as: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; with a plurality of pairs of the first calculation means and the second calculation means, third calculation means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculation means; and fourth calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the third calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; wherein the cut-out position determining means determines, based on the evaluation value calculated by the fourth calculation means, the position at which to cut out a character image present in the image; the program further causing the computer to function as: receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means or the third calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.
The invention of claim 7 is an image processing program that causes a computer to function as: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; and, with a plurality of pairs of the first calculation means and the second calculation means, fifth calculation means for calculating the sum of the evaluation values calculated by the plurality of second calculation means; wherein the cut-out position determining means determines, based on the sum of the evaluation values calculated by the fifth calculation means, the position at which to cut out a character image present in the image; the program further causing the computer to function as: receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.
The invention of claim 8 is an image processing program that causes a computer to function as: first calculation means for calculating a weighted linear sum of a plurality of feature quantities relating to a candidate position for cutting out one character image present in an image; second calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the first calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; cut-out position determining means for determining, based on the evaluation value calculated by the second calculation means, the position at which to cut out a character image present in the image; with a plurality of pairs of the first calculation means and the second calculation means, third calculation means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculation means; and fourth calculation means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, with the calculation result of the third calculation means as an argument, a nonlinear monotonic function, or a function approximating it, that converges to a predetermined value when the argument takes an extreme value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases; wherein the cut-out position determining means determines, based on the evaluation value calculated by the fourth calculation means, the position at which to cut out a character image present in the image; and, with a plurality of sets each made up of the plurality of pairs of the first calculation means and the second calculation means together with the third calculation means and the fourth calculation means, sixth calculation means for calculating the sum of the evaluation values calculated by the plurality of fourth calculation means; wherein the cut-out position determining means determines, based on the sum of the evaluation values calculated by the sixth calculation means, the position at which to cut out a character image present in the image; the program further causing the computer to function as: receiving means for receiving teacher data on the cut-out positions of character images; count calculating means for comparing the cut-out positions determined by the cut-out position determining means with the teacher data received by the receiving means and calculating the number of correct or incorrect cut-out positions; and weight changing means for changing, based on the number of correct or incorrect cut-out positions calculated by the count calculating means, the weights used by the first calculation means or the third calculation means at the cut-out position for one character; wherein the weight changing means determines the next weights from the amount of change from a value based on the number of correct or incorrect positions under the current weights to a value based on the number of correct or incorrect positions under the changed weights.

According to the image processing apparatus of claim 1, when determining the position at which to cut out a character image existing in an image, if the evaluation value of a cut-out position candidate takes an anomalous value, the cut-out position can be prevented from being determined by that anomalous evaluation value alone. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.

According to the image processing apparatus of claim 2, the position at which to cut out a character image existing in an image can be determined more accurately than when this configuration is not provided. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.

According to the image processing apparatus of claim 3, the position at which to cut out a character image existing in an image can be determined more accurately than when this configuration is not provided. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.

According to the image processing apparatus of claim 4, the position at which to cut out a character image existing in an image can be determined more accurately than when this configuration is not provided. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.

According to the image processing program of claim 5, when determining the position at which to cut out a character image existing in an image, if the evaluation value of a cut-out position candidate takes an anomalous value, the cut-out position can be prevented from being determined by that anomalous evaluation value alone. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.
According to the image processing program of claim 6, the position at which to cut out a character image existing in an image can be determined more accurately than when this configuration is not provided. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.
According to the image processing program of claim 7, the position at which to cut out a character image existing in an image can be determined more accurately than when this configuration is not provided. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.
According to the image processing program of claim 8, the position at which to cut out a character image existing in an image can be determined more accurately than when this configuration is not provided. In addition, when determining that position, the weights used to calculate the evaluation value can be determined.

FIG. 1 is a conceptual module configuration diagram of a configuration example of the first embodiment.
FIG. 2 is a conceptual module configuration diagram of a configuration example inside the arc evaluation value determination module of the first embodiment.
FIG. 3 is an explanatory diagram showing an example of the data structure of a teacher data table.
FIG. 4 is a conceptual module configuration diagram of a configuration example inside the arc evaluation value determination module of the second embodiment.
FIG. 5 is a conceptual module configuration diagram of a configuration example inside the arc evaluation value determination module of the third embodiment.
FIG. 6 is a conceptual module configuration diagram of a configuration example inside the arc evaluation value calculation module of the third embodiment.
FIG. 7 is an explanatory diagram showing an example of the data structure of a teacher data table.
FIG. 8 is an explanatory diagram showing an example of the relationship among the arc candidate determination module, the arc evaluation value determination module, and the character cutout position determination module.
FIG. 9 is a conceptual module configuration diagram of a configuration example of the fourth embodiment.
FIG. 10 is a block diagram showing an example of the hardware configuration of a computer that implements the first to fourth embodiments.
FIG. 11 is an explanatory diagram showing an example of a character string image.
FIG. 12 is an explanatory diagram showing an example of character boundary candidates.
FIG. 13 is an explanatory diagram showing an example of circumscribed rectangles.
FIG. 14 is an explanatory diagram showing an example of character cutout results.
FIG. 15 is an explanatory diagram showing an example of a graph representation of character cutout positions.
FIG. 16 is an explanatory diagram showing an example of pattern 2 in the graph representation.
FIG. 17 is a conceptual module configuration diagram of a configuration example of a general image processing apparatus that performs character cutout and character recognition.
FIG. 18 is an explanatory diagram showing an example in which arc evaluation values are added to the graph representation of character cutout positions.
FIG. 19 is an explanatory diagram showing an example of a case that can be separated by a straight line in the feature space.
FIG. 20 is an explanatory diagram showing an example of a case that cannot be separated by a straight line in the feature space.

The present embodiment is for cutting out character images existing in an image when processing that targets character images, such as character recognition, is performed.
Before the present embodiment is described, its premise, or an image processing apparatus that uses it, will be described. This description is intended to make the present embodiment easier to understand.

Consider, for example, a character string image such as the one shown in the example of FIG. 11. First, the character string image is divided into character segments. A character segment is a character portion that may be a character itself or a part of a character. Here, a horizontally written character string image such as the one in FIG. 11 is taken as an example. A horizontally written image is divided into character segments by vertical (or nearly vertical) lines. For example, dividing the character string image at the vertical lines shown in FIG. 12 (break candidate 1210 and break candidate 1220) yields three character segments: "イ", "ヒ", and "学". The vertical lines shown in the example of FIG. 12 are called break candidates. Break candidate 1210 separates "イ" from "ヒ", and break candidate 1220 separates "ヒ" from "学".

Next, as shown in the example of FIG. 13, the circumscribed rectangle of each character segment (circumscribed rectangle 1310, circumscribed rectangle 1320, circumscribed rectangle 1330) is extracted.
The following description uses the technical content of Patent Document 3 as an example. Note that the terms used below may differ from those used in Patent Document 3.
Character images are determined by merging the character segments described above. In some cases several character segments are merged to form one character image; in others a single character segment is one character. Because determining the character images is equivalent to determining the character cutout positions, it is also referred to below as determining the character cutout positions.
There are multiple patterns for merging character segments. The final character cutout positions are determined by selecting, from among these patterns, the one rated highest as character images.
For the example of FIG. 13, all the character cutout patterns are as shown in the example of FIG. 14. FIG. 14(a) shows pattern 1, with three character images (circumscribed rectangles 1310, 1320, and 1330 individually). FIG. 14(b) shows pattern 2, with two character images (circumscribed rectangles 1310 and 1320 merged, and 1330). FIG. 14(c) shows pattern 3, with one character image (circumscribed rectangles 1310, 1320, and 1330 merged). FIG. 14(d) shows pattern 4, with two character images (circumscribed rectangle 1310, and circumscribed rectangles 1320 and 1330 merged).

The multiple cutout patterns shown in the example of FIG. 14 can be expressed as a graph representation of character cutout positions. In the example of FIG. 15, the graph consists of four nodes, namely a start node 1500, an end node 1590, an intermediate node 1510 (node 1), and an intermediate node 1520 (node 2), and of arcs connecting the nodes (a connection line between nodes is called an arc). The start point corresponds to the left end of the character string image, and the end point to its right end. The intermediate nodes 1510 (node 1) and 1520 (node 2) each indicate a character break candidate position (that is, break candidate 1210 and break candidate 1220 shown in the example of FIG. 12). Intermediate node 1510 (node 1) corresponds to break candidate 1210, and intermediate node 1520 (node 2) corresponds to break candidate 1220.

A route from the start point through nodes to the end point is hereinafter called a "path". A path consists of one or more arcs, and there are usually multiple paths. The character cutout patterns shown in the example of FIG. 14 correspond to these paths. For example, pattern 2 shown in the example of FIG. 14(b) corresponds to the path drawn in thick lines in FIG. 16 (character cutout pattern 1504 and character cutout pattern 1522).
Each arc corresponds to one character image candidate. For example, the arc connecting the start node 1500 and intermediate node 1520 (node 2) corresponds to the character image "化" (character cutout pattern 1504). An evaluation value can be determined for the character corresponding to an arc. This is called the "arc evaluation value".
The arc evaluation value is calculated from character shape information, the recognition accuracy obtained in character recognition, and the like. The details are described later.
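The correspondence between paths and cutout patterns described above can be sketched in a few lines of Python. This is only an illustrative sketch: the integer node numbering (0 for the start node, 1 and 2 for the intermediate nodes, 3 for the end node) and the function name `enumerate_paths` are assumptions introduced here, not part of the patent text.

```python
from itertools import combinations

def enumerate_paths(num_break_candidates):
    """Return every path from the start node to the end node.

    Hypothetical numbering: node 0 is the start node, nodes 1..n are the
    break candidates in left-to-right order, node n + 1 is the end node.
    A path is a list of arcs (left_node, right_node); each arc stands for
    one candidate character image.
    """
    start, end = 0, num_break_candidates + 1
    inner = range(1, end)
    paths = []
    # Choosing which break candidates are actually used fixes one path.
    for k in range(num_break_candidates + 1):
        for chosen in combinations(inner, k):
            nodes = [start, *chosen, end]
            paths.append(list(zip(nodes, nodes[1:])))
    return paths

# Two break candidates, as in the example with three character segments.
paths = enumerate_paths(2)
for path in paths:
    print(path)
```

With two break candidates the sketch yields four paths, matching the four cutout patterns of FIG. 14. Under this numbering, the path `[(0, 2), (2, 3)]` plays the role of pattern 2.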

Here, a conceptual module configuration of a general image processing apparatus that performs character cutout and character recognition is described with reference to FIG. 17.
This image processing apparatus has an image reception module 110, a character string extraction module 120, a character boundary candidate extraction module 130, an arc feature quantity extraction module 140, a linear weighted addition module 1710, a character cutout module 160, and a character recognition module 170.

The image reception module 110 is connected to the character string extraction module 120. It receives a target image and passes the image to the character string extraction module 120. Receiving an image includes, for example, reading an image with a scanner, camera, or the like; receiving an image from an external device over a communication line by fax or the like; and reading out an image stored on a hard disk or the like (including a disk built into the computer and a disk connected via a network). The image may be a binary image or a multi-valued image (including a color image). One image or a plurality of images may be received. As long as the image contains characters, its content may be a business document, an advertising pamphlet, or the like.

The character string extraction module 120 is connected to the image reception module 110 and the character boundary candidate extraction module 130. It receives an image from the image reception module 110, extracts a character string image from the image, and passes the character string image to the character boundary candidate extraction module 130. A conventionally known technique may be used to extract the character string image. For example, a histogram of the number of black pixels in the horizontal or vertical direction is created, and a portion of the histogram that has a predetermined width and is separated from the neighboring portion by at least a predetermined distance is extracted as a character string image.
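The histogram-based extraction mentioned above can be illustrated with a small projection-profile sketch. It assumes horizontally written text and a binary image given as a list of rows of 0 (white) and 1 (black). The function name `extract_text_rows`, the parameters `min_height` and `min_gap`, and the choice to merge runs that lie closer than `min_gap` are all assumptions made for illustration; the text only states that a known technique of this kind may be used.

```python
def extract_text_rows(image, min_height=1, min_gap=1):
    """Find row ranges (top, bottom), bottom exclusive, that look like text lines.

    image: list of rows, each a list of 0 (white) / 1 (black) pixels.
    min_height: assumed stand-in for the "predetermined width" of a histogram run.
    min_gap: assumed stand-in for the "predetermined distance" between runs.
    """
    # Histogram of black pixels per row (horizontal projection profile).
    profile = [sum(row) for row in image]

    # Collect maximal runs of rows that contain at least one black pixel.
    runs = []
    start = None
    for y, count in enumerate(profile + [0]):  # sentinel closes the last run
        if count > 0 and start is None:
            start = y
        elif count == 0 and start is not None:
            runs.append((start, y))
            start = None

    # Assumption: runs separated by fewer than min_gap blank rows are merged.
    merged = []
    for top, bottom in runs:
        if merged and top - merged[-1][1] < min_gap:
            merged[-1] = (merged[-1][0], bottom)
        else:
            merged.append((top, bottom))

    # Keep only runs that are tall enough to be a character string.
    return [(t, b) for t, b in merged if b - t >= min_height]

image = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
rows = extract_text_rows(image, min_height=2, min_gap=2)
print(rows)  # [(1, 3), (5, 7)]
```

For vertically written text, the same idea would apply to column sums instead of row sums.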

The character boundary candidate extraction module 130 is connected to the character string extraction module 120 and the arc feature quantity extraction module 140. It receives a character string image from the character string extraction module 120, extracts boundary candidates from the character string image, and passes the boundary candidates to the arc feature quantity extraction module 140. Examples of boundary candidates are break candidate 1210 and break candidate 1220 shown in the example of FIG. 12.

The arc feature quantity extraction module 140 is connected to the character boundary candidate extraction module 130 and the linear weighted addition module 1710. It receives the boundary candidates from the character boundary candidate extraction module 130, extracts feature quantities for them, and passes the feature quantities to the linear weighted addition module 1710 as a feature vector. Specifically, it extracts the feature quantities of each arc in the graph representation of character cutout positions described above. The arc feature quantities are described later. There are generally several arc feature quantities, but there may be only one. Hereinafter, the arc feature quantities are also referred to as a feature vector.

The linear weighted addition module 1710 is connected to the arc feature quantity extraction module 140 and the character cutout module 160. It receives a feature vector from the arc feature quantity extraction module 140, calculates the arc evaluation value for that feature vector, and passes the arc evaluation value to the character cutout module 160. The calculation of the arc evaluation value is described later.

The character cutout module 160 is connected to the linear weighted addition module 1710 and the character recognition module 170. It receives arc evaluation values from the linear weighted addition module 1710 and, based on them, selects break candidates, that is, determines the positions at which to cut out the character images existing in the character string image. It then cuts the character images out of the character string image (or the image received by the image reception module 110) along those break candidates and passes the character images to the character recognition module 170. "Based on the arc evaluation values" means, for example, selecting the candidate with the highest arc evaluation value (a value indicating a high likelihood that a single character is being cut out).
The character recognition module 170 is connected to the character cutout module 160. It receives a character image from the character cutout module 160, performs character recognition on it, and outputs a character code as the recognition result.

The processing of the arc feature quantity extraction module 140 and the linear weighted addition module 1710 is described next.
A path is composed of multiple arcs. Using the arc evaluation values of those arcs, an evaluation value of the path they form can be calculated. This is called the "path evaluation value".
The path evaluation value is, for example, a weighted sum of the arc evaluation values. In the technique described in Patent Document 3, each arc is weighted by the number of character segments it contains.
To determine the character cutout positions, the path with the highest path evaluation value is selected from among the paths. Once a path has been selected, the character cutout positions are fixed, and so is the character recognition result.
Suppose that the thick-line path in the example of FIG. 16 is selected. In this case, the character cutout positions are the three points of start node 1500, intermediate node 1520 (node 2), and end node 1590, and the character recognition results are "化" and "学".

In particular, the method by which the linear weighted addition module 1710 calculates the arc evaluation value is described.
In the technique described in Patent Document 3, a character evaluation value is calculated using a weighted linear sum of character shape information and character recognition accuracy information. More specifically, the arc evaluation value is calculated as follows.
First, the circumscribed rectangle of the character corresponding to each arc is created. This corresponds to the circumscribed rectangle of each character in each pattern shown in the example of FIG. 14. Hereinafter, the circumscribed rectangle of the character corresponding to an arc is called the circumscribed rectangle of the arc.
Next, the following arc feature quantities are calculated.
f_1: height of the circumscribed rectangle of the arc
f_2: width of the circumscribed rectangle of the arc
f_3: gap between the circumscribed rectangle of the arc and the circumscribed rectangle of the arc on its left
f_4: gap between the circumscribed rectangle of the arc and the circumscribed rectangle of the arc on its right
f_5: maximum gap between the circumscribed rectangles of the character segments within the arc
f_6: number of connected components within the arc
In addition, the character similarity is denoted f_7.
In the technique described in Patent Document 3, the arc evaluation value V is determined by Formula (1), where N = 7 in this case.

That is, the arc evaluation value is determined as a weighted linear sum of the character shape information (f_1 to f_6) and the character recognition accuracy information f_7. Here w_i is the weight used in calculating the linear sum, and c is a constant. Patent Document 3 does not write this in the form of Formula (1) but in a different form; the notation differs, but the two are mathematically the same.
The linear weighted addition module 1710 receives the values of f_1 to f_7 described above as a feature vector, where N denotes the number of features. The internal operation of the linear weighted addition module 1710 is given by Formula (1). The module then passes the arc evaluation value V to the character cutout module 160.
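The image of Formula (1) is not reproduced in this text. Based on the surrounding definitions (feature quantities f_i, weights w_i, constant c, and N features), it can plausibly be reconstructed as the following weighted linear sum; the exact typography of the original is an assumption.

```latex
V = \sum_{i=1}^{N} w_i f_i + c \tag{1}
```

With N = 7, the sum runs over the six shape features f_1 to f_6 and the character similarity f_7.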

Next, phenomena that can occur when the technique described in Patent Document 3 is implemented are described.
<Phenomenon 1>
In the technique described in Patent Document 3, a linear sum of the feature quantities is used as the arc evaluation value. Because it is a linear sum, the arc evaluation value can range from minus infinity to plus infinity, depending on the feature quantities.
When the arc evaluation value is calculated with a linear sum in this way, it may become extremely high or extremely low.
When an arc evaluation value becomes extremely high or extremely low, the overall path evaluation value may be dragged along by that extreme value. For example, suppose that the path evaluation value is a weighted sum of the arc evaluation values, with suitably chosen weights. Here, as in the prior art, each arc is weighted by the number of character segments it contains.
Suppose that, in the example of FIG. 15, the evaluation values are as shown in the example of FIG. 18.
With these arc evaluation values, and with weighting by the number of character segments in each arc:
・The path evaluation value for character cutout pattern 1504 "化" and character cutout pattern 1522 "学" is 10 × 2 + 10 = 30.
・The path evaluation value for character cutout pattern 1506 "イ", character cutout pattern 1512 "ヒ", and character cutout pattern 1522 "学" is 1 + 100 + 10 = 111.
That is, because the arc evaluation value of "ヒ" is far higher than the others, the path containing "ヒ" is selected, dragged along by that value, even though another evaluation value on that path is small (that is, not character-like; for example, the arc evaluation value of character cutout pattern 1506 "イ" is 1).
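The arithmetic of Phenomenon 1 can be reproduced with a short sketch. The arc evaluation values (10, 10, 1, 100) are the ones quoted in the text; the integer node numbering (0 = start node, 1 and 2 = intermediate nodes, 3 = end node), the helper names, and the rule that an arc from node i to node j spans j - i character segments are assumptions that hold for this three-segment example.

```python
# Arc evaluation values from the worked example in the text,
# keyed by (left_node, right_node) under the assumed node numbering.
arc_value = {
    (0, 2): 10,   # "化" (character cutout pattern 1504)
    (2, 3): 10,   # "学" (character cutout pattern 1522)
    (0, 1): 1,    # "イ" (character cutout pattern 1506)
    (1, 2): 100,  # "ヒ" (character cutout pattern 1512)
}

def segment_count(arc):
    """Number of character segments spanned by an arc (assumed: node difference)."""
    left, right = arc
    return right - left

def path_value(path):
    """Path evaluation value: arc values weighted by their segment counts."""
    return sum(segment_count(arc) * arc_value[arc] for arc in path)

path_correct = [(0, 2), (2, 3)]          # "化", "学"
path_wrong = [(0, 1), (1, 2), (2, 3)]    # "イ", "ヒ", "学"

print(path_value(path_correct))  # 30
print(path_value(path_wrong))    # 111

# Selecting the highest-valued path picks the wrong segmentation,
# because the single outlying value 100 dominates the sum.
best = max([path_correct, path_wrong], key=path_value)
print(best == path_wrong)  # True
```

The sketch shows why an unbounded linear arc evaluation value is fragile: one extreme arc value is enough to outweigh every other arc on the competing path.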

<Phenomenon 2>
Let the arc feature vector be f = (f_1, ..., f_N) and the weight vector be w = (w_1, ..., w_N), and let the arc evaluation value function be V(f). Formula (1) can then be written as Formula (2).

Next, let f_T denote an arc feature vector corresponding to a correct cutout position, and f_F an arc feature vector corresponding to an incorrect cutout position.
For the arc evaluation value function to be valid, the relationship of Formula (3) should hold. That is, the arc evaluation value for a correct cutout position should be larger than the arc evaluation value for an incorrect cutout position.

Because there are multiple correct arcs and multiple incorrect arcs, obtaining the relationship of Formula (3) requires that the minimum arc evaluation value over the correct cutout positions, V_Tmin, and the maximum arc evaluation value over the incorrect cutout positions, V_Fmax, satisfy Formula (4).

When Formula (4) holds, there exists a value V_0 that satisfies Formula (5).
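The images of Formulas (2) to (5) are not reproduced in this text. The following is a reconstruction from the surrounding prose, offered as a plausible reading rather than the original typography; in particular, the use of strict inequalities follows the wording "larger than" and is an assumption.

```latex
\begin{align}
V(\mathbf{f}) &= \mathbf{w} \cdot \mathbf{f} + c \tag{2} \\
V(\mathbf{f}_T) &> V(\mathbf{f}_F) \tag{3} \\
V_{T\min} &> V_{F\max} \tag{4} \\
V_{T\min} &> V_0 > V_{F\max} \tag{5}
\end{align}
```

Formula (2) is Formula (1) written with the inner product of w and f. Formula (5) follows from Formula (4) because any value strictly between V_Fmax and V_Tmin can serve as V_0.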

Here, the feature vectors lie in an N-dimensional space. Formula (1), or equivalently Formula (2), defines a hyperplane in this N-dimensional feature space. That is, the set of feature vectors f for which the arc evaluation value V(f) equals the predetermined value V_0 lies on the hyperplane given by Formula (6).

By Formula (5), the hyperplane expressed by Formula (6) completely separates the distribution of feature vectors at correct cutout positions (correct feature vectors) from the distribution of feature vectors at incorrect cutout positions (incorrect feature vectors). Thus, if a hyperplane in the feature space can separate the distribution of correct feature vectors from the distribution of incorrect feature vectors, Formula (3) can be satisfied and a valid arc evaluation value function can be designed.
For simplicity, the case of a two-dimensional feature space is illustrated. In two dimensions the hyperplane is a straight line. There is no problem when the correct feature distribution 1920 and the incorrect feature distribution 1910 can be separated by a straight line, as with the broken line 1930 shown in the example of FIG. 19.

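The separable case can be checked numerically. The sketch below is only an illustration: the linear form of the arc evaluation value is assumed, and the two-dimensional feature vectors, the weights, and the function names are hypothetical, not taken from FIG. 19.

```python
from typing import List, Optional, Sequence


def linear_arc_value(f: Sequence[float], w: Sequence[float]) -> float:
    """Weighted linear sum of arc features (assumed form of Equation (2))."""
    return sum(wi * fi for wi, fi in zip(w, f))


def separating_threshold(
    correct: List[Sequence[float]],
    incorrect: List[Sequence[float]],
    w: Sequence[float],
) -> Optional[float]:
    """Return a V_0 strictly between V_Fmax and V_Tmin, or None if none exists."""
    v_t_min = min(linear_arc_value(f, w) for f in correct)
    v_f_max = max(linear_arc_value(f, w) for f in incorrect)
    if v_t_min > v_f_max:                 # condition of Equation (4)
        return (v_t_min + v_f_max) / 2.0  # one value satisfying Equation (5)
    return None


# Hypothetical feature vectors for correct and incorrect arcs.
correct_arcs = [(3.0, 2.5), (2.5, 3.5), (4.0, 3.0)]
incorrect_arcs = [(0.5, 1.0), (1.0, 0.5), (1.5, 1.0)]
weights = (1.0, 1.0)

v0 = separating_threshold(correct_arcs, incorrect_arcs, weights)
print(v0)  # prints 4.0
```

The line w·f = V_0 then plays the role of the broken line 1930: every correct arc scores above V_0 and every incorrect arc scores below it.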
The technique described in Patent Document 3 adopts Equation (1), so its separating surface can only be a hyperplane. In practice, however, the surface separating correct answers from incorrect answers may not be a hyperplane and may have a more complicated shape. When the distributions of correct and incorrect answers have a complicated shape that a hyperplane cannot separate, the technique described in Patent Document 3 cannot cope.
For example, with the correct feature distribution 2020 and the incorrect feature distribution 2010 shown in the example of FIG. 20, separation by a straight line is no longer possible. In such a case, the technique described in Patent Document 3 cannot obtain a valid arc evaluation value. That is, the phenomenon expressed by Equation (7) occurs. When this phenomenon occurs, a high arc evaluation value is obtained even though the character cutout position is wrong. As a result, the character cutout position is determined incorrectly.

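A small experiment makes the limitation concrete. The sketch below assumes an XOR-like layout of four hypothetical feature vectors (it is not the distribution of FIG. 20) and a linear arc evaluation value with an optional constant term. For this layout the sum of the two correct scores always equals the sum of the two incorrect scores, so the smallest correct score can never exceed the largest incorrect score, whatever weights are tried.

```python
import random
from typing import Sequence


def linear_arc_value(f: Sequence[float], w: Sequence[float], c: float = 0.0) -> float:
    """Linear arc evaluation value; the constant term c is an assumption."""
    return sum(wi * fi for wi, fi in zip(w, f)) + c


# Hypothetical XOR-like layout: no straight line separates the two groups.
correct_arcs = [(0.0, 1.0), (1.0, 0.0)]
incorrect_arcs = [(0.0, 0.0), (1.0, 1.0)]


def is_separated(w: Sequence[float], c: float = 0.0) -> bool:
    """True when every correct arc scores higher than every incorrect arc."""
    v_t_min = min(linear_arc_value(f, w, c) for f in correct_arcs)
    v_f_max = max(linear_arc_value(f, w, c) for f in incorrect_arcs)
    return v_t_min > v_f_max


rng = random.Random(0)
trials = [(rng.uniform(-10.0, 10.0), rng.uniform(-10.0, 10.0)) for _ in range(10000)]
separable_count = sum(is_separated(w) for w in trials)
print(separable_count)  # prints 0
```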
This concludes the description of the premise of the present embodiment, or of an image processing apparatus that uses the present embodiment.

Hereinafter, examples of various preferred embodiments for realizing the present invention will be described with reference to the drawings.
<First Embodiment>
FIG. 1 is a conceptual module configuration diagram of a configuration example of the first embodiment.
A module generally refers to a logically separable component such as software (a computer program) or hardware. Accordingly, a module in the present embodiment refers not only to a module in a computer program but also to a module in a hardware configuration. The present embodiment therefore also serves as a description of a computer program for causing a computer to function as those modules (a program for causing a computer to execute each procedure, a program for causing a computer to function as each means, or a program for causing a computer to realize each function), and of a system and a method. For convenience of explanation, "store", "cause to store", and equivalent expressions are used; when the embodiment is a computer program, these expressions mean storing in a storage device or performing control so as to store in a storage device. Modules may correspond one-to-one to functions, but in implementation one module may consist of one program, a plurality of modules may consist of one program, or conversely one module may consist of a plurality of programs. A plurality of modules may be executed by one computer, and one module may be executed by a plurality of computers in a distributed or parallel environment. One module may include another module. In the following, "connection" is used for logical connection (data exchange, instructions, reference relationships between data, and the like) as well as for physical connection. "Predetermined" means determined before the processing in question. It covers determination before processing according to the present embodiment starts and also, even after that processing has started, determination according to the situation or state at that time or up to that time, provided that it precedes the processing in question.
A system or apparatus includes a configuration in which a plurality of computers, hardware units, apparatuses, and the like are connected by communication means such as a network (including one-to-one communication connections), as well as a configuration realized by a single computer, hardware unit, apparatus, or the like. "Apparatus" and "system" are used as synonyms. Naturally, "system" does not include what is merely a social "mechanism" (a social system) that is an artificial arrangement.
For each process performed by a module, or for each process when a module performs a plurality of processes, the target information is read from a storage device, the process is performed, and the result is written to the storage device. Description of reading from the storage device before processing and of writing to the storage device after processing may therefore be omitted. The storage device here may include a hard disk, RAM (Random Access Memory), an external storage medium, a storage device accessed over a communication line, a register in a CPU (Central Processing Unit), and the like.

As shown in the example of FIG. 1, the image processing apparatus according to the present embodiment includes an image reception module 110, a character string extraction module 120, a character boundary candidate extraction module 130, an arc feature quantity extraction module 140, an arc evaluation value determination module 150, a character cutout module 160, and a character recognition module 170. Parts of the same kind as those of the image processing apparatus shown in the example of FIG. 17 described above are given the same reference numerals, and duplicate description is omitted. Accordingly, the arc evaluation value determination module 150 is described in detail. The character string extraction module 120, the character boundary candidate extraction module 130, and the arc feature quantity extraction module 140 are, however, also described in more detail.

The character string extraction module 120 is connected to the image reception module 110 and the character boundary candidate extraction module 130.
The character string extraction module 120 extracts a single line of character string image, written horizontally or vertically, from the target image. Here, a line is a horizontal row in the case of horizontal writing and a vertical column in the case of vertical writing.
Some images contain a plurality of character strings. Various methods for separating such character strings into single character strings have been proposed, and any of them may be used.
Examples of separation into single character strings include the techniques described in JP-A-4-311283, JP-A-3-233789, JP-A-5-73718, and JP-A-2000-90194. These or other methods may be used.

The character boundary candidate extraction module 130 is connected to the character string extraction module 120 and the arc feature quantity extraction module 140.
The character boundary candidate extraction module 130 receives a single-line character string image and divides it into a plurality of character segments. Various character segment division methods exist, and any of them may be used. For example, the techniques described in JP-A-5-114047, JP-A-4-100189, JP-A-4-92992, JP-A-4-68481, and JP-A-9-54814, the character boundary candidate extraction method described in Patent Document 3 (in particular paragraph 0021), or the character cutout position determination method described in JP-A-5-128308 (in particular paragraph 0005) may be used. Other methods may of course be used.

The arc feature quantity extraction module 140 is connected to the character boundary candidate extraction module 130 and the arc evaluation value determination module 150.
The content of the arc feature vector extracted by the arc feature quantity extraction module 140 is not particularly limited. For example, the feature quantities f_1 to f_7 described above may be used, or other feature quantities may be used. The number of dimensions of the feature vector (that is, the number of kinds of feature quantities) is likewise unrestricted.

A supplementary note follows on the specific case in which character recognition is performed after a character is cut out and the character recognition accuracy is used as one of the arc feature quantities. This is the quantity referred to as character similarity in the technique described in Patent Document 3.
For the character recognition accuracy, it suffices to obtain something like the degree of confidence or the likelihood of the character code output at recognition time. Various methods for obtaining such a value have been proposed, and any of them may be used. For example, the method described in paragraph 0024 of Patent Document 3 or the recognition evaluation value acquisition method described in paragraph 0051 of Patent Document 2 may be used. Other methods may also be used.

The arc evaluation value determination module 150 is connected to the arc feature quantity extraction module 140 and the character cutout module 160. It receives a feature vector from the arc feature quantity extraction module 140, determines an arc evaluation value using that feature vector, and passes the arc evaluation value to the character cutout module 160.
The arc evaluation value determination module 150 prevents arc evaluation values that are very large or very small compared with other arc evaluation values from having an excessive influence. That is, for an arc evaluation value that is large compared with the others, the influence of its largeness is reduced, and for a small one, the influence of its smallness is reduced. To do so, a nonlinear function with the following properties is applied to the result of the weighted addition of the feature quantities. The nonlinear function (1) is monotonic, and (2) either converges to a predetermined value as its input goes to plus infinity or minus infinity, or has a slope whose absolute value becomes smaller the farther the input is from a certain center position.

FIG. 2 is a conceptual module configuration diagram of a configuration example inside the arc evaluation value determination module 150 of the first embodiment. The arc evaluation value determination module 150 includes a linear weighted addition module 210 and a nonlinear function module 220, which are connected to each other.
The linear weighted addition module 210 receives feature quantities 1 to N (a plurality of feature quantities relating to a candidate position for cutting out one character image in the image) as a feature vector from the arc feature quantity extraction module 140, and computes a weighted linear sum in the same way as the linear weighted addition module 1710 in the example of FIG. 17 described above. It passes the result to the nonlinear function module 220.
The nonlinear function module 220 receives the result from the linear weighted addition module 210 as its argument and calculates the evaluation value (arc evaluation value) of the candidate cutout position for one character image. It does so using a nonlinear monotonic function, or a function approximating one, that either converges to a predetermined value as the argument goes to its limits or has an output slope whose absolute value decreases as the distance between the argument and a predetermined value increases. In other words, the value is calculated by the arc evaluation function.

Let the arc evaluation value function be V(f). The computation of this configuration is then given by Equation (8), where f is the input feature vector, w is the weight vector, c is a scalar weight, and σ() is a nonlinear function.

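Equation (8) itself is not reproduced in this text. From the symbol definitions it is presumably V(f) = σ(w·f + c); the sketch below assumes that form and uses the hyperbolic tangent as σ. The weights and feature values are hypothetical.

```python
import math
from typing import Callable, Sequence


def arc_evaluation_value(
    f: Sequence[float],
    w: Sequence[float],
    c: float,
    sigma: Callable[[float], float] = math.tanh,
) -> float:
    """Assumed form of Equation (8): sigma applied to the weighted linear sum."""
    linear = sum(wi * fi for wi, fi in zip(w, f)) + c  # role of module 210
    return sigma(linear)                               # role of module 220


weights = (0.5, 0.25)  # hypothetical w
bias = -1.0            # hypothetical c

typical = arc_evaluation_value((2.0, 4.0), weights, bias)        # linear sum 1.0
extreme = arc_evaluation_value((2000.0, 4000.0), weights, bias)  # linear sum 1999.0
print(typical, extreme)
```

Although the second linear sum is about two thousand times the first, the two outputs differ by less than 0.25, which is the damping effect the nonlinear function is meant to provide.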
This configuration reduces the influence of extremely large (or extremely small) values.
Using the example shown in FIG. 18, consider the case where the input and output of the nonlinear function (linear weighted addition module 210) are, for example, as follows.
Input: 1, Output: 1
Input: 10, Output: 2
Input: 100, Output: 3
Then:
Path 1: for "化" and "学", the path evaluation value is 2 × 2 + 2 = 6
Path 2: for "イ", "ヒ", and "学", the path evaluation value is 1 + 3 + 2 = 6
so the influence of the value 100 is reduced, and the evaluation values of path 1 and path 2 become comparable.

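The arithmetic can be replayed in a few lines. The sketch below uses only the three input/output pairs listed above as a lookup table. The raw arc values (10 and 10 for path 1; 1, 100, and 10 for path 2) are inferred by inverting that table, and the reading of the factor 2 for "化" as a per-arc multiplier is an assumption based on the expression 2 × 2 + 2; FIG. 18 itself is not reproduced in this section.

```python
from typing import Callable, List, Tuple

# Input/output pairs of the nonlinear function, as listed in the text.
COMPRESS = {1: 1, 10: 2, 100: 3}

# Each arc is (raw arc value, multiplier). Both are inferred, not quoted.
path1: List[Tuple[int, int]] = [(10, 2), (10, 1)]          # "化", "学"
path2: List[Tuple[int, int]] = [(1, 1), (100, 1), (10, 1)]  # "イ", "ヒ", "学"


def path_value(arcs: List[Tuple[int, int]], transform: Callable[[int], int]) -> int:
    """Sum of multiplier * transform(raw value) over the arcs of a path."""
    return sum(mult * transform(value) for value, mult in arcs)


raw1 = path_value(path1, lambda v: v)
raw2 = path_value(path2, lambda v: v)
out1 = path_value(path1, lambda v: COMPRESS[v])
out2 = path_value(path2, lambda v: COMPRESS[v])
print(raw1, raw2)  # prints 30 111
print(out1, out2)  # prints 6 6
```

Under these assumptions, the single raw value 100 lets path 2 dominate before the nonlinear function is applied, while the two paths tie afterwards.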
Any function that meets the nonlinear function conditions described above, such as the hyperbolic tangent function or the logistic sigmoid function, may be used as the nonlinear function.

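Both named functions can be checked against the two conditions numerically. The following sketch is only a spot check on a finite grid, not a proof; the grid, the step size of the finite difference, and the helper names are arbitrary choices.

```python
import math
from typing import Callable, List


def logistic_sigmoid(x: float) -> float:
    """Logistic sigmoid written so that exp() never overflows."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def slope(fn: Callable[[float], float], x: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of the derivative at x."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)


def is_increasing(fn: Callable[[float], float], xs: List[float]) -> bool:
    """Condition (1): outputs strictly increase along an increasing grid."""
    ys = [fn(x) for x in xs]
    return all(a < b for a, b in zip(ys, ys[1:]))


grid = [-5.0 + 0.5 * i for i in range(21)]
for name, fn in (("tanh", math.tanh), ("logistic", logistic_sigmoid)):
    # Condition (2): outputs level off and the slope shrinks away from 0.
    print(name, is_increasing(fn, grid), fn(-50.0), fn(50.0),
          slope(fn, 0.0), slope(fn, 4.0))
```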
To use the present embodiment in practice, the weights in the linear weighted addition module 210 (that is, w and c) must be determined. For this purpose, a teacher data table 300 such as the one shown in the example of FIG. 3 is prepared.
The teacher data table 300 has a data number column 310, a feature quantity 1 column 320, a feature quantity 2 column 330, a feature quantity N column 380, a correct/incorrect column 390, and so on. The data number column 310 stores a data number that uniquely identifies an arc; for example, each arc is given its own data number in order starting from 1. The feature quantity 1 column 320 through the feature quantity N column 380 store the feature quantities extracted by the arc feature quantity extraction module 140. That is, the arc feature quantity extraction module 140 extracts the feature quantities of an arc, and they are entered side by side in a row of the teacher data table 300. If the arc represents a correct character break, a value such as 1 is entered in the correct/incorrect column 390. If the arc does not represent a correct character break, a value such as 0 is entered.

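One possible in-memory layout for such a table is sketched below. The class name, field names, and sample feature values are hypothetical; only the column roles (data number, feature quantities 1 to N, and a 1/0 correct flag) come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass
class TeacherRecord:
    data_number: int       # column 310: unique number per arc, starting at 1
    features: List[float]  # columns 320 to 380: feature quantities 1 to N
    is_correct: int        # column 390: 1 for a correct break, 0 otherwise


def build_teacher_table(
    arcs: Iterable[Tuple[Sequence[float], bool]]
) -> List[TeacherRecord]:
    """Number the arcs from 1 and store their features and labels row by row."""
    return [
        TeacherRecord(k, list(features), 1 if correct else 0)
        for k, (features, correct) in enumerate(arcs, start=1)
    ]


# Hypothetical labelled arcs: (feature vector, is this a correct break?).
table = build_teacher_table([
    ((3.0, 2.5), True),
    ((0.5, 1.0), False),
    ((4.0, 3.0), True),
])
for row in table:
    print(row.data_number, row.features, row.is_correct)
```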
Using the data created in this way as teacher data, the weights are determined so that, when the feature quantities are input, the output is as close as possible to the correct/incorrect data described above.
Closeness can be evaluated as follows. Let k be the data number assigned to each arc. Let V_k be the arc evaluation value calculated for the arc with data number k using the arc evaluation value determination module 150, and let t_k be the teacher data for that arc. Here, for example,
・t_k = 1 for a correct answer
・t_k = 0 for an incorrect answer
In this case, the weights may be determined so as to minimize Expression (9).

Alternatively, the weights may be determined so as to minimize Expression (10).

Alternatively, the weights may be determined so as to minimize Expression (11).

The method is not limited to these; any method may be used that determines the weights so as to minimize an evaluation value that becomes small when the difference between V_k and t_k is small.
As the weight determination method, a general method for determining regression coefficients in logistic regression or a method for determining the weights of a single-layer perceptron may be used.

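Expressions (9) to (11) are not reproduced in this text, so the sketch below picks one criterion of the kind described, the sum of squared differences between V_k and t_k, and minimizes it by plain batch gradient descent. Both the criterion and the optimizer are stand-ins chosen for illustration, and the training data, learning rate, and iteration count are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(x: float) -> float:
    """Logistic sigmoid written so that exp() never overflows."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def arc_value(f: Sequence[float], w: Sequence[float], c: float) -> float:
    return sigmoid(sum(wi * fi for wi, fi in zip(w, f)) + c)


def squared_error(features, targets, w, c) -> float:
    return sum((arc_value(f, w, c) - t) ** 2 for f, t in zip(features, targets))


def train(features, targets, lr: float = 0.1, epochs: int = 5000) -> Tuple[List[float], float]:
    """Batch gradient descent on the squared error, starting from zero weights."""
    n, m = len(features[0]), len(features)
    w, c = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_c = [0.0] * n, 0.0
        for f, t in zip(features, targets):
            v = arc_value(f, w, c)
            delta = 2.0 * (v - t) * v * (1.0 - v)  # derivative through the sigmoid
            for i in range(n):
                grad_w[i] += delta * f[i]
            grad_c += delta
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        c -= lr * grad_c / m
    return w, c


# Hypothetical teacher data: feature vectors with t_k = 1 (correct) or 0 (incorrect).
features = [(3.0, 2.5), (2.5, 3.5), (4.0, 3.0), (0.5, 1.0), (1.0, 0.5), (1.5, 1.0)]
targets = [1, 1, 1, 0, 0, 0]

w, c = train(features, targets)
values = [arc_value(f, w, c) for f in features]
print(w, c, values)
```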
The character cutout module 160 is connected to the arc evaluation value determination module 150 and the character recognition module 170. Based on the arc evaluation values calculated by the arc evaluation value determination module 150, it determines the positions at which the character images in the character string image are to be cut out, and cuts out the character images from the character string image (or from the image received by the image reception module 110).

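This section does not spell out how the cutout positions are chosen from the arc evaluation values. One common way, assumed here purely for illustration, is to treat boundary candidates as nodes and arcs as edges, and to pick the path from the first to the last boundary with the largest total arc evaluation value by dynamic programming. The boundary indices, arc values, and function name below are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def best_segmentation(
    num_boundaries: int,
    arc_values: Dict[Tuple[int, int], float],
) -> Optional[Tuple[float, List[int]]]:
    """Return (total value, boundary path) maximizing the summed arc values.

    Boundaries are numbered 0 .. num_boundaries - 1 from the start of the
    character string; an arc (s, e) with s < e is one candidate character.
    """
    best: Dict[int, Tuple[float, List[int]]] = {0: (0.0, [0])}
    for end in range(1, num_boundaries):
        candidates = [
            (best[s][0] + value, best[s][1] + [end])
            for (s, e), value in arc_values.items()
            if e == end and s in best
        ]
        if candidates:
            best[end] = max(candidates, key=lambda item: item[0])
    return best.get(num_boundaries - 1)


# Hypothetical arc evaluation values for the segments "イ", "ヒ", "学".
arcs = {
    (0, 1): 0.2,  # "イ" alone
    (1, 2): 0.3,  # "ヒ" alone
    (0, 2): 0.9,  # "イ" + "ヒ" read as "化"
    (2, 3): 0.8,  # "学"
}
result = best_segmentation(4, arcs)
print(result[1])  # prints [0, 2, 3]
```

With these values the search cuts at boundaries 0, 2, and 3, that is, it reads the string as "化" followed by "学".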
<Second Embodiment>
In the first embodiment, the nonlinear function is monotonic, so the magnitudes of the output arc evaluation values change but their relative order does not. <Phenomenon 2> described above can therefore still occur.
The second embodiment has the configuration shown in the example of FIG. 1, and its arc evaluation value determination module 150 applies the pair of linear weighted addition and nonlinear function twice in succession.

FIG. 4 is a conceptual module configuration diagram of a configuration example inside the arc evaluation value determination module 150 of the second embodiment.
The arc evaluation value determination module 150 includes linear weighted addition module 1-1: 411, linear weighted addition module 1-2: 412, ..., linear weighted addition module 1-M: 41M, nonlinear function σ_{1-1} module 421, nonlinear function σ_{1-2} module 422, ..., nonlinear function σ_{1-M} module 42M, linear weighted addition module 2: 430, and nonlinear function σ_2 module 440. In FIG. 4, the plurality of feature quantities (feature quantity 1 to feature quantity N) is drawn as a single line representing the feature vector.
The linear weighted addition module 1-1: 411 is connected to the nonlinear function σ_{1-1} module 421.
The linear weighted addition module 1-2: 412 is connected to the nonlinear function σ_{1-2} module 422.
The linear weighted addition module 1-M: 41M is connected to the nonlinear function σ_{1-M} module 42M.
The nonlinear function σ_{1-1} module 421 is connected to the linear weighted addition module 1-1: 411 and the linear weighted addition module 2: 430.
The nonlinear function σ_{1-2} module 422 is connected to the linear weighted addition module 1-2: 412 and the linear weighted addition module 2: 430.
The nonlinear function σ_{1-M} module 42M is connected to the linear weighted addition module 1-M: 41M and the linear weighted addition module 2: 430.
Each of the pairs formed by the linear weighted addition module 1-1: 411 and the nonlinear function σ_{1-1} module 421, by the linear weighted addition module 1-2: 412 and the nonlinear function σ_{1-2} module 422, and by the linear weighted addition module 1-M: 41M and the nonlinear function σ_{1-M} module 42M corresponds to the pair formed by the linear weighted addition module 210 and the nonlinear function module 220 in the first embodiment.
The linear weighted addition module 2: 430 is connected to the nonlinear function σ_{1-1} module 421, the nonlinear function σ_{1-2} module 422, the nonlinear function σ_{1-M} module 42M, and the nonlinear function σ_2 module 440.
The nonlinear function σ_2 module 440 is connected to the linear weighted addition module 2: 430.
The linear weighted addition module 2: 430 corresponds to the linear weighted addition module 210 in the first embodiment, and the nonlinear function σ_2 module 440 corresponds to the nonlinear function module 220 in the first embodiment.

The combination of the linear weighted addition module 1-i and the nonlinear function σ_{1-i} module (where i = 1, 2, ..., M) performs the computation of Equation (12) to obtain the output value U_i. Here w_{1-i} and c_{1-i} are the weights used in the linear weighted addition module 1-i, and M is the number of first-stage linear weighted addition modules.

Further, U is defined as shown in Equation (13).

The combination of the linear weighted addition module 2: 430 and the nonlinear function σ_2 module 440 performs the computation of Equation (14) to obtain the output value V. Here w_2 and c_2 are the weights used in the linear weighted addition module 2: 430.

In the above, there are several nonlinear functions σ: the functions σ_{1-i} (where i = 1, 2, ..., M) of the nonlinear function σ_{1-1} module 421 and the like, and the function σ_2. These functions may be the same or may differ. Typically, a hyperbolic tangent function, a logistic sigmoid function, or the like is used for all of them.
In the second embodiment, the configuration is equivalent to a three-layer perceptron. It can therefore cope even when correct arcs and incorrect arcs are divided by a nonlinear separating surface. The weight coefficients may be determined using ordinary error backpropagation. The teacher data table 300 shown in the example of FIG. 13 may be used as the teacher data for this purpose. Examples of an evaluation value that becomes small when the difference between the arc evaluation value y_i calculated using the arc evaluation value determination module 150 of the second embodiment and the teacher data t_i is small are the same as in the first embodiment.

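Equations (12) to (14) are not reproduced in this text. From the symbol definitions they are presumably U_i = σ_{1-i}(w_{1-i}·f + c_{1-i}), U = (U_1, ..., U_M), and V = σ_2(w_2·U + c_2). The sketch below assumes those forms, uses the logistic sigmoid for every σ, and sets the weights by hand (M = 2) so that an XOR-like layout of hypothetical feature vectors, which no single hyperplane separates, is scored correctly. In practice the weights would be learned, for example by error backpropagation.

```python
import math
from typing import List, Sequence, Tuple


def logistic_sigmoid(x: float) -> float:
    """Logistic sigmoid written so that exp() never overflows."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def weighted_sum(w: Sequence[float], x: Sequence[float], c: float) -> float:
    return sum(wi * xi for wi, xi in zip(w, x)) + c


def two_stage_arc_value(
    f: Sequence[float],
    first_stage: List[Tuple[Sequence[float], float]],  # (w_{1-i}, c_{1-i}) pairs
    w2: Sequence[float],
    c2: float,
) -> float:
    """Assumed forms of Equations (12) to (14) with a common sigmoid."""
    u = [logistic_sigmoid(weighted_sum(w, f, c)) for w, c in first_stage]
    return logistic_sigmoid(weighted_sum(w2, u, c2))


# Hand-set, hypothetical weights: unit 1 acts like OR, unit 2 like NAND,
# and the second stage acts like AND of the two.
first_stage = [((20.0, 20.0), -10.0), ((-20.0, -20.0), 30.0)]
w2, c2 = (20.0, 20.0), -30.0

correct_arcs = [(0.0, 1.0), (1.0, 0.0)]
incorrect_arcs = [(0.0, 0.0), (1.0, 1.0)]
v_correct = [two_stage_arc_value(f, first_stage, w2, c2) for f in correct_arcs]
v_incorrect = [two_stage_arc_value(f, first_stage, w2, c2) for f in incorrect_arcs]
print(v_correct, v_incorrect)
```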
<Third Embodiment>
The third embodiment has the configuration shown in the example of FIG. 1. In addition, its arc evaluation value determination module 150 contains a plurality of the arc evaluation value determination modules 150 of the second embodiment and takes the sum of their outputs.
Even if a single estimator, that is, a single arc evaluation value determination module 150, performs poorly, performance can be improved by using a plurality of estimators. Suppose, for example, that there are three estimators, one of which gives an incorrect answer while the other two give the correct answer. Taking a majority vote of the three and adopting the majority answer yields the correct estimate.
In the example of the present embodiment, the majority operation is performed by addition.

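The effect of voting by addition can be seen with three hypothetical estimators scoring two competing arcs. The numbers below are invented for illustration; the third estimator ranks the arcs the wrong way round, yet the summed values still favour the correct arc.

```python
from typing import Sequence


def summed_arc_value(per_estimator_values: Sequence[float]) -> float:
    """Combine the outputs V_j of several estimators by simple addition."""
    return sum(per_estimator_values)


# Hypothetical outputs of three estimators for two competing arcs.
correct_arc_values = [0.9, 0.8, 0.2]    # estimator 3 scores the correct arc low
incorrect_arc_values = [0.3, 0.4, 0.7]  # estimator 3 scores the wrong arc high

v_correct = summed_arc_value(correct_arc_values)
v_incorrect = summed_arc_value(incorrect_arc_values)
print(v_correct > v_incorrect)  # prints True
```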
FIG. 5 is a conceptual module configuration diagram of a configuration example inside the arc evaluation value determination module 150 of the third embodiment.
The arc evaluation value determination module 150 includes arc evaluation value calculation module 1: 511, arc evaluation value calculation module 2: 512, arc evaluation value calculation module K: 51K, and an arc evaluation value addition module 520.
The arc evaluation value calculation module 1: 511, the arc evaluation value calculation module 2: 512, and the arc evaluation value calculation module K: 51K are each connected to the arc evaluation value addition module 520.
The third embodiment uses a plurality of arc evaluation value calculation modules (arc evaluation value calculation module 1 to arc evaluation value calculation module K). The output of the arc evaluation value calculation module j is denoted V_j.

アーク評価値算出モジュールｊは、第２の実施の形態のアーク評価値決定モジュール１５０と同等の構成を採る。
図６は、第３の実施の形態のアーク評価値算出モジュール内の構成例についての概念的なモジュール構成図である。
線形重み付け加算モジュールｊ−１−１：６１１、線形重み付け加算モジュールｊ−１−２：６１２、線形重み付け加算モジュールｊ−１−Ｍｊ：６１Ｍ、非線形関数σ_{ｊ−１−１}モジュール６２１、非線形関数σ_{ｊ−１−２}モジュール６２２、非線形関数σ_{ｊ−１−Ｍｊ}モジュール６２Ｍ、線形重み付け加算モジュールｊ−２：６３０、非線形関数σ_ｊ−２モジュール６４０を有している。
線形重み付け加算モジュールｊ−１−１：６１１は、非線形関数σ_{ｊ−１−１}モジュール６２１と接続されている。
線形重み付け加算モジュールｊ−１−２：６１２は、非線形関数σ_{ｊ−１−２}モジュール６２２と接続されている。
線形重み付け加算モジュールｊ−１−Ｍｊ：６１Ｍは、非線形関数σ_{ｊ−１−Ｍｊ}モジュール６２Ｍと接続されている。
非線形関数σ_{ｊ−１−１}モジュール６２１は、線形重み付け加算モジュールｊ−１−１：６１１、線形重み付け加算モジュールｊ−２：６３０と接続されている。
非線形関数σ_{ｊ−１−２}モジュール６２２は、線形重み付け加算モジュールｊ−１−２：６１２、線形重み付け加算モジュールｊ−２：６３０と接続されている。
非線形関数σ_{ｊ−１−Ｍｊ}モジュール６２Ｍは、線形重み付け加算モジュールｊ−１−Ｍｊ：６１Ｍ、線形重み付け加算モジュールｊ−２：６３０と接続されている。
線形重み付け加算モジュールｊ−２：６３０は、非線形関数σ_{ｊ−１−１}モジュール６２１、非線形関数σ_{ｊ−１−２}モジュール６２２、非線形関数σ_{ｊ−１−Ｍｊ}モジュール６２Ｍ、非線形関数σ_ｊ−２モジュール６４０と接続されている。
非線形関数σ_ｊ−２モジュール６４０は、線形重み付け加算モジュールｊ−２：６３０と接続されている。
アーク評価値算出モジュールｊは、各構成要素に添え字ｊが付与されていることを除いて、第２の実施の形態と動作は同等である。以下、動作を示す。線形重み付け加算器ｊ−１−ｉと非線形関数σ_{ｊ−１−ｉ}の組み合わせでは（ただし、ｉ＝１，２，…，Ｍｊ）、（１５）式による演算を行って、出力値Ｕ_ｊ−ｉを得る。ｗ_{ｊ−１−ｉ}及びｃ_{ｊ−１−ｉ}は、線形重み付け加算モジュールｊ−１−ｉが用いる重みである。Ｍｊは、１段目の線形重み付け加算モジュールの数である。

さらに、（１６）式に示すようにＵを定義する。

線形重み付け加算モジュールｊ−２と非線形関数σ_ｊ−２モジュールの組み合わせでは、（１７）式による演算を行って、出力値Ｖ_ｊを得る。ｗ_ｊ−２及びｃ_ｊ−２は、線形重み付け加算モジュールｊ−２が用いる重みである。

アーク評価値加算モジュール５２０では、アーク評価値算出モジュール１：５１１、アーク評価値算出モジュール２：５１２、・・・、アーク評価値算出モジュールＫ：５１Ｋによって計算されたアーク評価値の和を計算する。具体的には、例えば、（１８）式を用いて、アーク評価値Ｖを算出する。

このＶが、第３の実施の形態のアーク評価値決定モジュール１５０が文字切り出しモジュール１６０へ渡すアーク評価値である。 The arc evaluation value calculation module j adopts the same configuration as the arc evaluation value determination module 150 of the second embodiment.
FIG. 6 is a conceptual module configuration diagram of a configuration example in the arc evaluation value calculation module according to the third embodiment.
The arc evaluation value calculation module j includes a linear weighted addition module j-1-1: 611, a linear weighted addition module j-1-2: 612, a linear weighted addition module j-1-Mj: 61M, a nonlinear function σ_{j-1-1} module 621, a nonlinear function σ_{j-1-2} module 622, a nonlinear function σ_{j-1-Mj} module 62M, a linear weighted addition module j-2: 630, and a nonlinear function σ_{j-2} module 640.
The linear weighted addition module j-1-1: 611 is connected to the nonlinear function σ_{j-1-1} module 621.
The linear weighted addition module j-1-2: 612 is connected to the nonlinear function σ_{j-1-2} module 622.
The linear weighted addition module j-1-Mj: 61M is connected to the nonlinear function σ_{j-1-Mj} module 62M.
The nonlinear function σ_{j-1-1} module 621 is connected to the linear weighted addition module j-1-1: 611 and the linear weighted addition module j-2: 630.
The nonlinear function σ_{j-1-2} module 622 is connected to the linear weighted addition module j-1-2: 612 and the linear weighted addition module j-2: 630.
The nonlinear function σ_{j-1-Mj} module 62M is connected to the linear weighted addition module j-1-Mj: 61M and the linear weighted addition module j-2: 630.
The linear weighted addition module j-2: 630 is connected to the nonlinear function σ_{j-1-1} module 621, the nonlinear function σ_{j-1-2} module 622, the nonlinear function σ_{j-1-Mj} module 62M, and the nonlinear function σ_{j-2} module 640.
The nonlinear function σ_{j-2} module 640 is connected to the linear weighted addition module j-2: 630.
The arc evaluation value calculation module j operates in the same way as in the second embodiment, except that the subscript j is attached to each component. The operation is as follows. The combination of the linear weighted addition module j-1-i and the nonlinear function σ_{j-1-i} (where i = 1, 2, ..., Mj) performs the calculation of equation (15) to obtain the output value U_{j-i}. Here w_{j-1-i} and c_{j-1-i} are the weights used by the linear weighted addition module j-1-i, and Mj is the number of first-stage linear weighted addition modules.

Furthermore, U is defined as shown in equation (16).

The combination of the linear weighted addition module j-2 and the nonlinear function σ_{j-2} module performs the calculation of equation (17) to obtain the output value V_j. Here w_{j-2} and c_{j-2} are the weights used by the linear weighted addition module j-2.

The arc evaluation value addition module 520 calculates the sum of the arc evaluation values calculated by the arc evaluation value calculation module 1: 511, the arc evaluation value calculation module 2: 512, ..., and the arc evaluation value calculation module K: 51K. Specifically, for example, the arc evaluation value V is calculated using equation (18).

This V is the arc evaluation value that the arc evaluation value determination module 150 of the third embodiment passes to the character segmentation module 160.
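The data flow of FIG. 5 and FIG. 6 can be sketched as follows. Equations (15) to (18) are not reproduced in this text, so the sketch assumes the usual forms: each first-stage unit computes U_{j-i} = σ(w_{j-1-i}·x + c_{j-1-i}), the second stage computes V_j = σ(w_{j-2}·U + c_{j-2}), and the addition module computes V as the sum of the V_j. The logistic sigmoid and all function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, assumed here for every nonlinear function module."""
    return 1.0 / (1.0 + math.exp(-z))


def weighted_sum(w, x, c):
    """Linear weighted addition: inner product of w and x plus offset c."""
    return sum(wi * xi for wi, xi in zip(w, x)) + c


def module_output(x, first_stage, second_stage):
    """Compute V_j for one arc evaluation value calculation module j.

    first_stage  : list of (w, c) pairs, one per first-stage unit i = 1..Mj
    second_stage : a single (w, c) pair for linear weighted addition j-2
    """
    u = [sigmoid(weighted_sum(w, x, c)) for w, c in first_stage]  # U_{j-i}
    w2, c2 = second_stage
    return sigmoid(weighted_sum(w2, u, c2))  # V_j


def arc_evaluation_value(x, modules):
    """Sum V_j over all K modules (arc evaluation value addition module)."""
    return sum(module_output(x, fs, ss) for fs, ss in modules)
```

Because each V_j lies between 0 and 1 under this assumption, the summed value V lies between 0 and K.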

第３の実施の形態においては、重みｗ_{ｊ−１−ｉ}、ｃ_{ｊ−１−ｉ}、ｗ_ｊ−２及びｃ_ｊ−２を決定する必要がある。ただし、ｉ＝１，２，…，Ｍｊ、ｊ＝１，２，…，Ｋである。
前記の重み決定方法としては、文献「Ｊ．Ｆｒｉｅｄｍａｎ，Ｔ．Ｈａｓｔｉｅ，Ｒ．Ｔｉｂｓｈｉｒａｎｉ著 “ＡｄｄｉｔｉｖｅＬｏｇｉｓｔｉｃＲｅｇｒｅｓｓｉｏｎ：ａＳｔａｔｉｓｔｉｃａｌＶｉｅｗｏｆＢｏｏｓｔｉｎｇ”、ＡｎｎａｌｓｏｆＳｔａｔｉｓｔｉｃｓ、Ｖｏｌ．２８，Ｎｏ．２，ｐｐ．３３７−４０７，２０００」に記載のジェントルアダブースト方式と呼ばれる方式を用いるようにしてもよい。
以下、ここでは、説明の都合上、非線形関数σとして、例えば、入力がマイナス無限大で０、プラス無限大で１になる関数を用いることとする。実際にはマイナス無限大やプラス無限大で収束する値に応じて線形変換を行えば、入力がマイナス無限大で−１、プラス無限大で１になる関数などに変更してもよい。
次に、教師用データとして、図７の例に示す教師用データテーブル７００を用意する。教師用データテーブル７００は、データ番号欄７１０、特徴量１欄７２０、特徴量２欄７３０、特徴量Ｎ欄７７０、正解／非正解欄７８０、ウエイト欄７９０等を有している。これは、図３の例に示す教師用データテーブル３００にウエイト欄７９０を追加したものである。ここで、「ウエイト」と「重み」は、意味的には同じである。しかし、これまで、線形重み付け加算モジュールにおける係数に対して「重み」という用語を用いていたため、その用語と区別をするため、教師データの重みに関しては「ウエイト」という用語を用いることとする。また、データの量をＧとする。 In the third embodiment, it is necessary to determine the weights w_{j-1-i}, c_{j-1-i}, w_{j-2}, and c_{j-2}, where i = 1, 2, ..., Mj and j = 1, 2, ..., K.
As the weight determination method, the method called Gentle AdaBoost, described in J. Friedman, T. Hastie, R. Tibshirani, "Additive Logistic Regression: a Statistical View of Boosting", Annals of Statistics, Vol. 28, No. 2, pp. 337-407, 2000, may be used.
In the following, for convenience of explanation, the nonlinear function σ is taken to be, for example, a function that approaches 0 as the input goes to minus infinity and 1 as the input goes to plus infinity. In practice, by applying a linear transformation according to the values approached at minus and plus infinity, it may be replaced by, for example, a function that approaches −1 at minus infinity and 1 at plus infinity.
Next, a teacher data table 700 shown in the example of FIG. 7 is prepared as teacher data. The teacher data table 700 has a data number column 710, a feature amount 1 column 720, a feature amount 2 column 730, a feature amount N column 770, a correct/incorrect answer column 780, a weight column 790, and the like. It is the teacher data table 300 shown in the example of FIG. 3 with the weight column 790 added. The coefficients of the linear weighted addition modules and the values in the weight column 790 are both weights in meaning. To distinguish the two, the term "weight" alone refers below to a coefficient of a linear weighted addition module, and the weight attached to each item of teacher data is called its "data weight". The number of data items is G.

さらに、以下に示す手法を用いて重みを決定していく。
ここで、データ番号ｋに対して、正解／非正解を表す記号をｙ_ｋとする。第３の実施の形態においては、例えば、
・正解のときｙ_ｋ＝＋１
・不正解のときｙ_ｋ＝−１
とする。
さらに、非線形関数σとして、ロジスティックシグモイド関数を採用する。このとき、Ｖ_ｊは０〜１までの値を取る。
１．まず、図７の例に示した教師用データテーブル７００内のデータのウエイトを全て等しく１／Ｇとする。
２．ｊ＝１とする。
（ア）各教師データのウエイトを用いて、その重み付け２乗誤差を最小とするように、アーク評価値算出モジュールｊの重みを決定する。決定方法は、第２の実施の形態の説明で記載したものと同等である。通常は単なる２乗誤差を最小化するように重みを決定するのに対して、ウエイトで重み付けした重み付け２乗誤差を最小とするように重みを決定する点が異なる。
つまり、第１の実施の形態又は第２の実施の形態では、（１９）式を最小とするように重みを決定していたのに対し、ここでは、（２０）式を最小とするように重みを決定する。ただし、ここで、ｔ_ｋ＝（ｙ_ｋ＋１）／２の関係がある。

（イ）教師データのウエイトを更新する。
（イ−１）ｋ番目のデータに対するアーク評価値をＶ_ｊｋとする。
（イ−２）ウエイトを（２１）式で更新する。これは、アーク評価値の推定が間違ったデータのウエイトを大きくして、合っていたデータのウエイトを小さくする操作を示している。

（ウ）もしｊが所定の値以上になっているか、あるいは、評価値推定精度が十分であれば終了する。
（エ）ｊを１増大させて、（ア）に戻る。 Furthermore, the weight is determined using the following method.
Here, for data number k, the symbol representing correct/incorrect is y_k. In the third embodiment, for example,
・y_k = +1 for a correct answer
・y_k = −1 for an incorrect answer.
Further, the logistic sigmoid function is adopted as the nonlinear function σ. In this case, V_j takes a value from 0 to 1.
1. First, the data weights of all data in the teacher data table 700 shown in the example of FIG. 7 are set equal, to 1/G.
2. Let j = 1.
(a) Using the data weight of each item of teacher data, the weights of the arc evaluation value calculation module j are determined so as to minimize the weighted squared error. The determination method is the same as that described for the second embodiment, except that, whereas the weights are normally determined so as to minimize the plain squared error, here they are determined so as to minimize the squared error weighted by the data weights.
That is, whereas in the first or second embodiment the weights were determined so as to minimize equation (19), here they are determined so as to minimize equation (20). Here, t_k = (y_k + 1) / 2.

(b) Update the data weights of the teacher data.
(b-1) Let V_jk be the arc evaluation value for the k-th data item.
(b-2) Update the data weights using equation (21). This operation increases the data weight of data for which the arc evaluation value was estimated incorrectly and decreases the data weight of data for which it was estimated correctly.

(c) If j has reached a predetermined value, or if the evaluation value estimation accuracy is sufficient, the process ends.
(d) Increase j by 1 and return to (a).
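The loop above can be sketched as follows. Equations (19) to (21) are not reproduced in this text, so the sketch assumes the Gentle AdaBoost form of the update, a_k ← a_k·exp(−y_k(2V_jk − 1)), followed by renormalisation, where a_k is the weight attached to the k-th item of teacher data. To keep the example short, the per-round learner is a one-feature threshold stump rather than the network of FIG. 6; that substitution, the renormalisation, and all names are assumptions for illustration only.

```python
import math


def fit_stump(xs, ts, a):
    """Stand-in weak learner (not the network of FIG. 6): a one-feature
    threshold stump whose two output levels are the weighted means of t,
    which minimises sum_k a_k * (V_k - t_k)**2 for each threshold."""
    best = None
    for thr in sorted(set(xs)):
        levels = []
        for side in (True, False):
            idx = [k for k, x in enumerate(xs) if (x <= thr) == side]
            wsum = sum(a[k] for k in idx)
            levels.append(sum(a[k] * ts[k] for k in idx) / wsum if wsum else 0.5)
        err = sum(a[k] * ((levels[0] if xs[k] <= thr else levels[1]) - ts[k]) ** 2
                  for k in range(len(xs)))
        if best is None or err < best[0]:
            best = (err, thr, levels[0], levels[1])
    _, thr, lo, hi = best
    return lambda x: lo if x <= thr else hi


def train(xs, ys, rounds):
    """Steps 1, 2 and (a) to (d): returns the list of fitted modules."""
    g = len(xs)
    a = [1.0 / g] * g                   # step 1: equal data weights 1/G
    ts = [(y + 1) / 2 for y in ys]      # t_k = (y_k + 1) / 2
    modules = []
    for _ in range(rounds):             # j = 1, 2, ... (steps 2, (c), (d))
        f = fit_stump(xs, ts, a)        # step (a): weighted squared error
        modules.append(f)
        # step (b): assumed update a_k <- a_k * exp(-y_k * (2 V_jk - 1))
        a = [ak * math.exp(-y * (2 * f(x) - 1)) for ak, x, y in zip(a, xs, ys)]
        total = sum(a)
        a = [ak / total for ak in a]    # renormalise (an assumption)
    return modules
```

The final arc evaluation value for an input is then the sum of the outputs of the returned modules, matching the addition performed by module 520. Here the stopping rule is simply a fixed number of rounds; a check on estimation accuracy could be added at the same point.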

前記（ア）では、２乗誤差を最小としているが、第１の実施の形態の説明でも述べたように、（１０）式、（１１）式その他の評価値を最小としてもよい。その場合のウエイトのつけ方も同等であり、各ｋに対してａ_ｋを乗じればよい。具体的には（２２）式のようになる。この（２２）式を最小とするように重みを決定すればよい。

又は、（２３）式を最小とするように重みを決定すればよい。

In (a) above, the squared error is minimized, but as noted in the description of the first embodiment, equation (10), equation (11), or another evaluation value may be minimized instead. The data weights are applied in the same way: the term for each k is multiplied by a_k. Specifically, this gives equation (22), and the weights are determined so as to minimize equation (22).

Alternatively, the weight may be determined so as to minimize Equation (23).

前述では、非線形関数σとして、ロジスティックシグモイド関数を採用する。このとき、Ｖ_ｊは０〜１までの値を取るため、２Ｖ_ｊｋ−１の計算を行った。これは値域を−１〜＋１の範囲に変更するためである。この変更は単に線形変換を行っているにすぎない。他の非線形関数を用いる場合においても、単に値域を−１〜＋１の範囲にするように線形変換を行えばよい。 In the above description, a logistic sigmoid function is employed as the nonlinear function σ. At this time, since V _j takes a value from 0 to 1, 2V _jk −1 was calculated. This is to change the value range to a range of −1 to +1. This change is merely a linear transformation. Even when other nonlinear functions are used, linear transformation may be performed simply so that the range of values is in the range of −1 to +1.
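The range change described here is a plain linear map. A minimal sketch, assuming the nonlinear function has a known output range [low, high] (the function name and parameters are illustrative):

```python
def to_signed_range(v, low=0.0, high=1.0):
    """Map a value from [low, high] linearly onto [-1, +1].

    With the defaults this is exactly 2*v - 1, the transformation applied
    to the logistic sigmoid output V_jk in the text.
    """
    return 2.0 * (v - low) / (high - low) - 1.0
```

For a nonlinear function whose output already spans −1 to +1, passing `low=-1.0, high=1.0` leaves the value unchanged.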

また、前述では、各アーク評価値算出モジュール（アーク評価値算出モジュール１：５１１、アーク評価値算出モジュール２：５１２、・・・、アーク評価値算出モジュールＫ：５１Ｋ）に入力する特徴量ベクトルを同じものとしていたが、その内容を異なるものにしてもよい。
すなわち、
・アーク評価値算出モジュール１の入力を、文字セグメント外接矩形の高さのみとする。
・アーク評価値算出モジュール２の入力を、全ての特徴量とする。
・アーク評価値算出モジュール３の入力を、文字セグメント外接矩形の幅のみとする。
・ …
等としてもよい。 In the above description, the same feature vector is input to every arc evaluation value calculation module (arc evaluation value calculation module 1: 511, arc evaluation value calculation module 2: 512, ..., arc evaluation value calculation module K: 51K), but the inputs may differ from module to module.
That is, for example,
・the input of the arc evaluation value calculation module 1 is only the height of the circumscribed rectangle of the character segment,
・the input of the arc evaluation value calculation module 2 is all of the feature quantities,
・the input of the arc evaluation value calculation module 3 is only the width of the circumscribed rectangle of the character segment,
・…
and so on.
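One way to express per-module inputs is a table of feature indices. The feature order and the names below are hypothetical; only the idea of giving each module its own subset comes from the text.

```python
# Hypothetical feature order inside the full feature vector.
FEATURE_NAMES = ["height", "width", "gap"]

# One tuple of feature indices per arc evaluation value calculation module.
MODULE_INPUTS = [
    (0,),        # module 1: circumscribed-rectangle height only
    (0, 1, 2),   # module 2: all feature quantities
    (1,),        # module 3: circumscribed-rectangle width only
]


def select_inputs(features, module_index):
    """Return the sub-vector of features fed to the given module (0-based)."""
    return [features[i] for i in MODULE_INPUTS[module_index]]
```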

また、前述では、各アーク評価値算出器の構成は第２の実施の形態のアーク評価値決定モジュール１５０の構成（図４の例参照）としていたが、第１の実施の形態のアーク評価値決定モジュール１５０の構成（図２の例参照）を混在させて含んでもよいし、全てが第１の実施の形態のアーク評価値決定モジュール１５０の構成であってもよい。すなわち、
・アーク評価値算出器１が第２の実施の形態のアーク評価値決定モジュール１５０の構成
・アーク評価値算出器２が第１の実施の形態のアーク評価値決定モジュール１５０の構成
・アーク評価値算出器３が第２の実施の形態のアーク評価値決定モジュール１５０の構成
・ …
としてもよいし、
・アーク評価値算出器１が第１の実施の形態のアーク評価値決定モジュール１５０の構成
・アーク評価値算出器２が第１の実施の形態のアーク評価値決定モジュール１５０の構成
・アーク評価値算出器３が第１の実施の形態のアーク評価値決定モジュール１５０の構成
・ …
としてもよいし、
・アーク評価値算出器１が第２の実施の形態のアーク評価値決定モジュール１５０の構成
・アーク評価値算出器２が第２の実施の形態のアーク評価値決定モジュール１５０の構成
・アーク評価値算出器３が第２の実施の形態のアーク評価値決定モジュール１５０の構成
・ …
としてもよい。 In the above description, each arc evaluation value calculator has the configuration of the arc evaluation value determination module 150 of the second embodiment (see the example of FIG. 4). However, calculators having the configuration of the arc evaluation value determination module 150 of the first embodiment (see the example of FIG. 2) may be mixed in, or all of the calculators may have the configuration of the first embodiment. That is,
・arc evaluation value calculator 1 has the configuration of the arc evaluation value determination module 150 of the second embodiment,
・arc evaluation value calculator 2 has the configuration of the arc evaluation value determination module 150 of the first embodiment,
・arc evaluation value calculator 3 has the configuration of the arc evaluation value determination module 150 of the second embodiment,
・…
or
・arc evaluation value calculator 1 has the configuration of the arc evaluation value determination module 150 of the first embodiment,
・arc evaluation value calculator 2 has the configuration of the arc evaluation value determination module 150 of the first embodiment,
・arc evaluation value calculator 3 has the configuration of the arc evaluation value determination module 150 of the first embodiment,
・…
or
・arc evaluation value calculator 1 has the configuration of the arc evaluation value determination module 150 of the second embodiment,
・arc evaluation value calculator 2 has the configuration of the arc evaluation value determination module 150 of the second embodiment,
・arc evaluation value calculator 3 has the configuration of the arc evaluation value determination module 150 of the second embodiment,
・…
may be used.

＜第４の実施の形態＞
前述の実施の形態では、アーク評価値決定モジュール１５０において、アーク評価値を推定していることになる。
推定するアーク評価値の教師データとしては、例えば、そのアークが文字の正解切り出し位置に相当している場合は１として、不正解切り出し位置に相当している場合は０としていた。
その場合、以下の２通りの最適化（重み決定）となっていることになる。
・クラス０とクラス１の２クラス分類問題として、クラス分類の誤りができるだけ小さくなるように重みを決定する。
・０〜１の間に存在する推定値と、教師データ（０又は１）との２乗誤差（絶対値誤差、クロスエントロピー等の誤差を示すような評価値であってもよい）を最小化するように重みを決定する。 <Fourth embodiment>
In the above-described embodiments, the arc evaluation value determination module 150 estimates the arc evaluation value.
As the teacher data for the arc evaluation value to be estimated, for example, 1 was used when the arc corresponds to a correct character cutout position and 0 when it corresponds to an incorrect one.
In that case, the optimization (weight determination) takes one of the following two forms.
・As a two-class classification problem with class 0 and class 1, the weights are determined so that classification errors are as few as possible.
・The weights are determined so as to minimize the squared error (or another evaluation value indicating error, such as the absolute error or the cross-entropy) between the estimated value, which lies between 0 and 1, and the teacher data (0 or 1).
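The evaluation values named above can be written out as follows. This is a sketch of the standard definitions, not the exact equations of the earlier embodiments; the 0.5 classification threshold and the clipping constant `eps` are assumptions.

```python
import math


def squared_error(v, t):
    """Sum of squared differences between estimates v and teacher data t."""
    return sum((vi - ti) ** 2 for vi, ti in zip(v, t))


def absolute_error(v, t):
    """Sum of absolute differences between estimates and teacher data."""
    return sum(abs(vi - ti) for vi, ti in zip(v, t))


def cross_entropy(v, t, eps=1e-12):
    """Cross-entropy for teacher data in {0, 1}; eps guards against log(0)."""
    return -sum(ti * math.log(max(vi, eps)) + (1 - ti) * math.log(max(1 - vi, eps))
                for vi, ti in zip(v, t))


def misclassifications(v, t, threshold=0.5):
    """Two-class view: count estimates that fall on the wrong side."""
    return sum((vi > threshold) != (ti == 1) for vi, ti in zip(v, t))
```

All four measure how well each arc is scored in isolation, which is the limitation the fourth embodiment addresses.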

しかし、アーク評価値が不正確であっても、文字切り出し位置が正確であればよい。逆に、アーク評価値が正確であったとしても、文字切り出し位置が不正確ではいけない。
アーク評価値と、文字切り出し位置の正確さは、複雑な関係になっており、単調な関係ではない。図８に、アーク候補決定モジュール８１０、アーク評価値決定モジュール８２０、文字切り出し位置決定モジュール８３０の関係例を示す。
アーク候補決定モジュール８１０は、アーク評価値決定モジュール８２０と接続されている。
アーク評価値決定モジュール８２０は、アーク候補決定モジュール８１０、文字切り出し位置決定モジュール８３０と接続されている。
文字切り出し位置決定モジュール８３０は、アーク評価値決定モジュール８２０と接続されている。
文字認識の処理において、まずアーク候補決定モジュール８１０において、画像を受け付け、前述したように複数のアーク候補が抽出される。さらに、アーク評価値決定モジュール８２０において、アークの評価値が決定され、文字切り出し位置決定モジュール８３０において、複数のアーク候補の集合としての複数のパスの中から、最適なパスを選択されることによって、文字切り出し位置が確定する。なお、図１の例に示したモジュール構成と比較すると、アーク候補決定モジュール８１０は画像受付モジュール１１０〜アーク特徴量抽出モジュール１４０に該当し、アーク評価値決定モジュール８２０はアーク評価値決定モジュール１５０に該当し、文字切り出し位置決定モジュール８３０は文字切り出しモジュール１６０に該当する。
第１の実施の形態〜第３の実施の形態では、アーク評価値決定モジュール１５０におけるアーク評価値決定を、アークの中だけを参照して行っていたが、第４の実施の形態では上図全体を考えて、アーク評価値決定モジュール８２０で用いられる重みを決定する例を示す。
以下、アーク評価値決定モジュール８２０の構成は、第１の実施の形態〜第３の実施の形態の説明で述べたもののいずれかであるとする。 However, it does not matter if the arc evaluation value is inaccurate as long as the character cutout position is accurate. Conversely, an accurate arc evaluation value is of no use if the character cutout position is inaccurate.
The arc evaluation value and the accuracy of the character cutout position are related in a complicated way, and the relationship is not monotonic. FIG. 8 shows an example of the relationship among the arc candidate determination module 810, the arc evaluation value determination module 820, and the character cutout position determination module 830.
The arc candidate determination module 810 is connected to the arc evaluation value determination module 820.
The arc evaluation value determination module 820 is connected to the arc candidate determination module 810 and the character cutout position determination module 830.
The character cutout position determination module 830 is connected to the arc evaluation value determination module 820.
In the character recognition process, first, the arc candidate determination module 810 receives an image and extracts a plurality of arc candidates as described above. Next, the arc evaluation value determination module 820 determines the evaluation value of each arc, and the character cutout position determination module 830 selects the optimum path from among a plurality of paths, each of which is a set of arc candidates, thereby fixing the character cutout positions. Compared with the module configuration shown in the example of FIG. 1, the arc candidate determination module 810 corresponds to the image reception module 110 through the arc feature amount extraction module 140, the arc evaluation value determination module 820 corresponds to the arc evaluation value determination module 150, and the character cutout position determination module 830 corresponds to the character cutout module 160.
In the first to third embodiments, the arc evaluation value determination module 150 determines the arc evaluation value by referring only to the arc itself. The fourth embodiment shows an example in which the weights used in the arc evaluation value determination module 820 are determined in consideration of the whole of FIG. 8.
Hereinafter, it is assumed that the configuration of the arc evaluation value determination module 820 is any of those described in the description of the first to third embodiments.

第４の実施の形態は、第１の実施の形態〜第３の実施の形態の重みを決定する方法に関するものである。アーク評価値決定モジュール８２０の構成は第１の実施の形態〜第３の実施の形態の例で示したアーク評価値決定モジュール１５０である。
図９は、第４の実施の形態の構成例についての概念的なモジュール構成図である。
第４の実施の形態の画像処理装置は、図９の例に示すように、アーク候補決定モジュール９１０、重み変更モジュール９２０、アーク評価値決定モジュール９３０、文字切り出し位置決定モジュール９４０、切り出し位置正解個数算出モジュール９５０を有している。なお、アーク候補決定モジュール９１０は図８の例に示したアーク候補決定モジュール８１０に該当し、アーク評価値決定モジュール９３０は図８の例に示したアーク評価値決定モジュール８２０に該当し、文字切り出し位置決定モジュール９４０は図８の例に示した文字切り出し位置決定モジュール８３０に該当する。 The fourth embodiment relates to a method for determining the weights of the first to third embodiments. The configuration of the arc evaluation value determination module 820 is the arc evaluation value determination module 150 shown in the examples of the first to third embodiments.
FIG. 9 is a conceptual module configuration diagram of a configuration example according to the fourth embodiment.
As shown in the example of FIG. 9, the image processing apparatus according to the fourth embodiment includes an arc candidate determination module 910, a weight change module 920, an arc evaluation value determination module 930, a character cutout position determination module 940, and a cutout position correct answer number calculation module 950. The arc candidate determination module 910 corresponds to the arc candidate determination module 810 shown in the example of FIG. 8, the arc evaluation value determination module 930 corresponds to the arc evaluation value determination module 820 shown in the example of FIG. 8, and the character cutout position determination module 940 corresponds to the character cutout position determination module 830 shown in the example of FIG. 8.

アーク候補決定モジュール９１０は、アーク評価値決定モジュール９３０と接続されており、画像を受け付け、アーク候補を決定する。
重み変更モジュール９２０は、アーク評価値決定モジュール９３０、切り出し位置正解個数算出モジュール９５０と接続されており、切り出し位置正解個数算出モジュール９５０によって算出された切り出し位置の正解個数に基づいて、１文字分の文字切り出し位置におけるアーク評価値決定モジュール９３０で用いる重みを変更する。そして、現在の重みでの場合の正解個数から変更後の重みでの正解個数への変更量から次の重みを決定する。 The arc candidate determination module 910 is connected to the arc evaluation value determination module 930, receives an image, and determines an arc candidate.
The weight change module 920 is connected to the arc evaluation value determination module 930 and the cutout position correct answer number calculation module 950. Based on the number of correct cutout positions calculated by the cutout position correct answer number calculation module 950, it changes the weights that the arc evaluation value determination module 930 uses for the cutout position of one character. It then determines the next weights from the change between the number of correct answers with the current weights and the number of correct answers with the changed weights.

アーク評価値決定モジュール９３０は、アーク候補決定モジュール９１０、重み変更モジュール９２０、文字切り出し位置決定モジュール９４０と接続されており、アーク候補決定モジュール９１０からのアーク候補を受け取り、重み変更モジュール９２０からの重みを用いて、アーク評価値を決定する。
文字切り出し位置決定モジュール９４０は、アーク評価値決定モジュール９３０、切り出し位置正解個数算出モジュール９５０と接続されており、アーク評価値決定モジュール９３０からのアーク評価値に基づいて、画像内に存在する文字画像を切り出す位置を決定し、その決定された切り出し位置を切り出し位置正解個数算出モジュール９５０へ渡す。
切り出し位置正解個数算出モジュール９５０は、重み変更モジュール９２０、文字切り出し位置決定モジュール９４０と接続されており、文字切り出し位置決定モジュール９４０から切り出し位置と文字画像を切り出す位置の教師データを受け付け、文字切り出し位置決定モジュール９４０からの切り出し位置と教師データを比較して、切り出し位置の正解個数を算出する。 The arc evaluation value determination module 930 is connected to the arc candidate determination module 910, the weight change module 920, and the character extraction position determination module 940, receives the arc candidate from the arc candidate determination module 910, and receives the weight from the weight change module 920. Is used to determine the arc evaluation value.
The character cutout position determination module 940 is connected to the arc evaluation value determination module 930 and the cutout position correct answer number calculation module 950. Based on the arc evaluation value from the arc evaluation value determination module 930, it determines the positions at which the character images in the image are cut out, and passes the determined cutout positions to the cutout position correct answer number calculation module 950.
The cutout position correct answer number calculation module 950 is connected to the weight change module 920 and the character cutout position determination module 940. It receives the cutout positions from the character cutout position determination module 940 together with teacher data for the character cutout positions, compares the two, and calculates the number of correct cutout positions.

次に処理の流れを説明する。
まず、アーク候補決定モジュール９１０は画像を受け付け、アーク候補を決定する。
アーク評価値決定モジュール９３０が用いる初期の重みは、乱数であってもよいし、第１の実施の形態〜第３の実施の形態の説明に記載した手法で定めた重みであってもよい。いずれにせよ、重み変更モジュール９２０では、初期の重みを保持する。
次に、アーク評価値決定モジュール９３０はアーク評価値を決定する。そして、文字切り出し位置決定モジュール９４０が、そのアーク評価値を用いて、文字切り出し位置を決定する。
決定後の文字切り出し位置は、切り出し位置正解個数算出モジュール９５０に渡される。それとは別に、文字切り出し教師データが切り出し位置正解個数算出モジュール９５０に入力される。
ここで、文字切り出し結果とは、例えば、画像中の文字の外接矩形の位置、サイズと、文字コードのペアからなっている。文字切り出し教師データも同様である。
切り出し位置正解個数算出モジュール９５０では、
・教師文字：文字切り出し教師データ内に存在する、複数の文字（外接矩形の位置、サイズと、文字コードを持っている）
と、
・推定文字：文字切り出し位置決定モジュール９４０で決定した文字
との比較を行う。
教師文字と推定文字の文字切り出し位置、サイズと文字コードが一致した個数を、切り出し位置正解個数算出モジュール９５０では算出する。ここで、文字切り出し位置、サイズの一致の判定に関しては、微小なずれを許容するようにしてもよい。なお、文字コードの一致を判定せずに、文字切り出し位置、サイズの一致だけを判定してもよい。 Next, the flow of processing will be described.
First, the arc candidate determination module 910 receives an image and determines an arc candidate.
The initial weight used by the arc evaluation value determination module 930 may be a random number, or may be a weight determined by the method described in the description of the first to third embodiments. In any case, the weight change module 920 retains the initial weight.
Next, the arc evaluation value determination module 930 determines an arc evaluation value. Then, the character cutout position determination module 940 determines the character cutout position using the arc evaluation value.
The character cutout position after determination is passed to the cutout position correct number calculation module 950. Separately, character cutout teacher data is input to the cutout position correct answer number calculation module 950.
Here, a character cutout result consists of, for example, pairs of the position and size of the circumscribed rectangle of a character in the image and its character code. The same applies to the character cutout teacher data.
The cutout position correct answer number calculation module 950 compares
・teacher characters: the characters in the character cutout teacher data (each with the position and size of its circumscribed rectangle and a character code)
with
・estimated characters: the characters determined by the character cutout position determination module 940.
The cutout position correct answer number calculation module 950 counts the characters for which the cutout position, size, and character code of a teacher character and an estimated character match. In judging whether the cutout position and size match, a small deviation may be tolerated. It is also possible to judge only the match of the cutout position and size, without checking the character code.
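The comparison performed by module 950 might look like the sketch below. The `CharBox` record, the per-coordinate tolerance `tol`, and the one-to-one matching rule are assumptions chosen for illustration; the text only requires that position, size, and optionally character code agree, possibly within a small deviation.

```python
from typing import NamedTuple


class CharBox(NamedTuple):
    """Hypothetical record for one cut-out character."""
    x: int
    y: int
    width: int
    height: int
    code: str


def boxes_match(a, b, tol=0, check_code=True):
    """True if position and size agree within tol and, optionally, codes agree."""
    geometry_ok = all(abs(p - q) <= tol for p, q in zip(a[:4], b[:4]))
    return geometry_ok and (not check_code or a.code == b.code)


def count_correct(teacher, estimated, tol=0, check_code=True):
    """Count teacher characters that have a matching estimated character."""
    remaining = list(estimated)
    correct = 0
    for t in teacher:
        for e in remaining:
            if boxes_match(t, e, tol, check_code):
                remaining.remove(e)  # each estimated character is used once
                correct += 1
                break
    return correct
```

Setting `check_code=False` corresponds to judging only the cutout position and size.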

以上のように判定した正解個数が、重み変更モジュール９２０に渡される。
重み変更モジュール９２０では、正解個数と、重み変更モジュール９２０内で保持している過去の重みを用いて、次の重みを決定する。
ここで、アーク評価値決定モジュール９３０で用いる重み（すなわち全てのｗやｃ）を並べた重みベクトルをＷとする。Ｗの要素を（Ｗ_１，Ｗ_２，…）とする。
また、初期重みをＷ_０とする。次の重みをＷ_１とする。このように次々に重みを更新していく。正解個数が増加しなくなった時点や、正解個数の増加率が所定の値以下になった時点、又は、繰り返し回数が予め定められた回数となった時点で重み変更の処理を終了して、その時点の重みをアーク評価値決定モジュール９３０に出力する。 The number of correct answers determined as described above is passed to the weight change module 920.
The weight change module 920 determines the next weight by using the number of correct answers and the past weight held in the weight change module 920.
Here, W is a weight vector in which the weights (that is, all w and c) used in the arc evaluation value determination module 930 are arranged. Let the elements of W be (W_1, W_2, ...).
The initial weight is denoted W_0 and the next weight W_1, and the weights are updated one after another in this way. The weight change process ends when the number of correct answers stops increasing, when the rate of increase in the number of correct answers falls to or below a predetermined value, or when the number of repetitions reaches a predetermined number, and the weights at that time are output to the arc evaluation value determination module 930.

次に、重み変更モジュール９２０の処理の詳細を説明する。
まず、文字切り出し正解個数をＡとする。Ａは、Ｗの関数である。すなわち、Ａ（Ｗ）と記すことができる。Ａを最大化するようにＷを決定すればよい。さて、現在の重みをＷ_ｍとする。また、変更後の重みをＷ_ｍ＋１とする。
重みの更新式は、（２４）式となる。

ここでαは重み更新の速度を規定するパラメタである。∇は、（２５）式を示す演算子である。

∇Ａは、Ｗ_ｍを変更したときのＡの変化量を示す。この変化の方向にＷを動かせばＡを増大させることができる。 Next, details of the processing of the weight change module 920 will be described.
First, let A be the number of correct character cutouts. A is a function of W, so it can be written as A(W). W should be determined so as to maximize A. Let the current weight be W_m and the changed weight be W_{m+1}.
The weight update formula is equation (24).

Here, α is a parameter that defines the speed of the weight update, and ∇ is the operator given by equation (25).

∇A indicates the amount of change in A when W_m is changed. Moving W in the direction of this change increases A.

ただし、関数Ａ（Ｗ）の内容が不明であるため、∇Ａを解析的に計算することは不可能である。そこで、適当なεを定めて、（２６）式又は（２７）式として、数値演算的に∇Ａを計算する。

又は、重みの更新の別の方法として下記の方法を用いる。ランダム、網羅的、又は予め定められたアルゴリズムを用いて更新量ｄＷを設定して、（２８）式とする。

このＷ_ｍ＋１を用いて、（２９）式の関係を有しているならば、ＷをＷ_ｍ＋１に更新する。（２９）式の関係を有していなければ、更新せずに、次のｄＷを試す。

以上で、正解個数を最大化する。 However, since the content of the function A (W) is unknown, it is impossible to calculate ∇A analytically. Therefore, an appropriate ε is determined, and ∇A is calculated numerically as Equation (26) or Equation (27).
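A numerical gradient step can be sketched as follows. Equations (24) to (27) are not reproduced in this text; the sketch assumes the update W_{m+1} = W_m + α∇A(W_m), with ∇A estimated by a forward difference or a central difference of step ε. The function names are illustrative, and `a` stands for any callable that returns A(W).

```python
def numerical_gradient(a, w, eps, central=False):
    """Estimate the gradient of a(w) by finite differences with step eps."""
    grad = []
    for i in range(len(w)):
        plus = list(w)
        plus[i] += eps
        if central:
            minus = list(w)
            minus[i] -= eps
            grad.append((a(plus) - a(minus)) / (2 * eps))
        else:
            grad.append((a(plus) - a(w)) / eps)
    return grad


def update(a, w, alpha, eps):
    """One step of the assumed update rule W_{m+1} = W_m + alpha * grad A."""
    g = numerical_gradient(a, w, eps)
    return [wi + alpha * gi for wi, gi in zip(w, g)]
```

Since the number of correct answers is an integer count, A(W) changes in steps, so ε has to be large enough for the count to differ between the two evaluations; otherwise the estimated gradient is zero.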

Alternatively, the following method may be used to update the weights. An update amount dW is set randomly, exhaustively, or by a predetermined algorithm, and W_{m+1} is obtained by equation (28).

If this W_{m+1} satisfies the relationship of equation (29), W is updated to W_{m+1}. Otherwise, W is not updated and the next dW is tried.

In this way, the number of correct answers is maximized.
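The accept-if-better procedure can be sketched as below, assuming equation (28) is W_{m+1} = W_m + dW and equation (29) is A(W_{m+1}) > A(W_m), neither of which is reproduced in this text. The random choice of dW is one of the three options mentioned; the step scale, seed, and function name are illustrative.

```python
import random


def random_search(a, w, trials, scale=0.1, seed=0):
    """Try random updates dW and keep only those that increase a(w)."""
    rng = random.Random(seed)
    best = a(w)
    for _ in range(trials):
        dw = [rng.uniform(-scale, scale) for _ in w]
        candidate = [wi + di for wi, di in zip(w, dw)]  # assumed equation (28)
        score = a(candidate)
        if score > best:                                # assumed relation (29)
            w, best = candidate, score
    return w, best
```

Unlike the gradient method, this variant needs no step ε and works directly on a step-shaped A(W), at the cost of evaluating many candidates that are then discarded.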

前述の説明では、正解個数を最大化していたが、誤り個数を最小化してもよい。又は、誤り率（すなわち、誤り個数／正解個数、又は誤り個数／（誤り個数＋正解個数））を最小化してもよい。又は正解率（すなわち、正解個数／誤り個数、又は正解個数／（誤り個数＋正解個数））を最大化してもよい。つまり、正解個数又は誤り個数に基づいた値として、正解個数、誤り個数、正解率、誤り率がある。 In the above description, the number of correct answers is maximized, but the number of errors may be minimized. Alternatively, the error rate (that is, the number of errors / the number of correct answers, or the number of errors / (number of errors + number of correct answers)) may be minimized. Alternatively, the correct answer rate (that is, the number of correct answers / number of errors, or the number of correct answers / (number of errors + number of correct answers)) may be maximized. That is, as the values based on the number of correct answers or the number of errors, there are the number of correct answers, the number of errors, the correct answer rate, and the error rate.
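The four quantities can be derived from the two counts as in the sketch below. It uses the variants with the denominator (number of errors + number of correct answers); the function and key names are illustrative.

```python
def rates(correct, errors):
    """Return the count-based values that may serve as the objective."""
    total = correct + errors
    return {
        "correct_count": correct,
        "error_count": errors,
        "error_rate": errors / total if total else 0.0,
        "correct_rate": correct / total if total else 0.0,
    }
```

Maximizing `correct_count` or `correct_rate`, or minimizing `error_count` or `error_rate`, can each take the place of A(W) in the weight update.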

図１０を参照して、第１〜第４の実施の形態の画像処理装置のハードウェア構成例について説明する。図１０に示す構成は、例えばパーソナルコンピュータ（ＰＣ）などによって構成されるものであり、スキャナ等のデータ読み取り部１０１７と、プリンタなどのデータ出力部１０１８を備えたハードウェア構成例を示している。 With reference to FIG. 10, a hardware configuration example of the image processing apparatuses according to the first to fourth embodiments will be described. The configuration illustrated in FIG. 10 is configured by, for example, a personal computer (PC), and illustrates a hardware configuration example including a data reading unit 1017 such as a scanner and a data output unit 1018 such as a printer.

ＣＰＵ１００１は、前述の実施の形態において説明した各種のモジュール、すなわち、図１、図２、図４、図５、図６、図８、図９、図１７等の例に示した各モジュールの実行シーケンスを記述したコンピュータ・プログラムにしたがった処理を実行する制御部である。 The CPU 1001 is a control unit that executes processing according to a computer program describing the execution sequence of the various modules described in the above embodiments, that is, the modules shown in the examples of FIGS. 1, 2, 4, 5, 6, 8, 9, 17, and the like.

ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）１００２は、ＣＰＵ１００１が使用するプログラムや演算パラメータ等を格納する。ＲＡＭ１００３は、ＣＰＵ１００１の実行において使用するプログラムや、その実行において適宜変化するパラメータ等を格納する。これらはＣＰＵバスなどから構成されるホストバス１００４により相互に接続されている。 A ROM (Read Only Memory) 1002 stores programs used by the CPU 1001, calculation parameters, and the like. The RAM 1003 stores programs used in the execution of the CPU 1001, parameters that change as appropriate during the execution, and the like. These are connected to each other by a host bus 1004 including a CPU bus.

ホストバス１００４は、ブリッジ１００５を介して、ＰＣＩ（ＰｅｒｉｐｈｅｒａｌＣｏｍｐｏｎｅｎｔＩｎｔｅｒｃｏｎｎｅｃｔ／Ｉｎｔｅｒｆａｃｅ）バスなどの外部バス１００６に接続されている。 The host bus 1004 is connected to an external bus 1006 such as a PCI (Peripheral Component Interconnect / Interface) bus via a bridge 1005.

キーボード１００８、マウス等のポインティングデバイス１００９は、操作者により操作される入力デバイスである。ディスプレイ１０１０は、液晶表示装置又はＣＲＴ（ＣａｔｈｏｄｅＲａｙＴｕｂｅ）などがあり、各種情報をテキストやイメージ情報として表示する。 A keyboard 1008 and a pointing device 1009 such as a mouse are input devices operated by an operator. The display 1010 includes a liquid crystal display device or a CRT (Cathode Ray Tube), and displays various types of information as text or image information.

ＨＤＤ（ＨａｒｄＤｉｓｋＤｒｉｖｅ）１０１１は、ハードディスクを内蔵し、ハードディスクを駆動し、ＣＰＵ１００１によって実行するプログラムや情報を記録又は再生させる。ハードディスクには、画像、文字画像、文字切り出し位置、文字切り出し位置の候補、教師用データテーブル３００、教師用データテーブル７００などが格納される。さらに、その他の各種のデータ処理プログラム等、各種コンピュータ・プログラムが格納される。 An HDD (Hard Disk Drive) 1011 includes a hard disk, drives the hard disk, and records or reproduces a program executed by the CPU 1001 and information. The hard disk stores images, character images, character cutout positions, character cutout position candidates, a teacher data table 300, a teacher data table 700, and the like. Further, various computer programs such as various other data processing programs are stored.

ドライブ１０１２は、装着されている磁気ディスク、光ディスク、光磁気ディスク、又は半導体メモリ等のリムーバブル記録媒体１０１３に記録されているデータ又はプログラムを読み出して、そのデータ又はプログラムを、インタフェース１００７、外部バス１００６、ブリッジ１００５、及びホストバス１００４を介して接続されているＲＡＭ１００３に供給する。リムーバブル記録媒体１０１３も、ハードディスクと同様のデータ記録領域として利用可能である。 The drive 1012 reads data or a program recorded on a mounted removable recording medium 1013 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and supplies the data or program to the RAM 1003, which is connected via the interface 1007, the external bus 1006, the bridge 1005, and the host bus 1004. The removable recording medium 1013 can also be used as a data recording area in the same way as the hard disk.

接続ポート１０１４は、外部接続機器１０１５を接続するポートであり、ＵＳＢ、ＩＥＥＥ１３９４等の接続部を持つ。接続ポート１０１４は、インタフェース１００７、及び外部バス１００６、ブリッジ１００５、ホストバス１００４等を介してＣＰＵ１００１等に接続されている。通信部１０１６は、ネットワークに接続され、外部とのデータ通信処理を実行する。データ読み取り部１０１７は、例えばスキャナであり、ドキュメントの読み取り処理を実行する。データ出力部１０１８は、例えばプリンタであり、ドキュメントデータの出力処理を実行する。 The connection port 1014 is a port for connecting the external connection device 1015 and has a connection unit such as USB and IEEE1394. The connection port 1014 is connected to the CPU 1001 and the like via the interface 1007, the external bus 1006, the bridge 1005, the host bus 1004, and the like. A communication unit 1016 is connected to a network and executes data communication processing with the outside. The data reading unit 1017 is a scanner, for example, and executes document reading processing. The data output unit 1018 is a printer, for example, and executes document data output processing.

Note that the hardware configuration of the image processing apparatus shown in FIG. 10 is only one configuration example. The first to fourth embodiments are not limited to the configuration shown in FIG. 10; any configuration capable of executing the modules described in the first to fourth embodiments may be used. For example, some modules may be implemented in dedicated hardware (for example, an Application Specific Integrated Circuit (ASIC)), some modules may reside in an external system connected via a communication line, and a plurality of the systems shown in FIG. 10 may be connected to one another via communication lines so as to operate cooperatively. The apparatus may also be incorporated in a copier, a fax machine, a scanner, a printer, a multifunction machine (an image processing apparatus having two or more of the functions of a scanner, a printer, a copier, a fax machine, and the like), or the like.

The various embodiments described above may be combined (for example, a module in one embodiment may be applied to another embodiment, or modules may be exchanged between embodiments), and the techniques described in the background art may be adopted as the processing content of each module.
Although the description uses mathematical formulas, each formula includes its equivalents. Equivalents include, in addition to the formula itself, transformations of the formula that do not affect the final result, and solving the formula by an algorithmic method.
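As an illustration of such an equivalent formula, and assuming purely for illustration that the nonlinear function σ is the logistic sigmoid, the following forms give the same value for every x, so any one of them could stand in for the others:

```latex
\sigma(x) = \frac{1}{1 + e^{-x}} = \frac{e^{x}}{1 + e^{x}} = \frac{1}{2}\left(1 + \tanh\frac{x}{2}\right)
```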

The program described above may be provided stored on a recording medium, or may be provided by communication means. In that case, for example, the program described above may be regarded as an invention of a "computer-readable recording medium on which a program is recorded".
A "computer-readable recording medium on which a program is recorded" refers to a computer-readable recording medium on which a program is recorded and which is used for installing, executing, or distributing the program.
Examples of the recording medium include digital versatile discs (DVD), such as "DVD-R, DVD-RW, DVD-RAM, etc.", which are standards established by the DVD Forum, and "DVD+R, DVD+RW, etc.", which are standards established by DVD+RW; compact discs (CD), such as read-only memory (CD-ROM), CD recordable (CD-R), and CD rewritable (CD-RW); Blu-ray Disc (registered trademark); magneto-optical disks (MO); flexible disks (FD); magnetic tape; hard disks; read-only memory (ROM); electrically erasable and rewritable read-only memory (EEPROM); flash memory; and random access memory (RAM).
The program or a part of it may be recorded on the recording medium for storage, distribution, and the like. It may also be transmitted by communication using a transmission medium such as a wired network used for a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), the Internet, an intranet, an extranet, or the like, a wireless communication network, or a combination of these, and it may be carried on a carrier wave.
Furthermore, the program may be a part of another program, or may be recorded on a recording medium together with a separate program. It may also be divided and recorded across a plurality of recording media. It may be recorded in any form, such as compressed or encrypted, as long as it can be restored.

41M ... Linear weighted addition module 1-M
42M ... Nonlinear function σ_{1-M} module
51K ... Arc evaluation value calculation module K
61M ... Linear weighted addition module j-1-Mj
62M ... Nonlinear function σ_{j-1-Mj} module
110 ... Image reception module
120 ... Character string extraction module
130 ... Character boundary candidate extraction module
140 ... Arc feature amount extraction module
150 ... Arc evaluation value determination module
160 ... Character cutout module
170 ... Character recognition module
210 ... Linear weighted addition module
220 ... Nonlinear function module
411 ... Linear weighted addition module 1-1
412 ... Linear weighted addition module 1-2
421 ... Nonlinear function σ_{1-1} module
422 ... Nonlinear function σ_{1-2} module
430 ... Linear weighted addition module 2
440 ... Nonlinear function σ_{2} module
511 ... Arc evaluation value calculation module 1
512 ... Arc evaluation value calculation module 2
520 ... Arc evaluation value addition module
611 ... Linear weighted addition module j-1-1
612 ... Linear weighted addition module j-1-2
621 ... Nonlinear function σ_{j-1-1} module
622 ... Nonlinear function σ_{j-1-2} module
630 ... Linear weighted addition module j-2
640 ... Nonlinear function σ_{j-2} module
810 ... Arc candidate determination module
820 ... Arc evaluation value determination module
830 ... Character cutout position determination module
910 ... Arc candidate determination module
920 ... Weight change module
930 ... Arc evaluation value determination module
940 ... Character cutout position determination module
950 ... Cutout position correct-answer count calculation module
1710 ... Linear weighted addition module

Claims

An image processing apparatus comprising:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.
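
The combination of a weighted linear sum, a nonlinear monotonic function, and a weight change driven by the correct-answer count can be sketched as follows. This is a minimal illustration only, not the claimed implementation: the logistic sigmoid, the 0.5 acceptance threshold, the one-weight-at-a-time trial change `delta`, the step size `rate`, and all function names and sample data are assumptions introduced here.

```python
import math


def evaluation_value(features, weights):
    """Weighted linear sum of the feature amounts, squashed by a sigmoid."""
    linear_sum = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-linear_sum))


def count_correct(weights, candidates, teacher):
    """Number of candidates whose cut / no-cut decision matches the teacher data."""
    # Assumption: a candidate is accepted as a cutout position when its
    # evaluation value is at least 0.5.
    return sum(
        (evaluation_value(features, weights) >= 0.5) == is_cut
        for features, is_cut in zip(candidates, teacher)
    )


def next_weights(weights, candidates, teacher, delta=0.5, rate=1.0):
    """Move each weight in proportion to the change in the correct count."""
    base = count_correct(weights, candidates, teacher)
    updated = []
    for i, w in enumerate(weights):
        changed = list(weights)
        changed[i] = w + delta  # trial change of a single weight
        change = count_correct(changed, candidates, teacher) - base
        updated.append(w + rate * change)
    return updated


candidates = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical feature amounts
teacher = [True, False]                # hypothetical teacher data
weights = [-0.2, -0.2]
new_weights = next_weights(weights, candidates, teacher)
```

In this sketch a weight is raised when the trial change increases the correct count and lowered when it decreases it, which is one simple way to derive the next weight from the amount of change in the count.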

An image processing apparatus comprising:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
a plurality of sets of the first calculating means and the second calculating means;
third calculating means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculating means;
fourth calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the third calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the evaluation value calculated by the fourth calculating means;
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means or the third calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.
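
The two-stage arrangement, in which several first/second calculation pairs feed a third weighted sum and a fourth nonlinear function, can be sketched as below. This is a hedged illustration under stated assumptions: the logistic sigmoid is used for both nonlinear functions, and the names `two_stage_evaluation`, `first_stage_weights`, and `third_stage_weights` are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, assumed here as the nonlinear monotonic function."""
    return 1.0 / (1.0 + math.exp(-x))


def two_stage_evaluation(features, first_stage_weights, third_stage_weights):
    """Evaluation value of one candidate position from two stacked stages."""
    # One (weighted sum, sigmoid) pair per row of first_stage_weights.
    second_outputs = [
        sigmoid(sum(w * f for w, f in zip(row, features)))
        for row in first_stage_weights
    ]
    # Weighted sum of those evaluation values, then a final sigmoid.
    third_output = sum(v * s for v, s in zip(third_stage_weights, second_outputs))
    return sigmoid(third_output)


# Hypothetical values: two feature amounts and two first-stage pairs.
value = two_stage_evaluation(
    [0.0, 0.0],
    [[0.3, -0.7], [1.2, 0.4]],
    [1.0, -1.0],
)
```

With all feature amounts at zero, each first-stage pair outputs 0.5, the opposite third-stage weights cancel, and the final value is 0.5.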

An image processing apparatus comprising:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
a plurality of sets of the first calculating means and the second calculating means;
fifth calculating means for calculating the sum of the evaluation values calculated by the plurality of second calculating means,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the sum of the evaluation values calculated by the fifth calculating means;
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.
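
Summing the evaluation values of several first/second calculation pairs and choosing a cutout position from that sum might look like the following. This is only a sketch under assumptions: every pair is given the same feature amounts, the sigmoid is the nonlinear function, and the candidate with the largest sum is chosen. None of these details is confirmed by the text, and all names and data are hypothetical.

```python
import math


def pair_evaluation(features, weights):
    """One first/second calculation pair: weighted sum, then sigmoid."""
    linear_sum = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-linear_sum))


def summed_evaluation(features, weight_sets):
    """Sum of the evaluation values produced by all pairs."""
    return sum(pair_evaluation(features, weights) for weights in weight_sets)


def best_candidate(candidates, weight_sets):
    """Index of the candidate position with the largest summed evaluation."""
    return max(
        range(len(candidates)),
        key=lambda i: summed_evaluation(candidates[i], weight_sets),
    )


candidates = [[0.0, 0.0], [1.0, 1.0]]   # hypothetical feature amounts
weight_sets = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights, one row per pair
chosen = best_candidate(candidates, weight_sets)
```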

An image processing apparatus comprising:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
a plurality of sets of the first calculating means and the second calculating means;
third calculating means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculating means; and
fourth calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the third calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the evaluation value calculated by the fourth calculating means,
the apparatus having a plurality of sets of the first calculating means and the second calculating means, and a plurality of sets of the third calculating means and the fourth calculating means, and further comprising:
sixth calculating means for calculating the sum of the evaluation values calculated by the plurality of fourth calculating means,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the sum of the evaluation values calculated by the sixth calculating means;
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means or the third calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.

An image processing program causing a computer to function as:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.

An image processing program causing a computer to function as:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
a plurality of sets of the first calculating means and the second calculating means;
third calculating means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculating means; and
fourth calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the third calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the evaluation value calculated by the fourth calculating means,
and further causing the computer to function as:
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means or the third calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.

An image processing program causing a computer to function as:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
a plurality of sets of the first calculating means and the second calculating means; and
fifth calculating means for calculating the sum of the evaluation values calculated by the plurality of second calculating means,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the sum of the evaluation values calculated by the fifth calculating means,
and further causing the computer to function as:
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.

An image processing program causing a computer to function as:
first calculating means for calculating a weighted linear sum of a plurality of feature amounts relating to a candidate position for cutting out one character image present in an image;
second calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the first calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function;
cutout position determining means for determining a position at which to cut out a character image present in the image, based on the evaluation value calculated by the second calculating means;
a plurality of sets of the first calculating means and the second calculating means;
third calculating means for calculating a weighted linear sum of the evaluation values calculated by the plurality of second calculating means; and
fourth calculating means for calculating an evaluation value of the candidate position for cutting out the one character image by applying, to the calculation result of the third calculating means as an argument, a nonlinear monotonic function that converges to a predetermined value when the argument takes a limit value, or whose output slope decreases in absolute value as the distance between the argument and a predetermined value increases, or a function approximating the nonlinear monotonic function,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the evaluation value calculated by the fourth calculating means,
the program providing a plurality of sets of the first calculating means and the second calculating means, and a plurality of sets of the third calculating means and the fourth calculating means, and further causing the computer to function as:
sixth calculating means for calculating the sum of the evaluation values calculated by the plurality of fourth calculating means,
the cutout position determining means determining the position at which to cut out a character image present in the image based on the sum of the evaluation values calculated by the sixth calculating means,
and further causing the computer to function as:
accepting means for accepting teacher data on the cutout positions of character images;
number calculating means for comparing the cutout position determined by the cutout position determining means with the teacher data accepted by the accepting means, and calculating the number of correct answers or the number of errors for the cutout positions; and
weight changing means for changing the weight used by the first calculating means or the third calculating means at the character cutout position for one character, based on the number of correct answers or the number of errors calculated by the number calculating means,
wherein the weight changing means determines the next weight from the amount of change from a value based on the number of correct answers or the number of errors under the current weight to a value based on the number of correct answers or the number of errors under a changed weight.