JP5211449B2 - Program, apparatus and method for adjusting recognition distance, and program for recognizing character string

Info

Publication number: JP5211449B2
Application number: JP2006218712A
Authority: JP
Inventors: 俊孫; 悦伸堀田; 裕勝山; 聡直井
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2005-08-18
Filing date: 2006-08-10
Publication date: 2013-06-12
Anticipated expiration: 2026-08-10
Also published as: CN1916938A; CN100430958C; JP2007052782A

Description

The present invention relates to a character recognition program, apparatus, and method, and more particularly to a character string recognition program, apparatus, and method for recognizing characters in a degraded character string.

As digital cameras and digital video cameras become widespread for capturing document images, the recognition of degraded character strings is attracting increasing attention. Degraded character string recognition involves single-character recognition and character string segmentation, which are closely related to each other.

In character string segmentation, recognition-based segmentation is the most widely used strategy. FIG. 1 illustrates the principle of conventional recognition-based segmentation. First, the input image is binarized, and the connected components of the binarized image are then analyzed to find the strokes of each character (top row of FIG. 1). For connected component analysis of images, see Non-Patent Document 1 below. Every connected component is regarded as a basic segment character (middle row of FIG. 1). A combination of connected components is regarded as a composite segment character (bottom row of FIG. 1). In the present invention, both basic segment characters and composite segment characters are referred to as characters, even if they are merely meaningless connected components or parts of characters. Character recognition is then performed on every basic segment character and every composite segment character, yielding a recognition distance for each. The character string can be decomposed into multiple segmentation paths, each consisting of various basic and composite segment characters. The recognition distance of a segmentation path is the sum of the recognition distances of the basic and composite segment characters that make up the path. The segmentation result for the character string is determined by the segmentation path with the minimum total recognition distance. The recognition result of each basic or composite segment character on that path is the final recognition result for that character.

As shown in FIG. 1, the segmentation path combining "ハ", "リ", and "を" has the minimum total recognition distance, 72. These characters are therefore output as the final segmentation and recognition result.

As FIG. 1 shows, the recognition distance values are critical not only for the individual recognition results but also for correct segmentation. For example, in FIG. 1 the minimum recognition distance of "ハ" is 21, while its left stroke and right stroke have recognition distances of 19 and 26, respectively. If the sum of the distances of these two strokes were smaller than 21, "ハ" would be incorrectly segmented into two parts, the left stroke and the right stroke in FIG. 1, even though the initial recognition result for "ハ" is correct.
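The minimum-total-distance path selection described above can be sketched as a small dynamic program. This is only an illustrative sketch, not the procedure of any cited document: the function name `best_segmentation` and the span-keyed distance table are assumptions, and only the values 21, 19, and 26 come from the FIG. 1 discussion.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (i, j): candidate character built from components i..j-1


def best_segmentation(n: int, dist: Dict[Span, float]) -> Tuple[float, List[Span]]:
    """Return (total distance, path) for the cheapest segmentation path.

    n is the number of connected components, numbered left to right.
    dist maps a span of consecutive components to the recognition distance
    of the basic or composite segment character formed by that span.
    """
    inf = float("inf")
    best = [inf] * (n + 1)   # best[j]: minimum total distance covering components 0..j-1
    back = [-1] * (n + 1)    # back[j]: start index of the last span on the best path
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(j):
            d = dist.get((i, j))
            if d is not None and best[i] + d < best[j]:
                best[j] = best[i] + d
                back[j] = i
    # Walk the back-pointers to recover the chosen spans.
    path: List[Span] = []
    j = n
    while j > 0:
        i = back[j]
        if i < 0:
            raise ValueError("no segmentation path covers all components")
        path.append((i, j))
        j = i
    path.reverse()
    return best[n], path


# Two strokes of "ハ" with the distances quoted for FIG. 1.
fig1_like = {(0, 1): 19, (1, 2): 26, (0, 2): 21}
total, path = best_segmentation(2, fig1_like)

# Hypothetical stroke distances whose sum falls below 21: the character is split.
split_total, split_path = best_segmentation(2, {(0, 1): 9, (1, 2): 10, (0, 2): 21})
```

With the FIG. 1 values the composite span `(0, 2)` wins because 21 < 19 + 26. The second call uses made-up stroke distances to show the failure mode the text warns about.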

Many papers and patents on text segmentation have already been published, such as Patent Documents 1, 2, and 3 and Non-Patent Documents 1 and 2 below.

Most of these papers and patents deal with touching characters, and in most cases the objects being processed are binary character images.

For a degraded character string image, conventional binarization methods usually cause severe stroke breakage (loss of stroke pixels) and touching between character strokes, so recognition performance is insufficient. In contrast, the dual eigenspace-based method is very effective for degraded character recognition; it extracts character features directly from the grayscale character image. FIG. 2 is a flowchart of character recognition using the dual eigenspace-based method. The input is a normalized character image. First, features of the character image are extracted using a first dictionary (dictionary 1 in FIG. 2). Next, a second dictionary (dictionary 2 in FIG. 2) is used to coarsely classify the character image into M category candidates. Finally, a third dictionary (dictionary 3 in FIG. 2) is used to finely classify the input character features into one of the M category candidates. The system then outputs the recognized character code and its recognition distance.

The dual eigenspace-based method does not need binarization for recognition: it operates directly on the grayscale image, and the binarized result is used only for rough segmentation. Because it extracts features directly from the grayscale image without binarization, it is more robust to image noise caused by degradation. However, several problems arise when the dual eigenspace-based method is applied directly to recognition-based segmentation.

FIG. 3 schematically illustrates a drawback of the conventional character recognition method. In FIG. 3, the first row is the character string image. The second row is the binarization result, which is used for rough segmentation; the dotted boxes are the result of this rough segmentation. The third row shows the normalized grayscale images of the basic segment characters. Below each segment image are the recognition result and the corresponding recognition distance. The fourth row shows the normalized grayscale character images of the composite segment characters "年" and "開", together with their recognition results and recognition distances. If the conventional recognition-based segmentation method is used, "開" is split into four segments: the sum of the recognition distances of the four basic segment characters is 5.39 + 61.01 + 45.69 + 20.37 = 132.46, whereas the recognition distance of "開" is 409.71, which is larger than that sum. As a result, the entire character string is erroneously recognized as "年１回１！１１く".
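The arithmetic behind the FIG. 3 failure can be checked with a few lines. The variable names are illustrative; the numbers are the ones quoted in the paragraph above.

```python
# Recognition distances of the four basic segments of "開" quoted for FIG. 3.
segment_distances = [5.39, 61.01, 45.69, 20.37]
# Recognition distance of the composite character "開".
composite_distance = 409.71

split_total = round(sum(segment_distances), 2)

# The path with the smaller total wins, so the split is (wrongly) preferred.
prefers_split = split_total < composite_distance
```

Because the unadjusted distance of the complex character is about three times the sum of its simple-looking parts, a minimum-distance rule picks the split.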

To date, no patent document or paper deals with the segmentation of degraded character strings.

Patent Document 1: US Pat. No. 6,327,385
Patent Document 2: US Pat. No. 5,692,069
Patent Document 3: US Pat. No. 5,172,422
Non-Patent Document 1: R. C. Gonzalez, edited by Quiqi RUAN, Yuzhi RUAN et al., "Digital Image Processing, Second Edition", p. 435
Non-Patent Document 2: Y. Lu, "Machine Printed Character Segmentation - An Overview", Pattern Recognition, Vol. 28, No. 1, pp. 67-80, January 1995
Non-Patent Document 3: S. W. Lee, D. J. Lee, H. S. Park, "A New Methodology for Gray-Scale Character Segmentation and Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 10, pp. 1045-1050, October 1996

The present invention is proposed in view of the above problems in the prior art. An object of the present invention is to adjust the initial recognition distance using character structure features, so as to solve the problems that arise when segmentation is performed using the dual eigenspace method, thereby making the recognition distance better suited to segmentation.

According to one aspect of the present invention, there is provided a method for adjusting an initial recognition distance of a candidate character, the method including a structural feature value calculation step of calculating a structural feature value of one or more training samples of the candidate character, and an adjustment step of adjusting the initial recognition distance based on the structural feature value calculated in the structural feature value calculation step.

According to another aspect of the present invention, there is provided an apparatus for adjusting an initial recognition distance of a candidate character, the apparatus including a structural feature value calculation unit that calculates a structural feature value of one or more training samples of the candidate character, and an adjustment unit that adjusts the initial recognition distance based on the structural feature value calculated by the structural feature value calculation unit.

According to another aspect of the present invention, there is provided a method for recognizing a character string, the method including a feature extraction step, a coarse classification step, a feature recognition step, and a detailed classification step, the initial recognition distance of a candidate character being output by the detailed classification step, and the method further including a structural feature value calculation step of calculating a structural feature value of one or more training samples of the candidate character, and an adjustment step of adjusting the initial recognition distance based on the structural feature value calculated in the structural feature value calculation step.

According to another aspect of the present invention, there is provided an apparatus for recognizing a character string, the apparatus including a feature extraction unit, a coarse classification unit, a feature recognition unit, and a detailed classification unit, the initial recognition distance of a candidate character being output by the detailed classification unit, and the apparatus further including a structural feature value calculation unit that calculates a structural feature value of one or more training samples of the candidate character, and an adjustment unit that adjusts the initial recognition distance based on the structural feature value calculated by the structural feature value calculation unit.

Preferably, the structural feature value is a shape feature value, and the adjustment step adjusts the initial recognition distance by a multiplication operation or by an operation that includes multiplication.

Preferably, the shape feature value is the density of character stroke pixels of the training sample of the candidate character, the density of character strokes of the training sample of the candidate character, or the average number of stroke segments per row and column of the character strokes of the training sample of the candidate character.

Preferably, the density of character stroke pixels of the training sample of the candidate character is the ratio of the area of the minimum square bounding box of the training sample to the number of character stroke pixels of the training sample, and the density of character strokes of the training sample of the candidate character is the ratio of the area of the minimum square bounding box of the training sample to the n-th power of the number of character strokes of the training sample, where n is a positive integer.

Preferably, the one or more training samples of the candidate character are obtained from the character code of the candidate character.

Preferably, the operation that includes multiplication multiplies the initial recognition distance by the logarithm of the structural feature value calculated in the structural feature value calculation step.

Preferably, the structural feature value changes with character structure in the same direction as the recognition distance does, and the adjustment step (unit) adjusts the initial recognition distance by a division operation or by an operation that includes division.

The present invention solves a problem left unsolved in the prior art, enables correct segmentation and recognition of character strings, and achieves a significant technical effect.

The accompanying drawings, which are included to provide a further understanding of the invention, serve together with the following description to explain the principles of the invention.

Because the present invention adjusts the initial recognition distance using character structure features in order to solve the problems that arise when segmentation is performed using the dual eigenspace method, it has the effect of enabling correct segmentation and recognition of character strings.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. These embodiments are merely illustrative and do not limit the scope of protection of the present invention.

FIG. 4 is a flowchart showing an embodiment of the present invention. For an input normalized character image 401, a feature extraction unit 402 extracts the features of the image using a first dictionary 403.

Here, X = [x_1, x_2, ..., x_{w*h}]^T represents a normalized character image whose width and height are w and h, respectively. Equation (2) is the mean of the normalized character images of all training samples. U = [u_1, u_2, ..., u_n] is the transformation matrix, where each u_i = [u_{i1}, u_{i2}, ..., u_{i,w*h}]^T is a one-dimensional column vector that is a component of the matrix U. The meaning of u_i and how to obtain it can be understood from the reference cited below. u_{i1} is the first component of the vector u_i, and so on, and n is the number of dimensions. U and Equation (3) constitute the first dictionary 403. The feature extraction method of Equation (1) is called principal component analysis (PCA). For details of PCA, see R. O. Duda, P. E. Hart, and D. G. Stork, "Pattern Classification, Second Edition", A Wiley-Interscience Publication, John Wiley & Sons, 2001, pp. 115-117 and 568-569.
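Equation (1) itself is not reproduced in this text, so the following is only a sketch under the assumption that it has the standard PCA projection form, Y = U^T (X - mean), with the symbols defined above. The function name `extract_features` and the toy numbers are hypothetical.

```python
from typing import List, Sequence


def extract_features(
    x: Sequence[float],
    mean: Sequence[float],
    basis: Sequence[Sequence[float]],
) -> List[float]:
    """Project a flattened image onto PCA basis vectors.

    x     : flattened normalized character image, length w*h
    mean  : mean training image, length w*h
    basis : the n column vectors u_1..u_n of U, each of length w*h
    Assumes the standard PCA form Y = U^T (X - mean); the patent's own
    Equation (1) is not visible in this text.
    """
    if len(x) != len(mean) or any(len(u) != len(x) for u in basis):
        raise ValueError("image, mean, and basis vectors must have equal length")
    centered = [xi - mi for xi, mi in zip(x, mean)]
    # One dot product per basis vector gives one feature component.
    return [sum(uk * ck for uk, ck in zip(u, centered)) for u in basis]


# Toy 2x2 image (w*h = 4) projected onto n = 2 made-up basis vectors.
y = extract_features(
    x=[3.0, 4.0, 5.0, 6.0],
    mean=[1.0, 1.0, 1.0, 1.0],
    basis=[[1.0, 0.0, 0.0, 0.0], [0.0, 0.5, 0.5, 0.0]],
)
```

In practice the basis vectors would come from the first dictionary rather than being written by hand.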

In the coarse classification unit 404, the extracted character feature Y is compared with the features of every character category stored in a second dictionary 405. There are many ways to compare features; one is based on the Euclidean distance D_i = |Y - Y_i|, where D_i is the Euclidean distance of the feature Y from the feature Y_i of the i-th character category. The coarse classification unit outputs M character category candidates: the first M character categories with the smallest Euclidean distances are selected as the output of coarse classification.

The feature reconstruction unit 406 uses a third dictionary 407 to reconstruct M features corresponding to the M category candidates. The third dictionary 407 stores the transformation matrix of Equation (4) and the mean feature vector C_i of every character category. According to Equation (5), the reconstructed feature vector of Equation (6) is obtained.
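The coarse classification step, ranking categories by Euclidean distance and keeping the M nearest, might be sketched as follows. The function name `coarse_classify` and the dictionary layout are assumptions made for illustration; the feature values are invented.

```python
import math
from typing import Dict, List, Sequence, Tuple


def coarse_classify(
    y: Sequence[float],
    category_features: Dict[str, Sequence[float]],
    m: int,
) -> List[Tuple[str, float]]:
    """Return the m categories nearest to y as (character code, distance) pairs.

    category_features stands in for the second dictionary: it maps each
    character code to that category's feature vector Y_i.
    """
    scored = [(code, math.dist(y, feat)) for code, feat in category_features.items()]
    scored.sort(key=lambda pair: pair[1])  # smallest Euclidean distance first
    return scored[:m]


# Toy dictionary with three categories and two-dimensional features.
candidates = coarse_classify(
    y=[0.0, 0.0],
    category_features={"a": [3.0, 4.0], "b": [1.0, 0.0], "c": [0.0, 2.0]},
    m=2,
)
```

The M survivors are then passed on to reconstruction and detailed classification; those later steps depend on equations not reproduced here and are not sketched.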

Equation (7) is a matrix obtained by a certain operation; for details of this operation, see the paper below. Equation (8) is the transpose of Equation (7). For the reconstruction, see the same paper.

J. Sun, Y. Hotta, Y. Katsuyama, S. Naoi, "Low resolution character recognition by dual eigenspace and synthetic degraded patterns", ACM 1st Hardcopy Document Processing Workshop, pp. 15-22, 2004.

The detailed classification unit 408 in FIG. 4 calculates the distances between the original feature Y and the M reconstructed features of Equation (6). The character category with the smallest distance is selected as the detailed classification result. The code of the corresponding character category (the candidate character) is output as the recognized character code 409, and the smallest distance is output as the initial recognition distance 410. Note that in the conventional method shown in FIG. 2, 410 is treated as the final recognition distance, which can cause segmentation errors.

In this embodiment, a character shape feature value calculation unit 411 calculates the shape feature of the recognized character. Put simply, the character shape feature is one kind of character structure feature and can be regarded as an expression of the complexity of the character's strokes. The more complex the strokes, the smaller the shape feature value; the simpler the strokes, the larger the shape feature value. The inputs to the character shape feature value calculation unit are the recognized character code and the binary character images corresponding to it (a training sample corresponding to a candidate character is also called a training sample of the candidate character). The binary character images are obtained from the recognized character code 409 in FIG. 4. Because the binary character images and the character code corresponding to each image category are stored in advance on a storage medium such as a hard disk, all character images corresponding to a given code can be looked up from that code, and vice versa. When multiple binary images are selected, the value of the character shape feature is the average or the weighted average of the shape feature values Wg of all those binary images. When the weighted average is used, the weight of each binary image is stored on the storage medium together with the image. In this embodiment, the density of character stroke pixels is taken as an example for calculating the character shape feature. Specifically, the character shape feature is calculated by the following Equation (9).

Wg = s * s / (number of character stroke pixels) ... (9)

Here, s is the side length of the minimum square bounding box of the binary character image. The pixels that represent character strokes in the binary image are called character stroke pixels. The value of every pixel is obtained by scanning the image from top to bottom and from left to right, to determine whether the pixel is a character stroke pixel. Each pixel in a binary image takes one of only two values, 0 or 1. The value 0 corresponds to the background and the value 1 corresponds to a stroke, so the number of stroke pixels is obtained by counting the pixels with value 1. The minimum square bounding box is found by searching for the leftmost, rightmost, topmost, and bottommost positions of the character stroke pixels. Let these four values be xs, xe, ys, and ye, respectively. They uniquely determine the rectangular bounding box of the character stroke image, which has width w = xe - xs + 1 and height h = ye - ys + 1. The side of the minimum square bounding box is the larger of this width and height. If w > h, the minimum square bounding box is obtained by extending the rectangular bounding box by (w - h) / 2 pixels upward and downward in the height direction. If h > w, it is obtained by extending the rectangular bounding box by (h - w) / 2 pixels to the left and to the right in the width direction.
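Equation (9) and the bounding-box procedure described above can be sketched as follows. The function names and the list-of-lists image representation are assumptions for illustration; the computation follows the text (side s = max(w, h), stroke pixels are the pixels with value 1).

```python
from typing import List, Sequence


def min_square_side(image: Sequence[Sequence[int]]) -> int:
    """Side length s of the minimum square bounding box of the stroke pixels."""
    ys = [y for y, row in enumerate(image) for v in row if v == 1]
    xs = [x for row in image for x, v in enumerate(row) if v == 1]
    if not xs:
        raise ValueError("image contains no stroke pixels")
    w = max(xs) - min(xs) + 1  # w = xe - xs + 1
    h = max(ys) - min(ys) + 1  # h = ye - ys + 1
    return max(w, h)


def shape_feature_pixel_density(image: Sequence[Sequence[int]]) -> float:
    """Wg = s * s / (number of character stroke pixels), as in Equation (9)."""
    s = min_square_side(image)
    stroke_pixels = sum(1 for row in image for v in row if v == 1)
    return s * s / stroke_pixels


# A thin vertical bar (simple structure) versus a filled block (dense strokes).
bar: List[List[int]] = [[0, 0, 1, 0, 0] for _ in range(5)]
block: List[List[int]] = [[1, 1, 1, 1] for _ in range(4)]

wg_bar = shape_feature_pixel_density(bar)
wg_block = shape_feature_pixel_density(block)
```

The sparse bar gets the larger Wg and the dense block the smaller one, which matches the stated behavior that simple strokes give large shape feature values. Averaging Wg over several training samples, as the text describes, would be a straightforward extension.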

The above example describes the calculation of the character shape feature using the low density of stroke pixels. However, the present invention is not limited to this. Other features that can distinguish characters with a complex structure from characters with a simple structure, such as the low density of strokes, may also be used. Specifically, the character shape feature may be calculated using the following Equation (10).

Wg = s * s / (number of character strokes)^n ... (10)

Here, n is an integer greater than 1, determined empirically, and is preferably between 4 and 10.
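Equation (10) can be sketched in the same style. How the number of strokes is counted is not specified here, so this hypothetical helper takes the stroke count as an input; the default n = 4 is simply the lower end of the preferred range.

```python
def shape_feature_stroke_density(side: int, stroke_count: int, n: int = 4) -> float:
    """Wg = s * s / (number of character strokes) ** n, as in Equation (10).

    side         : side length s of the minimum square bounding box
    stroke_count : number of strokes in the character (supplied by the caller)
    n            : integer exponent greater than 1
    """
    if stroke_count <= 0:
        raise ValueError("stroke_count must be positive")
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    return side * side / stroke_count ** n


# Same box size, different stroke counts: more strokes give a much smaller Wg.
wg_one_stroke = shape_feature_stroke_density(side=32, stroke_count=1)
wg_two_strokes = shape_feature_stroke_density(side=32, stroke_count=2)
```

Raising the stroke count to a power makes the feature fall off steeply as the character gets more complex.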

The reason for using the character shape feature is that, under the same degree of degradation, characters with a simple stroke structure such as "１" and "く" have very small recognition distances, whereas characters with a complex stroke structure such as "楽" and "開" have very large recognition distances. This phenomenon can be seen in FIG. 3. As FIG. 5 shows, the character shape feature value is small for characters with a complex structure and large for characters with a simple structure. The character shape feature can therefore be used to compensate for the effect that different character structures have on the recognition distance. The present invention is not limited to the use of shape features. Other structural features may be used as long as they can compensate for the effect of character structure on the recognition distance. In the present invention, the character shape feature should therefore be understood in a broad sense, namely as a structural feature whose value changes with character structure in the direction opposite to that of the recognition distance. For example, the structural feature may be the average number of stroke segments per row and column, per row only, or per column only. Every row of the character image is first scanned to count its stroke segments, and every column is then scanned in the same way; the resulting average indicates the complexity of the character strokes. Specifically, the stroke segments in one row of the image are counted as follows. The row is scanned from left to right, and the first pixel with value 1 is recorded as the left end of a stroke segment. Scanning continues, and the pixel at which the value changes from 1 to 0 is recorded as the right end of the segment; this left end and right end correspond to one stroke segment. The scan then looks for the next pixel at which the value changes from 0 to 1, which is the left end of the second stroke segment, and the pixel at which the value changes from 1 to 0, which is its right end, and so on until the scan of the row is finished. The number of stroke segments in a column is obtained in the same way.
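The row and column scan described above amounts to counting runs of 1s along each line. A minimal sketch follows, with hypothetical function names and with rows and columns averaged together (one of the options the text allows).

```python
from typing import Sequence


def count_stroke_segments(line: Sequence[int]) -> int:
    """Count runs of consecutive 1s in one row or column of a binary image."""
    segments = 0
    previous = 0
    for value in line:
        if value == 1 and previous == 0:  # a 0 -> 1 transition opens a new segment
            segments += 1
        previous = value
    return segments


def average_stroke_segments(image: Sequence[Sequence[int]]) -> float:
    """Average number of stroke segments over all rows and all columns."""
    row_counts = [count_stroke_segments(row) for row in image]
    col_counts = [count_stroke_segments(col) for col in zip(*image)]
    counts = row_counts + col_counts
    return sum(counts) / len(counts)


sample = [
    [1, 0, 1],
    [0, 0, 0],
    [1, 1, 1],
]
avg = average_stroke_segments(sample)
```

A segment that touches the right or bottom edge of the image is still counted, because only the 0-to-1 transition is needed to detect it.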

The recognition distance adjustment unit 413 adjusts the initial recognition distance using the character shape feature. Specifically, the adjustment uses R1 = Wg * R, where R1 is the final recognition distance 414 in FIG. 4 and R is the initial recognition distance 410 in FIG. 4. Besides plain multiplication, an operation that includes multiplication may also be used (for example, multiplying the initial recognition distance by the logarithm of the shape feature value). For two different character images with the same degree of degradation, the initial recognition distance R is relatively large for the image with the complex structure and relatively small for the image with the simple structure, which makes segmentation difficult. Wg and R change in opposite directions with character structure: Wg is relatively large for characters with a simple structure and small for characters with a complex structure. The sensitivity of R to structure can therefore be removed by an operation such as multiplication.
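The adjustment R1 = Wg * R and its logarithmic variant can be sketched as below. The function name and the `use_log` flag are hypothetical, and the natural logarithm is an assumption, since the base of the logarithm is not stated.

```python
import math


def adjust_recognition_distance(initial_distance: float, wg: float, use_log: bool = False) -> float:
    """Return the final recognition distance R1 from the initial distance R.

    use_log=False : R1 = Wg * R
    use_log=True  : R1 = log(Wg) * R  (natural log assumed; base not given in the text)
    """
    if wg <= 0:
        raise ValueError("shape feature value must be positive")
    factor = math.log(wg) if use_log else wg
    return factor * initial_distance


# Made-up values: a simple character (large Wg, small R) and a complex one (small Wg, large R).
simple_r1 = adjust_recognition_distance(initial_distance=20.0, wg=6.0)
complex_r1 = adjust_recognition_distance(initial_distance=400.0, wg=1.2)
```

With these invented inputs the adjusted distances come out at 120 and 480, a ratio of 4 instead of the original 20, which illustrates how the multiplication narrows the gap caused by structure alone. Note that log(Wg) is zero when Wg equals 1, so a logarithmic variant would need some care for very dense characters; how that case is handled is not described.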

In the above embodiment, a shape feature is used to adjust and compensate the initial recognition distance. In other embodiments, however, other structural features may be used whose value changes with character structure in the same direction as the initial recognition distance does; that is, the value is large for characters with a complex structure and small for characters with a simple structure. In that case, a division operation or an operation that includes division may be used to adjust the initial recognition distance. In short, any operation that can remove the sensitivity of R to structure may be used. It will be apparent to those skilled in the art that various modifications can be conceived using the teachings of the present invention. For brevity, they are not described in detail in this application.

FIG. 6 shows the final recognition distances for the corresponding character images of FIG. 3. After adjustment, the final recognition distance of the character 「開」 is 471, whereas the sum of the recognition distances of its four components is 678. Therefore, 「開」 is correctly segmented as a single character.
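The comparison in this example amounts to choosing the segmentation path with the smallest total final recognition distance. A minimal sketch, using only the two totals quoted above (the function name and the dictionary keys are illustrative assumptions):

```python
def best_path(path_totals: dict[str, float]) -> str:
    """Return the name of the segmentation path with the smallest total distance."""
    return min(path_totals, key=path_totals.get)


totals = {
    "single character": 471,  # final recognition distance of the whole character
    "four components": 678,   # sum of the four components' recognition distances
}
chosen = best_path(totals)
```

Because 471 is smaller than 678, the path that treats the four components as one character is selected.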

Note that although the examples in this description are in Japanese, the present invention is not limited to Japanese. The principle of the present invention can also be applied to other languages such as Chinese and Korean. Furthermore, in the above embodiment the character string contains a plurality of characters; however, the character string may contain only a single character, and the present invention is applicable in that case as well.

Those skilled in the art should understand that various modifications and variations can be conceived without departing from the spirit and scope of the present invention. Accordingly, the scope of the present invention should be construed as broadly as possible, and it encompasses such modifications and variations insofar as they fall within the appended claims and their equivalents.

(Supplementary note 1) A program for adjusting an initial recognition distance of a candidate character, the program causing a computer to execute:
a structural feature value calculation procedure for calculating a structural feature value of one or more training samples of the candidate character; and
an adjustment procedure for adjusting the initial recognition distance based on the structural feature value calculated by the structural feature value calculation procedure.

(Supplementary note 2) The program for adjusting an initial recognition distance of a candidate character according to supplementary note 1, wherein the structural feature value is a shape feature value, and the adjustment procedure adjusts the initial recognition distance by multiplication or by an operation that includes multiplication.

(Supplementary note 3) The program for adjusting an initial recognition distance of a candidate character according to supplementary note 2, wherein the shape feature value is the density of character stroke pixels of the training sample of the candidate character, the density of character strokes of the training sample of the candidate character, or the average number of stroke segments per row/column of the character strokes of the training sample of the candidate character.

(Supplementary note 4) The program for adjusting an initial recognition distance of a candidate character according to supplementary note 3, wherein the density of the character stroke pixels of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the number of character stroke pixels of the training sample of the candidate character, the low density value of the character strokes of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the n-th power of the number of character strokes of the training sample of the candidate character, and n is a positive integer.

(Supplementary note 5) The program for adjusting an initial recognition distance of a candidate character according to supplementary note 1, wherein the one or more training samples of the candidate character are obtained by means of the character code of the candidate character.

(Supplementary note 6) The program for adjusting an initial recognition distance of a candidate character according to supplementary note 2, wherein the operation includes multiplying the initial recognition distance by the logarithm of the structural feature value obtained by the structural feature calculation procedure.

(Supplementary note 7) The program for adjusting an initial recognition distance of a candidate character according to any one of supplementary notes 1 to 6, wherein, when the candidate character has a plurality of training samples, the structural feature value obtained by the structural feature value calculation procedure is the average or the weighted average of the structural feature values of all of the plurality of training samples.

(Supplementary note 8) A method for adjusting an initial recognition distance of a candidate character, the method comprising:
a structural feature value calculation step of calculating a structural feature value of one or more training samples of the candidate character; and
an adjustment step of adjusting the initial recognition distance based on the structural feature value calculated in the structural feature value calculation step.

(Supplementary note 9) An apparatus for adjusting an initial recognition distance of a candidate character, the apparatus comprising:
a structural feature value calculation unit that calculates a structural feature value of one or more training samples of the candidate character; and
an adjustment unit that adjusts the initial recognition distance based on the structural feature value calculated by the structural feature value calculation unit.

(Supplementary note 10) The apparatus for adjusting an initial recognition distance of a candidate character according to supplementary note 9, wherein the structural feature value is a shape feature value, and the adjustment unit adjusts the initial recognition distance by multiplication or by an operation that includes multiplication.

(Supplementary note 11) The apparatus for adjusting an initial recognition distance of a candidate character according to supplementary note 10, wherein the shape feature value is the density of character stroke pixels of the training sample of the candidate character, the density of character strokes of the training sample of the candidate character, or the average number of stroke segments per row/column of the character strokes of the training sample of the candidate character.

(Supplementary note 12) The apparatus for adjusting an initial recognition distance of a candidate character according to supplementary note 11, wherein the density of the character stroke pixels of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the number of character stroke pixels of the training sample of the candidate character, the density of the character strokes of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the n-th power of the number of character strokes of the training sample of the candidate character, and n is a positive integer.

(Supplementary note 13) The apparatus for adjusting an initial recognition distance of a candidate character according to supplementary note 9, wherein the one or more training samples of the candidate character are obtained by means of the character code of the candidate character.

(Supplementary note 14) The apparatus for adjusting an initial recognition distance of a candidate character according to supplementary note 13, further comprising a storage unit that stores the character code of the candidate character in association with the one or more training samples, or stores the character code of the candidate character in association with the one or more training samples and the weights of the one or more training samples of the candidate character.

(Supplementary note 15) The apparatus for adjusting an initial recognition distance of a candidate character according to supplementary note 10, wherein the operation includes multiplying the initial recognition distance by the logarithm of the structural feature value obtained by the structural feature calculation unit.

(Supplementary note 16) The apparatus for adjusting an initial recognition distance of a candidate character according to any one of supplementary notes 9 to 15, wherein, when the candidate character has a plurality of training samples, the structural feature value obtained by the structural feature value calculation unit is the average or the weighted average of the structural feature values of all of the plurality of training samples.

(Supplementary note 17) A program for recognizing a character string, the program causing a computer to execute a feature extraction procedure, a coarse classification procedure, a feature recognition procedure, and a detailed classification procedure, the detailed classification procedure outputting an initial recognition distance of a candidate character, and the program causing the computer to execute:
a structural feature value calculation procedure for calculating a structural feature value of one or more training samples of the candidate character; and
an adjustment procedure for adjusting the initial recognition distance based on the structural feature value calculated by the structural feature value calculation procedure.

(Supplementary note 18) The program for recognizing a character string according to supplementary note 17, wherein the structural feature value is a shape feature value, and the adjustment step adjusts the initial recognition distance by multiplication or by an operation that includes multiplication.

(Supplementary note 19) The program for recognizing a character string according to supplementary note 18, wherein the shape feature value is the density of character stroke pixels of the training sample of the candidate character, the density of character strokes of the training sample of the candidate character, or the average number of stroke segments per row/column of the character strokes of the training sample of the candidate character.

(Supplementary note 20) The program for recognizing a character string according to supplementary note 19, wherein the density of the character stroke pixels of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the number of character stroke pixels of the training sample of the candidate character, the density of the character strokes of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the n-th power of the number of character strokes of the training sample of the candidate character, and n is a positive integer.

(Supplementary note 21) The program for recognizing a character string according to supplementary note 17, wherein the one or more training samples of the candidate character are obtained by means of the character code of the candidate character.

(Supplementary note 22) The program for recognizing a character string according to supplementary note 18, wherein the operation includes multiplying the initial recognition distance by the logarithm of the structural feature value obtained by the structural feature calculation step.

(Supplementary note 23) The program for recognizing a character string according to any one of supplementary notes 17 to 22, wherein, when the candidate character has a plurality of training samples, the structural feature value obtained by the structural feature value calculation step is the average or the weighted average of the structural feature values of all of the plurality of training samples.

(Supplementary note 24) An apparatus for recognizing a character string, the apparatus including feature extraction means, coarse classification means, feature recognition means, and detailed classification means, the detailed classification means outputting an initial recognition distance of a candidate character, and the apparatus further comprising:
structural feature value calculation means for calculating a structural feature value of one or more training samples of the candidate character; and
adjustment means for adjusting the initial recognition distance based on the structural feature value calculated by the structural feature value calculation means.

(Supplementary note 25) The apparatus for recognizing a character string according to supplementary note 24, wherein the structural feature value is a shape feature value, and the adjustment means adjusts the initial recognition distance by multiplication or by an operation that includes multiplication.

(Supplementary note 26) The apparatus for recognizing a character string according to supplementary note 25, wherein the shape feature value is the density of character stroke pixels of the training sample of the candidate character, the density of character strokes of the training sample of the candidate character, or the average number of stroke segments per row/column of the character strokes of the training sample of the candidate character.

(Supplementary note 27) The apparatus for recognizing a character string according to supplementary note 26, wherein the density of the character stroke pixels of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the number of character stroke pixels of the training sample of the candidate character, the density of the character strokes of the training sample of the candidate character is the ratio between the area of the minimum rectangular character frame of the training sample of the candidate character and the n-th power of the number of character strokes of the training sample of the candidate character, and n is a positive integer.

(Supplementary note 28) The apparatus for recognizing a character string according to supplementary note 24, wherein the one or more training samples of the candidate character are obtained by means of the character code of the candidate character.

(Supplementary note 29) The apparatus for recognizing a character string according to supplementary note 28, further comprising storage means that stores the character code of the candidate character in association with the one or more training samples, or stores the character code of the candidate character in association with the one or more training samples and the weights of the one or more training samples of the candidate character.

(Supplementary note 30) The apparatus for recognizing a character string according to supplementary note 25, wherein the operation includes multiplying the initial recognition distance by the logarithm of the structural feature value obtained by the structural feature calculation unit.

(Supplementary note 31) The apparatus for recognizing a character string according to any one of supplementary notes 24 to 30, wherein, when the candidate character has a plurality of training samples, the structural feature value obtained by the structural feature value calculation unit is the average or the weighted average of the structural feature values of all of the plurality of training samples.

(Supplementary note 32) The apparatus for recognizing a character string according to any one of supplementary notes 24 to 31, wherein, when characters are cut out of a character string, segmentation paths are calculated using the adjusted distance values, and a character recognition result based on the initial recognition distance is output for each cut-out character rectangle.
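The shape feature values defined in the supplementary notes above can be sketched as follows. This is a minimal sketch under stated assumptions: a training sample is a binary image given as a list of rows (1 marks a stroke pixel), the number of strokes is supplied by the caller because stroke extraction is not described in these notes, and each "ratio" is read as area divided by count, which is consistent with the description that the weight is larger for simpler characters. All function names are hypothetical.

```python
from typing import Optional, Sequence


def bounding_box_area(image: Sequence[Sequence[int]]) -> int:
    """Area of the smallest rectangle enclosing all stroke pixels."""
    rows = [r for r, row in enumerate(image) if any(row)]
    cols = [c for row in image for c, value in enumerate(row) if value]
    if not rows:
        raise ValueError("image contains no stroke pixels")
    return (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)


def pixel_density_value(image: Sequence[Sequence[int]]) -> float:
    """Bounding-box area divided by the number of stroke pixels (assumed direction)."""
    area = bounding_box_area(image)
    stroke_pixels = sum(1 for row in image for value in row if value)
    return area / stroke_pixels


def stroke_density_value(image: Sequence[Sequence[int]],
                         stroke_count: int, n: int = 1) -> float:
    """Bounding-box area divided by the n-th power of the stroke count."""
    if stroke_count <= 0 or n <= 0:
        raise ValueError("stroke_count and n must be positive")
    return bounding_box_area(image) / stroke_count ** n


def combine_samples(values: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Average, or weighted average, of the values of several training samples."""
    if weights is None:
        return sum(values) / len(values)
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical 4x4 training sample with three stroke pixels.
sample = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
```

For this sample the bounding box covers 2 x 2 pixels, so `pixel_density_value(sample)` is 4 divided by 3, and `stroke_density_value(sample, 2, n=2)` is 4 divided by 4.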

As described above, the present invention is useful for character recognition, and in particular for a character string recognition apparatus and method for recognizing characters in a degraded character string.

FIG. 1 is a diagram showing the principle of a conventional recognition-based segmentation method.
FIG. 2 is a flowchart showing character recognition using a dual eigenspace-based method.
FIG. 3 is a diagram schematically showing a drawback of the conventional character recognition method.
FIG. 4 is a flowchart showing an embodiment of the present invention.
FIG. 5 is a diagram schematically showing the characteristics and calculation of the character shape feature.
FIG. 6 is a diagram showing the adjusted recognition distances corresponding to the recognition distances of FIG. 3.

Explanation of symbols

1 Left stroke
2 Right stroke

Claims

A program for adjusting an initial recognition distance of a candidate character, the program causing a computer to execute:
a calculation procedure for calculating the reciprocal of the density of character strokes of one or more training samples of the candidate character; and
an adjustment procedure for adjusting the initial recognition distance by multiplying the initial recognition distance by the reciprocal calculated by the calculation procedure.

The program for adjusting an initial recognition distance of a candidate character according to claim 1, wherein the density of the character strokes of the training sample of the candidate character is the ratio between the area of the minimum rectangular bounding box of the training sample of the candidate character and the n-th power of the number of character strokes of the training sample of the candidate character.

The program for adjusting an initial recognition distance of a candidate character according to claim 1, wherein the one or more training samples of the candidate character are obtained by a character code of the candidate character.

The program for adjusting an initial recognition distance of a candidate character according to claim 1, wherein the adjustment procedure multiplies the initial recognition distance by the logarithm of the reciprocal obtained by the calculation procedure.

The program for adjusting an initial recognition distance of a candidate character according to any one of claims 1 to 4, wherein, when the candidate character has a plurality of training samples, the reciprocal obtained by the calculation procedure is the average or the weighted average of the reciprocals of all of the plurality of training samples.

A program for recognizing a character string, the program causing a computer to execute a feature extraction procedure, a coarse classification procedure, a feature recognition procedure, and a detailed classification procedure, the detailed classification procedure outputting an initial recognition distance of a candidate character, and the program causing the computer to execute:
a calculation procedure for calculating the reciprocal of the density of character strokes of one or more training samples of the candidate character; and
an adjustment procedure for adjusting the initial recognition distance by multiplying the initial recognition distance by the reciprocal calculated by the calculation procedure.

An apparatus for adjusting an initial recognition distance of a candidate character, the apparatus comprising:
calculation means for calculating the reciprocal of the density of character strokes of one or more training samples of the candidate character; and
adjustment means for adjusting the initial recognition distance by multiplying the initial recognition distance by the reciprocal calculated by the calculation means.

A method for adjusting an initial recognition distance of a candidate character, the method comprising:
calculating the reciprocal of the density of character strokes of one or more training samples of the candidate character; and
adjusting the initial recognition distance by multiplying the initial recognition distance by the reciprocal.