JP5028911B2 - Character string recognition program, method and apparatus

Info

Publication number: JP5028911B2
Application number: JP2006226997A
Authority: JP
Inventors: 俊孫; 悦伸堀田; 克仁藤本; 裕勝山; 聡直井
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2005-08-26
Filing date: 2006-08-23
Publication date: 2012-09-19
Anticipated expiration: 2026-08-23
Also published as: CN100409251C; CN1920855A; JP2007066310A

Description

The present invention relates generally to a program, method, and apparatus for character recognition, and more particularly to a character string recognition program, method, and apparatus for degraded character strings.

With the spread of digital cameras and digital video cameras for capturing document images, recognition of degraded character strings is attracting increasing attention. Recognizing a degraded character string involves both single-character recognition and segmentation of characters from the string, and these two parts are inherently coupled.

Recognition-based segmentation is the most widely used approach to segmenting characters from a character string. FIG. 1 shows the principle of a conventional recognition-based segmentation method. The input image is first binarized, and the connected components of the binarized image are then analyzed to find the character strokes (top row of FIG. 1). For the connected component analysis algorithm, see Non-Patent Document 1 below. Every connected component is regarded as a basic segment character (middle row of FIG. 1), and every combination of connected components is regarded as a composite segment character (bottom row of FIG. 1). Character recognition is then performed on all basic and composite segment characters, yielding a recognition distance for each. Any character string can be decomposed into multiple segmentation paths, each made up of basic segment characters and composite segment characters, and the recognition distance of a path is the sum of the recognition distances of the segment characters that compose it. The correct segmentation is obtained by selecting the path with the minimum total recognition distance. Segmentation and recognition are thus achieved together: the recognition result for each basic or composite segment character on the selected path is also the final recognition result for that character.
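The path selection described above can be sketched as a small dynamic program. This is a minimal illustration, not the patent's implementation: the function name `best_segmentation`, the span-based `candidates` mapping, and the stroke labels are hypothetical, while the distances 19, 26, and 21 are the values quoted for FIG. 1.

```python
from math import inf

def best_segmentation(n_segments, candidates):
    """Return (total_distance, labels) for the minimum-distance path.

    candidates maps a span (start, end) of basic segments, end exclusive,
    to a (label, recognition_distance) pair produced by the recognizer.
    """
    best = [inf] * (n_segments + 1)   # best[j]: cheapest cover of segments 0..j-1
    back = [None] * (n_segments + 1)  # back[j]: (start, label) of the last character
    best[0] = 0.0
    for end in range(1, n_segments + 1):
        for (start, stop), (label, dist) in candidates.items():
            if stop == end and best[start] + dist < best[end]:
                best[end] = best[start] + dist
                back[end] = (start, label)
    if best[n_segments] == inf:
        raise ValueError("no segmentation path covers all segments")
    labels, pos = [], n_segments
    while pos > 0:                    # walk the back-pointers from the end
        start, label = back[pos]
        labels.append(label)
        pos = start
    return best[n_segments], labels[::-1]

# Two strokes (basic segments) and their combination (composite segment).
candidates = {
    (0, 1): ("left stroke", 19),   # hypothetical label
    (1, 2): ("right stroke", 26),  # hypothetical label
    (0, 2): ("ハ", 21),
}
total, labels = best_segmentation(2, candidates)
```

With these values the composite character is chosen, because 21 is smaller than 19 + 26 = 45.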


As shown in FIG. 1, the segmentation path consisting of 「ハ」, 「リ」, and 「を」 has the minimum recognition distance, 72. These characters are therefore output as the final segmentation and recognition result.

This principle shows that the recognition distance matters not only for the recognition result but also for correct segmentation. In FIG. 1, for example, the minimum recognition distance of 「ハ」 is 21, and the recognition distances of its left and right strokes are 19 and 26, respectively, so their sum (45) exceeds 21 and the whole character is selected. If the sum of the two stroke distances were less than 21, the character would be wrongly split into left stroke 1 and right stroke 2 of FIG. 1, even though the recognition result for 「ハ」 itself is correct.

Many papers and patents on segmenting characters from character strings have been published, such as Patent Documents 1 and 2 and Non-Patent Documents 1 to 3 listed below.

Most of these papers and patents aim at handling touching characters, and most of them operate on binarized images. For degraded character string images, conventional binarization often produces badly faded strokes (missing stroke pixels) or touching strokes, which makes the desired recognition performance impossible to achieve.

The dual-eigenspace method is highly effective for recognizing degraded characters; it extracts character features directly from the grayscale character image. FIG. 2 is a flowchart of character recognition using the dual-eigenspace method. The input is a normalized character image. Features of the character image are first extracted using a first dictionary (dictionary 1 in FIG. 2). The character image is then coarsely classified into M candidate categories using a second dictionary (dictionary 2 in FIG. 2). A third dictionary (dictionary 3 in FIG. 2) is then used to make the final classification of the input character features into one of the M candidate categories. Finally, the recognized character code and the recognition distance are output.

Because the dual-eigenspace method avoids binarization by extracting features directly from the grayscale image, it is more robust to image noise caused by degradation. However, several problems arise when the dual-eigenspace method is applied directly to recognition-based segmentation.

In FIG. 3, the first row is the character string image. The second row is the binarization result, which is used for rough segmentation; the circumscribed rectangles shown are the result of that rough segmentation. The third row shows the normalized grayscale images of the basic segment characters, with the recognized character and its recognition distance below each segment image. The fourth row shows the normalized grayscale images of the composite segment characters 「年」 and 「開」, together with their recognition results and recognition distances. With the conventional recognition-based segmentation method, 「開」 is not recognized correctly because it is split into four segments in the second row, and the sum of the recognition distances of those four segments is 5.39 + 61.01 + 45.69 + 20.37 = 132.46. The recognition distance of 「開」 as a whole is 409.71, which is larger than that sum, so the entire character string is recognized as 「年１回Ｉ！ＩＩく」.

Patent documents:
U.S. Patent No. 6,327,385
U.S. Patent No. 5,692,069
U.S. Patent No. 5,172,422

Non-patent documents:
R. C. Gonzalez, "Digital Image Processing, Second Edition", edited by Quiqi RUAN, Yuzhi RUAN et al., p. 435.
Y. Lu, "Machine Printed Character Segmentation - An Overview", Pattern Recognition, Vol. 28, No. 1, pp. 67-80, January 1995.
S. W. Lee, D. J. Lee, H. S. Park, "A New Methodology for Gray-Scale Character Segmentation and Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, No. 10, pp. 1045-1050, October 1996.

An object of the present invention is to provide a character string recognition apparatus and method for degraded character strings that use better features to generate more reasonable recognition distances, thereby solving the problems that arise in segmentation using the dual eigenspace.

According to one aspect of the present invention, there is provided a character string recognition apparatus for degraded character strings, comprising: feature extraction means for extracting features from an input normalized image using a first dictionary composed of a transformation matrix and a mean value of the normalized images; coarse classification means for selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary; feature reconstruction means for constructing a fixed number of reconstructed features using the fixed number of selected character category candidates and a third dictionary that stores a transformation matrix and a mean feature vector for each character category; detailed classification means for recognizing and outputting a character code according to the features extracted by the feature extraction means and the reconstructed features; image reconstruction means for constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features generated by the feature reconstruction means; recognition distance calculation means for calculating and outputting a recognition distance according to the input normalized image and the reconstructed images generated by the image reconstruction means; and character string recognition means for performing character segmentation based on the recognition distance calculated by the recognition distance calculation means and taking, as the final recognized character code, the character code calculated by the detailed classification means that corresponds to the segmentation result.

According to another aspect of the present invention, there is provided a character string recognition method for degraded character strings, comprising the steps of: extracting features from an input normalized image using a first dictionary composed of a transformation matrix and a mean value of the normalized images; selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary; constructing a fixed number of reconstructed features using the fixed number of selected character category candidates and a third dictionary that stores a transformation matrix and a mean feature vector for each character category; recognizing and outputting a character code according to the extracted features and the reconstructed features; constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features; calculating and outputting a recognition distance according to the input normalized image and the reconstructed images; and performing character segmentation based on the calculated recognition distance and taking, as the final recognized character code, the recognized character code that corresponds to the segmentation result.

Because the present invention recognizes and outputs the final recognized character code according to the extracted features and the reconstructed features, and calculates and outputs the recognition distance according to the input normalized image and the reconstructed images, it uses better features to generate recognition distances better suited to segmentation, and can thereby correctly segment the characters of a degraded character string.

According to the present invention, the final recognized character code is recognized and output according to the extracted features and the reconstructed features, and the recognition distance is calculated and output according to the input normalized image and the reconstructed images. As a result, the characters of a degraded character string image can be segmented correctly.

Embodiments of the present invention are described below with reference to the accompanying drawings.

FIG. 4 is a flowchart showing the character string recognition method used by a character string recognition apparatus according to an embodiment of the present invention.

As shown in FIG. 4, the character string recognition apparatus according to the embodiment of the present invention comprises: a feature extraction unit 402 that extracts features from an input normalized image 401 using a first dictionary 403; a coarse classification unit 404 that selects M character category candidates by comparing the extracted features with the features stored in a second dictionary 405; a feature reconstruction unit 406 that constructs M reconstructed features using a third dictionary 407 and the M character category candidates; an image reconstruction unit 408 that constructs M reconstructed images using the first dictionary 403; a detailed classification unit 409 that outputs a recognized character code 411 by comparing the differences between the features extracted by the feature extraction unit 402 and the reconstructed features; and a recognition distance calculation unit 410 that outputs a recognition distance 412.

According to the flowchart of FIG. 4, for an input normalized character image 401, the feature extraction unit 402 extracts the features of the image using the first dictionary 403, according to Equation (1).

Here X = [x_1, x_2, ..., x_{w*h}]^T represents a normalized character image whose height and width are w and h, respectively. Equation (2) denotes the mean of all normalized character images. U = [u_1, u_2, ..., u_n] is the transformation matrix, where u_i = [u_{i1}, u_{i2}, ..., u_{iw*h}]^T. The first dictionary 403 consists of U and the mean given by Equation (3). The feature extraction method of Equation (1) is known as principal component analysis (PCA). For details of PCA, see "Pattern Classification", Second Edition, by R. O. Duda, P. E. Hart, and D. G. Stork, a Wiley-Interscience Publication, John Wiley & Sons, Inc., 2001, pp. 115-117 and 568-569.
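The equation itself is not reproduced in this text, so the following is only a sketch under an assumption: that Equation (1) is the usual PCA projection Y = U^T (X - mean), which matches the definitions of X, U, and the mean given above. The function name `extract_features` and the toy values are hypothetical.

```python
def extract_features(image, mean, basis):
    """Project a flattened image onto the basis: y_k = u_k . (x - mean).

    image and mean are lists of w*h pixel values; basis is a list of the
    n vectors u_1..u_n, each of length w*h (assumed form of Equation (1)).
    """
    centered = [x - m for x, m in zip(image, mean)]
    return [sum(u * c for u, c in zip(vec, centered)) for vec in basis]

# Toy 2x2 "image" flattened to four pixels, with two orthonormal basis vectors.
image = [3.0, 1.0, 4.0, 2.0]
mean = [2.0, 2.0, 2.0, 2.0]
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
features = extract_features(image, mean, basis)
```

In this toy case the four-pixel image is reduced to a two-component feature vector, one component per basis vector.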

After feature extraction, the coarse classification unit 404 compares the extracted feature Y with the feature of each character category stored in the second dictionary 405. Many algorithms exist for comparing features; one of them is based on the Euclidean distance D_i = |Y - Y_i|, where D_i is the Euclidean distance from the feature Y to the feature Y_i of the i-th character category. If the number of candidate character categories output by the coarse classification unit 404 is M, then for each segment character the M character categories with the smallest Euclidean distances are selected as the output of coarse classification.
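The coarse classification step can be sketched as a nearest-M selection by Euclidean distance. The function name `coarse_classify`, the dictionary layout, and the category labels below are hypothetical; only the distance D_i = |Y - Y_i| and the choice of the M smallest come from the text.

```python
import math

def coarse_classify(feature, category_features, m):
    """Return the m category labels whose stored features are nearest to `feature`."""
    ranked = sorted(
        category_features,
        key=lambda label: math.dist(feature, category_features[label]),
    )
    return ranked[:m]

# Hypothetical second dictionary: one stored feature vector per category.
second_dictionary = {
    "A": [1.0, 2.5],
    "B": [4.0, 6.0],
    "C": [0.0, 2.0],
}
candidates = coarse_classify([1.0, 2.0], second_dictionary, m=2)
```

Sorting the whole dictionary is the simplest choice for a sketch; a partial selection such as `heapq.nsmallest` would avoid a full sort when the number of categories is large.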

Next, the feature reconstruction unit 406 uses the third dictionary 407 to construct M reconstructed features corresponding to the M candidate categories. The third dictionary stores, for each character category, a transformation matrix (Equation (4)) and a mean feature vector C_i. The i-th reconstructed feature (Equation (5)) is obtained by Equation (6).

The detailed classification unit 409 in FIG. 4 calculates the distance between the original feature Y and each of the M reconstructed features (Equation (5)). For each segment character, the code of the character category with the smallest distance among the M categories is output as the recognized character code 411 of that segment character.
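Feature reconstruction and detailed classification can be sketched together as follows. Equation (6) is not reproduced in this text, so the reconstruction used here, C_i + V_i V_i^T (Y - C_i) with V_i the category's transformation matrix, is an assumption based on the usual eigenspace projection and not a confirmed form. The function names and the toy dictionary are hypothetical.

```python
import math

def reconstruct_feature(feature, category_mean, category_basis):
    """Assumed form of Equation (6): C_i + sum_k (v_k . (Y - C_i)) v_k."""
    centered = [y - c for y, c in zip(feature, category_mean)]
    recon = list(category_mean)
    for vec in category_basis:
        coeff = sum(v * d for v, d in zip(vec, centered))
        recon = [r + coeff * v for r, v in zip(recon, vec)]
    return recon

def detailed_classify(feature, third_dictionary, candidates):
    """Return (best_code, reconstructed features) over the candidate categories."""
    recon = {
        code: reconstruct_feature(feature, *third_dictionary[code])
        for code in candidates
    }
    # The candidate whose reconstruction is nearest to the original feature wins.
    best = min(candidates, key=lambda code: math.dist(feature, recon[code]))
    return best, recon

# Hypothetical third dictionary: code -> (mean feature vector, basis vectors).
third_dictionary = {
    "A": ([1.0, 0.0], [[0.0, 1.0]]),
    "C": ([0.0, 0.0], [[1.0, 0.0]]),
}
code, recon = detailed_classify([1.0, 2.0], third_dictionary, ["A", "C"])
```

The reconstructed features are returned alongside the code because the later image reconstruction step consumes them.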

Unlike the conventional dual-eigenspace method shown in FIG. 2, the recognition distance of the present invention is not the difference between the extracted feature Y and a reconstructed feature. Instead, the present invention proposes a new image reconstruction unit 408, which uses the first dictionary 403 to compute M reconstructed images (Equation (9)) according to Equations (7) and (8).

Equation (7) can be derived from Equation (1). Equation (8) normalizes the pixel values of the reconstructed image to the range 0 to 255, which matches the range of pixel values of the original image.
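A sketch of the two image reconstruction steps follows, under stated assumptions. Equations (7) and (8) are not reproduced in this text. If Equation (1) is the PCA projection with orthonormal basis vectors, inverting it gives mean + U Y, which is assumed here for Equation (7); min-max rescaling is assumed for Equation (8), since the text only says the range is normalized to 0 to 255. Function names and values are hypothetical.

```python
def reconstruct_image(recon_feature, mean, basis):
    """Map a feature back to image space: mean + sum_k y_k * u_k (assumed Equation (7))."""
    image = list(mean)
    for y, vec in zip(recon_feature, basis):
        image = [p + y * u for p, u in zip(image, vec)]
    return image

def normalize_0_255(image):
    """Min-max rescale pixel values to 0..255 (assumed form of Equation (8))."""
    lo, hi = min(image), max(image)
    if hi == lo:
        return [0.0 for _ in image]  # flat image: no contrast to rescale
    return [255.0 * (p - lo) / (hi - lo) for p in image]

# Toy first dictionary: mean image and two orthonormal basis vectors.
mean = [2.0, 2.0, 2.0, 2.0]
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
raw = reconstruct_image([1.0, 2.0], mean, basis)
scaled = normalize_0_255(raw)
```

Because only two basis vectors are kept, the reconstruction is an approximation of the image that produced the feature, not an exact copy.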

The recognition distance calculation unit 410 in FIG. 4 calculates the distance between the original normalized character image 401 and each of the M reconstructed images (Equation (9)). The smallest of these distances is taken as the recognition distance 412 that is finally output. Character segmentation is then performed so that the recognition distance 412 is minimized, and among the recognized character codes 411 output by the detailed classification unit 409, the codes corresponding to that segmentation become the final recognized character codes.
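The recognition distance calculation reduces to a minimum over image-space distances. The text does not name the metric used between images, so the Euclidean distance below is an assumption, and the function name and pixel values are hypothetical.

```python
import math

def recognition_distance(image, reconstructed_images):
    """Return the smallest distance from `image` to any reconstructed image."""
    return min(math.dist(image, recon) for recon in reconstructed_images)

original = [0.0, 0.0, 255.0, 255.0]
reconstructions = [
    [0.0, 0.0, 255.0, 252.0],  # differs from the original in one pixel
    [3.0, 4.0, 255.0, 255.0],  # differs from the original in two pixels
]
distance = recognition_distance(original, reconstructions)
```

The resulting value is what a path search over segment characters would sum when comparing segmentation paths.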

FIG. 5 shows the recognition distances obtained by the character string recognition method used in the character string recognition apparatus according to the embodiment of the present invention. The recognition distances in FIG. 5 are clearly more reasonable for segmentation: the recognition distance of 「開」 is 104.78, whereas the sum of the recognition distances of its four components is 494.02, which is far larger. The character can therefore be segmented and recognized correctly.
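The effect on path selection can be checked with the figures quoted for FIG. 3 and FIG. 5. The helper name below is hypothetical; the numbers are those given in the text, and for FIG. 5 only the total over the four components is reported.

```python
def prefers_whole_character(whole_distance, part_distances):
    """True when the single-character path beats the path through its parts."""
    return whole_distance < sum(part_distances)

# Conventional feature-space distances reported for FIG. 3.
conventional = prefers_whole_character(409.71, [5.39, 61.01, 45.69, 20.37])

# Image-space distances reported for FIG. 5 (component total only).
proposed = prefers_whole_character(104.78, [494.02])
```

With the conventional distances the four-part split wins and the character is missegmented; with the distances of FIG. 5 the whole character wins.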

The example characters used in this embodiment are Japanese, but the method presented in the present invention is not limited to Japanese. It is also applicable to other languages such as Chinese and Korean.

(Supplementary Note 1) A character string recognition program for degraded character strings, comprising:
a feature extraction procedure for extracting features from an input normalized image using a first dictionary;
a coarse classification procedure for selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary; and
a feature reconstruction procedure for constructing a fixed number of reconstructed features using a third dictionary and the fixed number of selected character categories,
the program further causing a computer to execute:
a detailed classification procedure for recognizing and outputting a character code according to the features extracted by the feature extraction procedure and the reconstructed features;
an image reconstruction procedure for constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features generated by the feature reconstruction procedure; and
a recognition distance calculation procedure for calculating and outputting a recognition distance according to the input normalized image and the reconstructed images generated by the image reconstruction procedure.

(Supplementary Note 2) The character string recognition program according to Supplementary Note 1, wherein character segmentation is performed based on the recognition distance calculated in the recognition distance calculation procedure, and the character code calculated in the detailed classification procedure that corresponds to the segmentation result is taken as the final recognized character code.

(Supplementary Note 3) The character string recognition program according to Supplementary Note 1 or 2, wherein the detailed classification procedure compares the differences between the features extracted by the feature extraction procedure and the reconstructed features, and outputs, as the recognized character code, the character code corresponding to the reconstructed feature with the minimum difference.

(Supplementary Note 4) The character string recognition program according to Supplementary Note 1 or 2, wherein the image reconstruction procedure normalizes the range of pixel values of the reconstructed images to 0 to 255.

(Supplementary Note 5) The character string recognition program according to Supplementary Note 1 or 2, wherein the recognition distance calculation procedure calculates the distances between the input normalized image and the reconstructed images generated by the image reconstruction procedure, and outputs the minimum distance as the recognition distance.

(Supplementary Note 6) The character string recognition program according to any one of Supplementary Notes 1 to 5, wherein the first dictionary is composed of a transformation matrix and a mean value of the normalized images.

(Supplementary Note 7) The character string recognition program according to any one of Supplementary Notes 1 to 5, wherein the second dictionary stores the features of each character category.

(Supplementary Note 8) The character string recognition program according to any one of Supplementary Notes 1 to 5, wherein the third dictionary stores a transformation matrix and a mean feature vector for each character category.

(Supplementary Note 9) A character string recognition method for degraded character strings, comprising the steps of:
extracting features from an input normalized image using a first dictionary;
selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary;
constructing a fixed number of reconstructed features using a third dictionary and the fixed number of selected character categories;
recognizing and outputting a character code according to the extracted features and the reconstructed features;
constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features; and
calculating and outputting a recognition distance according to the input normalized image and the reconstructed images.

(Supplementary Note 10) The character string recognition method according to Supplementary Note 9, wherein character segmentation is performed based on the recognition distance calculated in the recognition distance calculation step, and the character code calculated in the detailed classification step that corresponds to the segmentation result is taken as the final recognized character code.

(Supplementary Note 11) The character string recognition method according to Supplementary Note 9 or 10, wherein the step of recognizing and outputting the character code includes a step of comparing the differences between the extracted features and the reconstructed features, and outputting, as the recognized character code, the character code corresponding to the reconstructed feature with the minimum difference.

(Supplementary Note 12) The character string recognition method according to Supplementary Note 9 or 10, wherein the step of constructing the reconstructed images includes a step of normalizing the range of pixel values of the reconstructed images to 0 to 255.

(Supplementary Note 13) The character string recognition method according to Supplementary Note 9 or 10, wherein the step of calculating and outputting the recognition distance includes a step of calculating the distances between the input normalized image and the reconstructed images, and outputting the minimum distance as the recognition distance.

(Supplementary Note 14) The character string recognition method according to any one of Supplementary Notes 9 to 13, wherein the first dictionary is composed of a transformation matrix and a mean value of the normalized images.

(Supplementary Note 15) The character string recognition method according to any one of Supplementary Notes 9 to 13, wherein the second dictionary stores the features of each character category.

(Supplementary Note 16) The character string recognition method according to any one of Supplementary Notes 9 to 13, wherein the third dictionary stores a transformation matrix and a mean value for each character category.

(Supplementary Note 17) A character string recognition device for a deteriorated character string, comprising:
feature extraction means for extracting features from an input normalized image using a first dictionary;
coarse classification means for selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary;
feature reconstruction means for constructing a fixed number of reconstructed features using a third dictionary and the fixed number of selected character category candidates;
detailed classification means for recognizing a character code according to the features extracted by the feature extraction means and the reconstructed features, and outputting the recognized character code;
image reconstruction means for constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features generated by the feature reconstruction means; and
recognition distance calculation means for calculating and outputting a recognition distance according to the input normalized image and the reconstructed images generated by the image reconstruction means.
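The first three means of Supplementary Note 17 can be illustrated with a minimal sketch. It assumes that each transformation matrix is a set of orthonormal basis vectors (as in a principal component analysis), that feature extraction is a projection of the mean-subtracted image, and that feature reconstruction projects the feature onto each candidate category's subspace and maps it back. The function names, dictionary layout and toy values are hypothetical and are not taken from the specification.

```python
# Illustrative sketch only: tiny hand-made dictionaries, not data from the patent.

def project(x, basis, mean):
    """Coefficients of (x - mean) along each basis row."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    return [sum(b * c for b, c in zip(row, centered)) for row in basis]

def back_project(coeffs, basis, mean):
    """Rebuild a vector from subspace coefficients: mean + sum_k coeffs[k] * basis[k]."""
    out = list(mean)
    for k, row in zip(coeffs, basis):
        out = [o + k * b for o, b in zip(out, row)]
    return out

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def extract_features(image, first_dict):
    """Feature extraction: project the normalized image with the first dictionary."""
    return project(image, first_dict["basis"], first_dict["mean"])

def coarse_classify(feature, second_dict, n_candidates):
    """Coarse classification: keep the n categories whose stored feature is nearest."""
    ranked = sorted(second_dict, key=lambda c: sq_dist(feature, second_dict[c]))
    return ranked[:n_candidates]

def reconstruct_features(feature, third_dict, candidates):
    """Feature reconstruction: one reconstructed feature per candidate category."""
    result = {}
    for c in candidates:
        entry = third_dict[c]
        coeffs = project(feature, entry["basis"], entry["mean"])
        result[c] = back_project(coeffs, entry["basis"], entry["mean"])
    return result

# Hypothetical dictionaries for a 3-pixel "image" and a 2-dimensional feature space.
first_dict = {"basis": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], "mean": [0.0, 0.0, 0.0]}
second_dict = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [5.0, 5.0]}
third_dict = {
    "A": {"basis": [[1.0, 0.0]], "mean": [0.0, 0.0]},
    "B": {"basis": [[0.0, 1.0]], "mean": [0.0, 0.0]},
    "C": {"basis": [[1.0, 0.0]], "mean": [5.0, 5.0]},
}

image = [0.9, 0.2, 0.4]
feature = extract_features(image, first_dict)
candidates = coarse_classify(feature, second_dict, 2)
reconstructed = reconstruct_features(feature, third_dict, candidates)
print(feature, candidates, reconstructed)
```

In this toy setup the coarse step discards category "C" before any per-category reconstruction is attempted, which is the purpose of selecting only a fixed number of candidates.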

(Supplementary Note 18) The character string recognition device according to Supplementary Note 17, wherein character segmentation is performed based on the recognition distance calculated by the recognition distance calculation means, and the character code calculated by the detailed classification means that corresponds to the character segmentation result is output as the final recognized character code.
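Selecting the segmentation with the minimum total recognition distance can be sketched as a shortest-path search over the basic segments. The dynamic-programming formulation, the function name `best_segmentation` and the sample distances below are assumptions made for illustration; the text only states that segmentation is based on the recognition distance.

```python
def best_segmentation(n_segments, candidates):
    """Pick the segmentation path with the smallest total recognition distance.

    candidates maps a span (start, end) of consecutive basic segments to a
    (recognition_distance, character_code) pair for that merged span.
    Returns (total_distance, codes, spans), or None if no path covers all segments.
    """
    inf = float("inf")
    best = [inf] * (n_segments + 1)   # best[k]: minimum total distance covering segments [0, k)
    back = [None] * (n_segments + 1)  # back[k]: start index of the last span on that path
    best[0] = 0.0
    for end in range(1, n_segments + 1):
        for start in range(end):
            if (start, end) in candidates and best[start] < inf:
                total = best[start] + candidates[(start, end)][0]
                if total < best[end]:
                    best[end] = total
                    back[end] = start
    if best[n_segments] == inf:
        return None
    spans = []
    end = n_segments
    while end > 0:  # walk the back-pointers from the last segment to the first
        spans.append((back[end], end))
        end = back[end]
    spans.reverse()
    return best[n_segments], [candidates[s][1] for s in spans], spans

# Hypothetical example: three basic segments, where the first two merge into one character.
candidates = {
    (0, 1): (4.0, "l"),
    (1, 2): (5.0, "1"),
    (2, 3): (1.0, "i"),
    (0, 2): (2.0, "H"),
    (1, 3): (7.0, "x"),
}
result = best_segmentation(3, candidates)
print(result)
```

Here the path that merges segments 0 and 1 has a total distance of 3.0, against 10.0 for three separate characters, so the codes "H" and "i" are kept as the final recognition result.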

(Supplementary Note 19) The character string recognition device according to Supplementary Note 17 or 18, wherein the detailed classification means compares the differences between the features extracted by the feature extraction means and the reconstructed features, and outputs the character code corresponding to the reconstructed feature having the minimum difference as the recognized character code.
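The comparison in Supplementary Note 19 amounts to an arg-min over the candidate categories. The sketch below assumes the difference is measured as a Euclidean distance, which the text does not specify; `detailed_classify` and the sample vectors are hypothetical.

```python
import math

def detailed_classify(feature, reconstructed):
    """Return (character_code, difference) for the reconstructed feature closest to `feature`.

    reconstructed maps each candidate character code to its reconstructed feature vector.
    """
    def difference(code):
        return math.dist(feature, reconstructed[code])

    best_code = min(reconstructed, key=difference)
    return best_code, difference(best_code)

# Hypothetical extracted feature and two reconstructed features.
feature = [0.9, 0.2]
reconstructed = {"A": [0.9, 0.0], "B": [0.0, 0.2]}
code, diff = detailed_classify(feature, reconstructed)
print(code, diff)
```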

(Supplementary Note 20) The character string recognition device according to Supplementary Note 17 or 18, wherein the image reconstruction means normalizes the range of pixel values of the reconstructed image to 0 to 255.
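One way to bring reconstructed pixel values into the range 0 to 255 is a linear min-max rescaling. This is an assumption: the text states only the target range, not the mapping. The function name `normalize_to_byte_range` is hypothetical.

```python
def normalize_to_byte_range(pixels):
    """Linearly rescale pixel values so the smallest maps to 0 and the largest to 255."""
    lo, hi = min(pixels), max(pixels)
    if hi == lo:
        # A flat image has no contrast to stretch; map every pixel to 0.
        return [0 for _ in pixels]
    scale = 255.0 / (hi - lo)
    return [int(round((p - lo) * scale)) for p in pixels]

# A reconstructed image may contain negative or out-of-range values.
normalized = normalize_to_byte_range([-2.0, 0.0, 2.0, 3.0])
print(normalized)
```

Rescaling puts the reconstructed image on the same scale as an 8-bit input image, so that a pixel-wise distance between the two is meaningful.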

(Supplementary Note 21) The character string recognition device according to Supplementary Note 17 or 18, wherein the recognition distance calculation means calculates the distances between the input normalized image and the reconstructed images generated by the image reconstruction means, and outputs the minimum distance as the recognition distance.
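The recognition distance of Supplementary Note 21 is the smallest image-space distance over the reconstructed images. The sketch below assumes a Euclidean distance on flattened pixel vectors; the function name and sample pixel values are hypothetical.

```python
import math

def recognition_distance(image, reconstructed_images):
    """Minimum Euclidean distance between the input image and any reconstructed image."""
    return min(math.dist(image, r) for r in reconstructed_images)

# Hypothetical 3-pixel input image and two reconstructed images.
image = [0, 128, 255]
reconstructed_images = [[0, 120, 255], [10, 128, 250]]
distance = recognition_distance(image, reconstructed_images)
print(distance)
```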

(Supplementary Note 22) The character string recognition device according to any one of Supplementary Notes 17 to 21, wherein the first dictionary is composed of a transformation matrix and an average value of each normalized image.

(Supplementary Note 23) The character string recognition device according to any one of Supplementary Notes 17 to 21, wherein the second dictionary stores the features of each character category.

(Supplementary Note 24) The character string recognition device according to any one of Supplementary Notes 17 to 21, wherein the third dictionary stores a transformation matrix and an average feature vector of each character category.

A diagram showing the principle of the conventional recognition-based segmentation method.
A flowchart showing character recognition using the dual eigenspace-based method.
A diagram showing an example of character recognition using the dual eigenspace.
A flowchart showing the character string recognition method used by the character string recognition device according to an embodiment of the present invention.
A diagram showing an example of the character string recognition method used by the character string recognition device according to an embodiment of the present invention.

Explanation of symbols

1 Left stroke
2 Right stroke

Claims

A character string recognition program for a deteriorated character string, the program causing a computer to execute:
a feature extraction procedure for extracting features from an input normalized image using a first dictionary composed of a transformation matrix and an average value of each normalized image;
a coarse classification procedure for selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary;
a feature reconstruction procedure for constructing a fixed number of reconstructed features using a third dictionary, which stores a transformation matrix and an average feature vector for each character category, and the fixed number of selected character category candidates;
a detailed classification procedure for recognizing a character code according to the features extracted by the feature extraction procedure and the reconstructed features, and outputting the recognized character code;
an image reconstruction procedure for constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features generated by the feature reconstruction procedure;
a recognition distance calculation procedure for calculating and outputting a recognition distance according to the input normalized image and the reconstructed images generated by the image reconstruction procedure; and
a character string recognition procedure for performing character segmentation based on the recognition distance calculated in the recognition distance calculation procedure, and recognizing, as the final recognized character code, the character code calculated in the detailed classification procedure that corresponds to the character segmentation result.

The character string recognition program according to claim 1, wherein the detailed classification procedure compares the differences between the features extracted by the feature extraction procedure and the reconstructed features, and outputs the character code corresponding to the reconstructed feature having the minimum difference as the recognized character code.

The character string recognition program according to claim 1, wherein the image reconstruction procedure normalizes a range of pixel values of the reconstructed image to 0 to 255.

The character string recognition program according to claim 1, wherein the recognition distance calculation procedure calculates the distances between the input normalized image and the reconstructed images generated by the image reconstruction procedure, and outputs the minimum distance as the recognition distance.

The character string recognition program according to any one of claims 1 to 4, wherein the second dictionary stores the features of each character category.

A character string recognition method for a deteriorated character string, comprising:
extracting features from an input normalized image using a first dictionary composed of a transformation matrix and an average value of each normalized image;
selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary;
constructing a fixed number of reconstructed features using a third dictionary, which stores a transformation matrix and an average feature vector for each character category, and the fixed number of selected character category candidates;
recognizing a character code according to the extracted features and the reconstructed features, and outputting the recognized character code;
constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features;
calculating and outputting a recognition distance according to the input normalized image and the reconstructed images; and
performing character segmentation based on the calculated recognition distance, and recognizing, as the final recognized character code, the character code among the recognized character codes that corresponds to the character segmentation result.

A character string recognition device for a deteriorated character string, comprising:
feature extraction means for extracting features from an input normalized image using a first dictionary composed of a transformation matrix and an average value of each normalized image;
coarse classification means for selecting a fixed number of character category candidates by comparing the extracted features with features stored in a second dictionary;
feature reconstruction means for constructing a fixed number of reconstructed features using a third dictionary, which stores a transformation matrix and an average feature vector for each character category, and the fixed number of selected character category candidates;
detailed classification means for recognizing a character code according to the features extracted by the feature extraction means and the reconstructed features, and outputting the recognized character code;
image reconstruction means for constructing a fixed number of reconstructed images using the first dictionary and the reconstructed features generated by the feature reconstruction means;
recognition distance calculation means for calculating and outputting a recognition distance according to the input normalized image and the reconstructed images generated by the image reconstruction means; and
character string recognition means for performing character segmentation based on the recognition distance calculated by the recognition distance calculation means, and recognizing, as the final recognized character code, the character code calculated by the detailed classification means that corresponds to the character segmentation result.