JP2000076378A

JP2000076378A - Character recognizing method

Info

Publication number: JP2000076378A
Application number: JP10242200A
Authority: JP
Inventors: Yoshiko Hozumi; 芳子穂積
Original assignee: Victor Company of Japan Ltd
Current assignee: Victor Company of Japan Ltd
Priority date: 1998-08-27
Filing date: 1998-08-27
Publication date: 2000-03-14

Abstract

PROBLEM TO BE SOLVED: To recognize characters with high precision even from a character image that is small or has luminance unevenness or distortion, and to recognize characters precisely from character images in many typefaces without using a separate dictionary for each typeface. SOLUTION: A recognition processing device 2 to which this method is applied generates contour data from image data input by an image input device 1 such as a digital still camera, normalizes the contour data (including enlargement and reduction), and matches the normalized data against dictionary data prepared by extracting contour data of standard characters and registered in a storage device 4, thereby performing character recognition. Depending on the typeface, distortion, and so on of the input image data, the contour data of the image data is deformed into the typeface covered by the dictionary data, or deformed so that the distortion is corrected. Consequently, even an image that is small or has luminance unevenness or distortion can be recognized with high precision, and character images in many typefaces can be recognized precisely with, for example, a dictionary for only one typeface.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a character recognition method for recognizing characters contained in an image input by a digital camera, a video camera, a scanner, or the like.

[0002]

[Description of the Related Art] Many methods of character recognition are known. For example, some methods recognize the characters of the input data by pattern matching between the input data and dictionary image data, and others recognize them using background features or stroke features of the characters. All of these methods, however, target relatively high-quality, high-resolution images obtained by inputting a document with a scanner or the like.

[0003] Until now, inputting documents with a scanner has been the mainstream approach. Recently, however, with the spread of digital still cameras and video cameras, these cameras are rapidly being adopted as image input devices, and one emerging application is character recognition on the images they capture.

[0004]

[Problems to Be Solved by the Invention] Because conventional character recognition methods target high-quality, high-resolution character images, the recognition rate drops markedly when they are applied unchanged to low-resolution character images with luminance unevenness or distortion, such as those input by a digital camera or a video camera.

[0005] The causes include the failure to obtain the character size needed for recognition because of the low resolution, luminance unevenness and noise specific to camera images, and distortion that depends on the shooting position. Moreover, even when a document is input with a scanner, conventional character recognition methods often cannot recognize small characters.

[0006] In addition, conventional character recognition applications are designed for use on a personal computer or the like. When the characters to be recognized appear in several typefaces, a dictionary is therefore created for each typeface and stored on a large-capacity storage device such as a hard disk.

[0007] However, when character recognition is performed on a device without a large-capacity storage device, such as a portable terminal, the dictionaries for the individual typefaces cannot be held, and the typefaces that can be recognized are limited.

[0008] The present invention has been made to solve the conventional problems described above. Its object is to provide a character recognition method that can recognize characters with high accuracy even in character images that are small or have luminance unevenness or distortion, such as images input with a digital still camera, and that can accurately recognize character images in many typefaces without using a large number of dictionaries for those typefaces.

[0009]

[Means for Solving the Problems] To achieve the above object, a first invention is a character recognition method for recognizing characters in character image data, comprising a step of extracting luminance contour lines of the character image data to obtain contour data of the character image, and a step of collating the obtained contour data with dictionary data to obtain a character recognition result.

[0010] According to the first invention, luminance contour lines of character image data input from, for example, a digital still camera are extracted to obtain contour data of the character image. The contour data is usually normalized, that is, enlarged or reduced, to an appropriate size. The normalized data is then collated with the dictionary data, which is, for example, contour data obtained by extracting luminance contour lines from standard character data. Even if the character image data has low resolution or luminance unevenness, the resulting contour data is well defined, so collation with the dictionary data can be performed strictly and a highly accurate recognition result is obtained. Even when the input character image is small, normalization converts the contour data to an appropriate size before collation, so a highly accurate recognition result is likewise obtained.

[0011] A second invention is characterized in that, in the character recognition method according to claim 1, the contour data is deformed when it is collated with the dictionary data.

[0012] According to the second invention, when the input character image is distorted, for example because of the shooting position of a digital still camera, a deformation that corrects the distortion, such as a coordinate transformation, is applied to the contour data of the character image data to obtain distortion-corrected contour data. Collating the corrected contour data with the dictionary data yields a highly accurate recognition result even for a distorted character image. Likewise, when the typeface of the input character image differs from the standard typeface of the dictionary data, the contour data of the input character image is deformed, for example by coordinate transformation, to match the standard typeface before collation with the dictionary data, again yielding a highly accurate recognition result. It is therefore unnecessary to hold many sets of dictionary data for many typefaces.

[0013] A third invention is characterized in that, in the character recognition method according to claim 2, the degree of deformation of the contour data is changed according to the positions of the plurality of character images corresponding to the contour data.

[0014] According to the third invention, when distorted contour data is deformed and the distortion of the corresponding character images varies with their position, changing the degree of deformation according to the position of each character image allows the distortion of all character images to be corrected properly. A highly accurate recognition result is obtained in this case as well.

[0015] A fourth invention is characterized in that, in the character recognition method according to claim 2 or 3, the deformation is a coordinate transformation applied to the contour data.

[0016] A fifth invention is characterized in that, in the character recognition method according to any one of claims 1 to 4, the collation normalizes the contour data, obtains the distances between corresponding points of the normalized contour data and the dictionary data, and obtains the character recognition result from the sum of these distances.

[0017] According to the fifth invention, the distances between corresponding points of the normalized contour data and the dictionary data are obtained, and the character of the dictionary data with the smallest total distance becomes the first candidate for the recognized character.

[0018] A sixth invention is characterized in that, in the character recognition method according to any one of claims 1 to 4, the collation normalizes the contour data, expands the normalized contour data and the dictionary data into bitmap data, and obtains the character recognition result from the degree of coincidence between the two bitmaps.

[0019] According to the sixth invention, the recognized character is selected by the degree of coincidence between the bitmap data of the contour data of the character image and the bitmap data of the dictionary data. Even if lines in the contour data touch or break, the degree of coincidence is hardly affected as long as the defect is local, so a high character recognition rate can be obtained.

[0020] A seventh invention is characterized in that, in the character recognition method according to any one of claims 1 to 6, the dictionary data is contour data created by extracting luminance contour lines from standard characters.

[0021] An eighth invention is characterized in that, in the character recognition method according to any one of claims 1 to 6, the dictionary data is an outline font of standard characters.

[0022] According to the eighth invention, dictionary data can be obtained easily by using an existing outline font.

[0023] A ninth invention is characterized in that, in the character recognition method according to claim 1, the character recognition result is further collated with a word dictionary to obtain the final character recognition result.

[0024] According to the ninth invention, suppose the character images to be recognized form, for example, an English word, and not every character of the word is recognized correctly as its first candidate. If the correct answer for each character is among, for example, its top three candidates, the correct word can still be obtained by collating the resulting character strings (words) with the word dictionary. The character recognition rate can thus be improved further.
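The word-dictionary collation described above can be illustrated with a minimal Python sketch. The function name `best_word`, the data layout (one ranked candidate list per character position), and the tie-breaking rule (prefer the word whose characters have the lowest summed rank) are assumptions made for illustration; the source only states that candidate strings are collated with a word dictionary.

```python
from itertools import product


def best_word(candidates_per_char, word_dictionary):
    """Return the dictionary word built from per-character candidates.

    candidates_per_char: list of lists; each inner list holds the candidate
    characters for one position, best candidate first (e.g. the top three).
    word_dictionary: set of valid words.
    Assumption: among matching words, the lowest summed rank wins.
    """
    best = None
    best_score = None
    rank_ranges = [range(len(cands)) for cands in candidates_per_char]
    for ranks in product(*rank_ranges):
        word = "".join(c[r] for c, r in zip(candidates_per_char, ranks))
        if word in word_dictionary:
            score = sum(ranks)
            if best_score is None or score < best_score:
                best, best_score = word, score
    return best  # None when no combination is a dictionary word


# The first candidates spell "cal", which is not a word; the third
# character's second candidate "t" yields the dictionary word "cat".
candidates = [["c", "e", "o"], ["a", "o", "u"], ["l", "t", "i"]]
print(best_word(candidates, {"cat", "eat"}))
```

Enumerating every combination grows exponentially with word length, so a practical implementation would likely prune the search; the exhaustive loop is kept here only for clarity.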

[0025]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing an embodiment of a character recognition device to which the character recognition method of the present invention is applied. An image input device 1, such as a scanner, a digital still camera, or a video camera combined with a capture device, inputs image data of a subject to a recognition processing device 2. The recognition processing device 2 is also connected to an input device 3, such as a mouse or a keyboard, for entering user instructions. It performs character recognition using a storage device 4 that stores dictionary data, programs, and the like, and displays the result on an image display device 5 or outputs it to an output device 6 such as a printer.

[0026] The character recognition device described above can be configured from, for example, a personal computer and its peripheral devices.

[0027] Next, the operation of this embodiment is described. First, the dictionary data stored in the storage device 4 is created according to the flowchart shown in FIG. 2 and registered in the storage device 4. That is, in step 201, luminance contour lines are extracted from the image data of standard characters to create contour data, and in step 202 this contour data is registered in the storage device 4 as dictionary data. Outline font data that has already been created as character font data may also be used as the dictionary data.

[0028] Next, the operation by which the recognition processing device 2 shown in FIG. 1 performs character recognition using the dictionary data created as above is described with reference to the flowchart shown in FIG. 3. First, in step 301, character image data input by the image input device 1, such as a digital still camera or a scanner, is saved in the storage device 4. Next, in step 302, the recognition processing device 2 extracts luminance contour lines from this character image data to obtain the contour data of the characters (also referred to as character data). Because one character may consist of several contour lines, in such cases the contours are merged according to their positional relationship to form the data for one character.

[0029] The method of extracting the luminance contour lines is now described. Luminance contour lines are extracted by a method that extends contour extraction for binary images, operating on a luminance distribution such as the one shown in FIG. 4. When such a luminance distribution is binarized with a certain threshold, the contour is obtained by tracing the boundary between white pixels and black pixels, as shown in FIG. 5. As indicated by the thick solid line P in FIG. 5, the contour always forms a closed loop. Because the given threshold lies between the pixel values on either side of the boundary, the luminance contour line is obtained by shifting the position of this boundary according to the pixel values on both sides.

[0030] If the threshold is set to 200 for the luminance distribution shown in FIG. 4, the boundary line P shown in FIG. 5 is obtained. Shifting this boundary line P by a proportional calculation based on the pixel values on both sides yields the smoother luminance contour line shown as Q in FIG. 5. FIG. 6 shows how the contour is obtained by this proportional calculation for portion A of FIG. 5.
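The proportional calculation can be read as linear interpolation between the two pixel values that straddle the threshold. The following Python sketch illustrates that reading; the function names and the convention of measuring the offset from the center of the first pixel toward the center of the second are assumptions, since FIG. 6 is not reproduced here.

```python
def contour_offset(value_a, value_b, threshold):
    """Fraction of the way from pixel A's center to pixel B's center at which
    the interpolated luminance equals the threshold (linear interpolation).
    Assumes the threshold lies between the two pixel values."""
    if value_a == value_b:
        raise ValueError("pixel values must differ across the boundary")
    fraction = (threshold - value_a) / (value_b - value_a)
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("threshold is not between the two pixel values")
    return fraction


def contour_point(center_a, center_b, value_a, value_b, threshold):
    """Sub-pixel contour point on the segment joining two pixel centers."""
    t = contour_offset(value_a, value_b, threshold)
    return (center_a[0] + t * (center_b[0] - center_a[0]),
            center_a[1] + t * (center_b[1] - center_a[1]))


# Threshold 200 between a pixel of value 220 and a neighbor of value 120:
# the contour lies 20% of the way from the brighter pixel to the darker one.
print(contour_point((3.0, 5.0), (4.0, 5.0), 220, 120, 200))
```

With equal distances to the threshold on both sides (for example 250 and 150), the contour point falls exactly on the pixel boundary, which matches the binary-image boundary P.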

[0031] However, where the boundary line Q bends relative to the direction of travel and two shifted points come close together, the midpoint of the two points is taken as the contour point. The threshold used in the above processing may be specified as any suitable value, or it may be determined automatically from the density histogram of the image by a method such as discriminant analysis. Because the contour extracted in this way is jagged, which adversely affects collation, smoothing is applied next. In smoothing, the coordinate value of each point is replaced by the average of the coordinate values of several neighboring points before and after it.
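The smoothing step can be sketched in Python as a moving average over a closed contour. The window size, and the choice to include the point itself in the average, are assumptions; the source only says that each point is replaced by the average of several neighboring points.

```python
def smooth_closed_contour(points, half_window=2):
    """Replace each point of a closed contour by the mean of itself and
    `half_window` neighbors on each side, wrapping around the loop.

    points: list of (x, y) tuples in contour order.
    """
    n = len(points)
    count = 2 * half_window + 1
    smoothed = []
    for i in range(n):
        sum_x = 0.0
        sum_y = 0.0
        for k in range(-half_window, half_window + 1):
            x, y = points[(i + k) % n]  # modulo keeps the contour closed
            sum_x += x
            sum_y += y
        smoothed.append((sum_x / count, sum_y / count))
    return smoothed


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(smooth_closed_contour(square, half_window=1))
```

Because the contour is a closed loop, the neighbors of the first point include the last points of the list, so no special handling is needed at the ends.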

[0032] An example of extracted character data is shown in FIG. 7. FIG. 7(A) shows the input character image data, FIG. 7(B) shows the result of contour extraction from the input character image data using luminance contour lines, and FIG. 7(C) shows the character data (contour data) obtained by smoothing the contour extraction result of FIG. 7(B).

[0033] The recognition processing device 2 sequentially collates the character data obtained as above with the dictionary data in the storage device 4. Before doing so, it normalizes the input character data in step 303. As shown in FIG. 8(B), this normalization scales each character vertically and horizontally according to the size of its circumscribed rectangle so that it has the same size as the dictionary data shown in FIG. 8(A).
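A minimal Python sketch of this normalization follows, assuming that a character is held as a list of contours, each a list of (x, y) points, and that the dictionary size is passed in as a target width and height. The function name and the handling of degenerate (zero-width or zero-height) characters are illustrative assumptions.

```python
def normalize_contours(contours, target_width, target_height):
    """Scale all contours of one character so that their common circumscribed
    rectangle becomes target_width x target_height with its corner at (0, 0).
    Horizontal and vertical scale factors are computed independently."""
    xs = [x for contour in contours for x, _ in contour]
    ys = [y for contour in contours for _, y in contour]
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    width = max_x - min_x
    height = max_y - min_y
    # Assumption: a degenerate extent is left unscaled rather than divided by zero.
    scale_x = target_width / width if width else 1.0
    scale_y = target_height / height if height else 1.0
    return [[((x - min_x) * scale_x, (y - min_y) * scale_y) for x, y in contour]
            for contour in contours]


char = [[(10, 20), (14, 20), (14, 28), (10, 28)]]
print(normalize_contours(char, 32, 32))
```

Because the scaling acts on contour coordinates rather than pixels, enlarging a small character does not introduce the blockiness that scaling a raster image would.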

[0034] Thereafter, in step 304, it is determined whether the normalized character data needs to be further deformed using coordinate transformation or the like; if so, the data is deformed in step 305. The need for this deformation is judged in step 304 from the typeface of the input character image data, the presence or absence of distortion, and so on, as described later. If no deformation is needed, the process proceeds directly to step 306, where the character data is collated with the dictionary data. In step 307, the recognition processing device 2 outputs the character recognition result obtained by this collation to the image display device 5.

[0035] The collation methods fall broadly into two types. One calculates the distances between the coordinate values of the contour data that constitutes the character data. The other expands the contour data into bitmaps and calculates the degree of coincidence between the bitmaps.

[0036] First, the method of calculating the distances between coordinate values is described. In this method, the points of the contour data are paired in order along the contour, and the distance between corresponding points is obtained. The start points of the contours must first be aligned. This is achieved by searching for the start point during contour extraction with a zigzag scan from the lower left, as shown in FIG. 9(C), so that the lower-left point of the character always becomes the start point. FIG. 9(B) shows an ordinary scan for comparison.

[0037] For example, for the character "m" shown in FIG. 9(A), an ordinary horizontal scan from the lower left cannot guarantee which of the points A, B, and C becomes the start point, whereas the zigzag scan always yields A, so no further start-point alignment is needed.

[0038] When establishing the correspondence between contour data such as that shown in FIG. 10(B) and dictionary data such as that shown in FIG. 10(A), feature points on the contour, here corner points, are first matched as corresponding points (the points marked with the same numbers) so that the correspondence does not drift with the width or height of the character. Corresponding points for the remaining points are then determined. Corner points are identified from the coordinate values as points where a coordinate value reaches a local extremum or where the direction of the contour changes sharply.

[0039] The recognition processing device 2 obtains the distances between the corresponding points determined in this way, as shown in FIG. 10(C), and takes their sum as the distance between the dictionary data and the character data. In step 307, it outputs the recognition results in ascending order of this distance. When one character consists of several contour lines, the distance between the contour data is obtained for each contour, and these distances are summed to give the distance for the character.
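The distance-based collation can be sketched in Python as follows. The sketch assumes that corresponding points have already been established, so that each input contour and its dictionary counterpart are lists of equal length in matching order; the function names, the use of Euclidean distance, and the treatment of a contour-count mismatch as an infinite distance are illustrative assumptions rather than details given in the source.

```python
import math


def contour_distance(points_a, points_b):
    """Sum of Euclidean distances between corresponding points."""
    if len(points_a) != len(points_b):
        raise ValueError("corresponding point lists must have equal length")
    return sum(math.dist(p, q) for p, q in zip(points_a, points_b))


def character_distance(char_contours, dict_contours):
    """Total distance over all contours of a character.
    Assumption: a different number of contours means no correspondence."""
    if len(char_contours) != len(dict_contours):
        return math.inf
    return sum(contour_distance(a, b)
               for a, b in zip(char_contours, dict_contours))


def rank_candidates(char_contours, dictionary):
    """Dictionary labels sorted by ascending distance (best candidate first)."""
    return sorted(dictionary,
                  key=lambda label: character_distance(char_contours,
                                                       dictionary[label]))


dictionary = {"a": [[(0, 0), (1, 0)]], "b": [[(0, 0), (5, 0)]]}
print(rank_candidates([[(0, 0), (1, 1)]], dictionary))
```

The infinite distance for mismatched contour counts makes visible the weakness of this method when strokes touch or break.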

[0040] Next, the method of expanding the contour data into bitmaps is described. In the preceding method, if the number of contour lines or the order of the contour points differs, corresponding points with the dictionary data cannot be found, and the correct answer may not be obtained. For example, when the input image data is "Explanatory" as shown in FIG. 11(A), a contour extraction result such as that shown in FIG. 11(B) is obtained, but part of the "a" may touch itself. In that case the dictionary data has two contour lines, as shown in FIG. 11(D), whereas the character data has three contour lines, as shown in FIG. 11(C), and their paths also differ, so the recognition result "a" is not obtained. Lines tend to touch, or conversely to break, particularly when the image has luminance unevenness, so the character recognition rate of the preceding method falls.

[0041] For example, in Japanese Patent Publication No. Hei 6-162264, the character pattern is deformed before matching in order to cope with breaks, but sufficient recognition results are not obtained when there is local luminance unevenness or noise, as in input from a digital camera.

[0042] Therefore, a method is adopted in which the contour data is expanded into a bitmap of fixed size and the degree of coincidence between bitmaps is calculated. With this method, the shape of the character itself can be collated without being greatly affected by touching or broken lines. Because the input image data has been converted to vector data by contour extraction, a bitmap image of any size can be created by coordinate transformation. Compared with the common approach of enlarging or reducing the input image directly, this produces a higher-quality bitmap image and therefore raises the character recognition rate.
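One way to expand contour data into a bitmap is to test each pixel center against the contours with the even-odd rule, so that inner contours (holes) come out white. The Python sketch below takes this approach as an assumption; the source does not state which fill rule or rasterization procedure is used. The contours are assumed to be already scaled into the range 0 to `size` on both axes.

```python
def point_in_contours(px, py, contours):
    """Even-odd test: cast a ray to the right of (px, py) and toggle on each
    crossing with any contour edge. Holes therefore count as outside."""
    inside = False
    for contour in contours:
        n = len(contour)
        for i in range(n):
            x1, y1 = contour[i]
            x2, y2 = contour[(i + 1) % n]  # closed loop: last point joins first
            if (y1 > py) != (y2 > py):  # edge straddles the ray's height
                x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                if x_cross > px:
                    inside = not inside
    return inside


def rasterize(contours, size=32):
    """Return a size x size bitmap (list of rows); 1 = black, 0 = white."""
    return [[1 if point_in_contours(col + 0.5, row + 0.5, contours) else 0
             for col in range(size)]
            for row in range(size)]


outer = [(0, 0), (4, 0), (4, 4), (0, 4)]
hole = [(1, 1), (3, 1), (3, 3), (1, 3)]
for row in rasterize([outer, hole], size=4):
    print(row)
```

Testing every pixel against every edge is slow for large bitmaps but adequate for small sizes such as 32 x 32; a scanline fill would be the usual optimization.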

[0043] FIG. 12(A) shows an example of the bitmap expansion described above, in this case at a size of 32 × 32. The dictionary data is expanded into bitmaps in the same way, as shown in FIG. 12(B). The number of pixels whose white/black values do not match between the two bitmaps is taken as the distance, and in step 307 the candidates with small distances are output as the recognition result.
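The bitmap distance described here is a count of mismatching pixels (a Hamming distance). A minimal Python sketch, assuming bitmaps are lists of rows of 0/1 values of identical size and using hypothetical function names:

```python
def bitmap_distance(bitmap_a, bitmap_b):
    """Number of pixel positions where the two bitmaps differ."""
    return sum(pixel_a != pixel_b
               for row_a, row_b in zip(bitmap_a, bitmap_b)
               for pixel_a, pixel_b in zip(row_a, row_b))


def rank_by_bitmap(input_bitmap, dictionary_bitmaps):
    """Dictionary labels sorted by ascending mismatch count."""
    return sorted(dictionary_bitmaps,
                  key=lambda label: bitmap_distance(input_bitmap,
                                                    dictionary_bitmaps[label]))


dictionary_bitmaps = {
    "L": [[1, 0], [1, 1]],
    "I": [[1, 0], [1, 0]],
}
print(rank_by_bitmap([[1, 0], [1, 1]], dictionary_bitmaps))
```

A local defect such as a touching stroke changes only a few pixels, so it raises the distance slightly instead of breaking the correspondence altogether.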

[0044] The size of the expanded bitmap image is arbitrary, and a suitable value can be specified according to the capacity of the memory and the storage device 4, the performance of the CPU, and so on.

[0045] The dictionary data need not be expanded into bitmaps for every input character. Expanding it once, when the contours of the dictionary data are read in, shortens the calculation time.

[0046] With this method, however, vertically elongated characters such as the letters l and i become almost entirely black bitmaps after normalization, so characters with a large black area are misrecognized as these characters. To avoid this, the vertically elongated characters and the remaining characters are grouped separately in advance, and recognition is performed after selecting the group according to the aspect ratio of the input character.
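The group selection by aspect ratio can be sketched in Python as below. The threshold of 3.0, the group names, and the function names are hypothetical values chosen for illustration; the source gives no numeric criterion.

```python
def select_group(width, height, tall_ratio=3.0):
    """Classify a character by the aspect ratio of its circumscribed rectangle,
    measured before normalization. tall_ratio is an assumed threshold."""
    if width <= 0:
        return "tall"  # a zero-width stroke is treated as vertically elongated
    return "tall" if height / width >= tall_ratio else "regular"


def candidates_for(width, height, groups, tall_ratio=3.0):
    """Return only the dictionary labels in the group matching the input."""
    return groups[select_group(width, height, tall_ratio)]


groups = {"tall": ["l", "i", "I", "1"], "regular": ["a", "m", "E", "x"]}
print(candidates_for(4, 20, groups))   # narrow input: tall group only
print(candidates_for(10, 12, groups))  # ordinary input: regular group only
```

The aspect ratio has to be measured on the circumscribed rectangle before normalization, because normalization stretches every character to the same size and discards that information.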

[0047] Groups can also be defined according to the position of the character. For example, the letters q, j, and p have a portion that lies below the baseline. Detecting this from the coordinate values and narrowing down the groups accordingly raises the recognition rate further.

[0048] Next, the method of handling character images in many typefaces is described, taking English letters as an example. Typefaces differ in various ways, including line thickness, the difference between vertical and horizontal line widths, decorative parts, and slant. In general, as in Japanese Patent Publication No. Hei 5-159107, many typefaces are handled by preparing a dictionary for each typeface of the characters to be recognized, but constraints on memory size or CPU performance may make it impossible to hold a large amount of data. A method is therefore needed that recognizes many typefaces with a dictionary for only one standard typeface, or at most two or three.

[0049] In this example, therefore, the contour data obtained from the input character image data is deformed by coordinate transformation in step 305, which makes it possible to recognize characters in many typefaces with a dictionary for one typeface, or at most two or three.

[0050] For example, suppose that contour data of a standard typeface such as that shown in FIG. 13(A) is registered in the storage device 4 as dictionary data. This is a simple, undecorated typeface that expresses the shape features of the letters, with equal vertical and horizontal line widths. The operation when a typeface such as that shown in FIG. 13(B) is given as the input character image from the image input device 1 is described below. Because the input character image is italic, the recognition processing device 2 first applies a coordinate transformation that sets it upright, as shown in FIG. 13(C), and then applies a transformation that thins the vertical and horizontal line widths to match the dictionary data, as shown in FIG. 13(D).

[0051] Here, the input character image is made upright by a coordinate transformation as shown in FIG. 14(A), and the line width is corrected by moving coordinate points as shown in FIG. 14(B). When the vertical and horizontal line widths differ, changing the ratio of vertical to horizontal movement allows transformations such as thickening only the horizontal lines. The decorative parts of the input image also need to be removed, but identifying them from their shape and removing them requires a great deal of processing.
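The upright correction and line-width correction described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the names `deslant` and `offset_contour`, the representation of a contour as a closed list of (x, y) points in counter-clockwise order with y measured upward, the use of a shear for the transformation of FIG. 14(A), and the use of a normal offset for the point movement of FIG. 14(B) are all hypothetical choices.

```python
import math

def deslant(contour, slant_deg):
    """Shear the contour so strokes leaning slant_deg to the right become upright.

    Assumes y is measured upward from the baseline (hypothetical convention).
    """
    k = math.tan(math.radians(slant_deg))
    return [(x - k * y, y) for x, y in contour]

def offset_contour(contour, dx, dy):
    """Move each vertex of a closed, counter-clockwise contour along its inward normal.

    dx scales the horizontal movement (changes the width of vertical strokes) and
    dy scales the vertical movement (changes the width of horizontal strokes).
    Positive values thin the strokes, negative values thicken them.
    """
    n = len(contour)
    moved = []
    for i, (x, y) in enumerate(contour):
        px, py = contour[i - 1]            # previous vertex (wraps around)
        qx, qy = contour[(i + 1) % n]      # next vertex (wraps around)
        tx, ty = qx - px, qy - py          # tangent estimate at this vertex
        length = math.hypot(tx, ty) or 1.0
        nx, ny = -ty / length, tx / length  # left normal, which points inward for CCW order
        moved.append((x + dx * nx, y + dy * ny))
    return moved
```

Calling `offset_contour(contour, 1.0, 0.0)` would thin only the vertical strokes, while `offset_contour(contour, 0.0, -1.0)` would thicken only the horizontal ones, which corresponds to changing the ratio of vertical to horizontal movement.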

[0052] Attention is therefore paid to the end portions of the character image, which adversely affect recognition. If a character image with decoration at its ends is normalized as it is, the positions of the vertical lines shift and many portions fail to match the dictionary. For this reason, as shown in FIG. 14(C), a special coordinate transformation that increases the amount of movement at the end portions of the input image is performed, so that the decoration at the ends of the character image is apparently eliminated, as shown in FIG. 13(E). Because all of the above processing is coordinate transformation of the contour coordinate data, it has the advantage of requiring less processing time than performing the same processing on a bitmap.
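One way to picture a transformation that moves points more strongly near the ends of the character image is the sketch below. It is an assumption-laden illustration, not the mapping of FIG. 14(C): the function name `squash_ends` and the parameters `margin` (width of each end zone) and `strength` (fraction by which the end zone is compressed) are hypothetical, and the piecewise-linear mapping is chosen only for simplicity.

```python
def squash_ends(contour, margin, strength):
    """Compress the left and right end zones of a contour toward the interior.

    Points in the central region are left unchanged. A point inside an end zone
    is pulled toward the zone's inner boundary, and the displacement grows with
    its distance from that boundary, so the outermost points move the most.
    Assumes 2 * margin is smaller than the contour width and 0 <= strength <= 1.
    """
    xs = [x for x, _ in contour]
    lo = min(xs) + margin   # inner boundary of the left end zone
    hi = max(xs) - margin   # inner boundary of the right end zone
    out = []
    for x, y in contour:
        if x < lo:
            x = lo - (lo - x) * (1.0 - strength)
        elif x > hi:
            x = hi + (x - hi) * (1.0 - strength)
        out.append((x, y))
    return out
```

After such a step, a subsequent size normalization would be driven less by the decorative ends, so the vertical strokes would tend to land closer to their positions in the dictionary data.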

[0053] The recognition processing device 2 decides which coordinate transformation to perform by obtaining shape features such as the angle of the maximum length of the input image, the vertical and horizontal projection distributions of luminance, and the luminance distribution at the periphery of the character, and then judging which of the assumed typefaces the input corresponds to.

[0054] Next, the deformation (step 305) for correcting distortion in the input image will be described. Conventional character recognition was mostly designed on the assumption that paper is read with a scanner or that an image at a predetermined position is input with a fixed camera. It did not consider cases where the input device, such as a digital camera, has inherent image distortion, or where the input image is distorted because the subject was photographed from an oblique position. In such cases the distortion of the input image must be corrected, and this is also done by coordinate transformation, as in the typeface handling described above.

[0055] For example, when a character image as shown in FIG. 15(A) is input from the image input device 1 to the recognition processing device 2, the contour extraction result for this input character image is as shown in FIG. 15(B): because the character board was photographed from above, the image is distorted as if its lower part had shrunk, and the distortion differs between the right and left sides. When this distortion is small, the correct result is often obtained even if the data is collated with the dictionary data as it is, but when the distortion is large it must be corrected. Here, by performing a coordinate transformation in which the amount of slant correction is varied according to the position within the character image, the shape can be corrected to one close to the standard characters, as shown in FIG. 15(C).
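A position-dependent slant correction of this kind can be sketched as follows. This is a simplified illustration and not the transformation of the embodiment: the function name `correct_keystone` and its parameters are hypothetical, the left and right edges of the distorted image are assumed to be straight lines with slants `k_left` and `k_right` (horizontal shift per unit of depth below the top edge), y is assumed to be measured upward, and only horizontal distortion is corrected.

```python
def correct_keystone(contour, x_left, x_right, y_top, k_left, k_right):
    """Straighten a trapezoidally distorted contour.

    At depth d below the top edge, the left edge is assumed to lie at
    x_left + k_left * d and the right edge at x_right + k_right * d.
    Each point keeps its relative position t between the two edges, which is
    equivalent to removing a slant interpolated between k_left and k_right,
    so the correction amount depends on where the point lies in the image.
    """
    out = []
    for x, y in contour:
        d = y_top - y                       # depth below the top edge
        left = x_left + k_left * d
        right = x_right + k_right * d
        t = (x - left) / (right - left)     # relative horizontal position, 0..1
        out.append((x_left + t * (x_right - x_left), y))
    return out
```

For instance, with a top edge spanning x = 0 to 10 and a bottom edge (10 units lower) spanning x = 2 to 9, the slants would be `k_left = 0.2` and `k_right = -0.1`, and points on the slanted side edges would be moved back onto the vertical lines x = 0 and x = 10.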

[0056] In character recognition generally, a known method corrects the recognition results obtained for individual characters by using information about words and idioms to arrive at the correct result. For example, in recognizing English words, even if a character is misrecognized on its own, a word dictionary makes it possible to obtain the correct word from the correctly recognized characters around it. For this purpose, rather than outputting only one recognition result, a plurality of recognition results must be output together with their certainty.

[0057] In this example too, for example three candidate characters may be output as the recognition result and collated with a word dictionary to obtain the correct result. With this combination, as long as the correct character is among, for example, the top three candidates, the correct word can be obtained even if other characters are misrecognized. When searching for a word, candidate words may also be retrieved from the dictionary and an evaluation value computed for each using the certainty described above to obtain the recognition result.
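The combination with a word dictionary can be sketched as below. This is a minimal illustration under assumptions: the function name `best_word`, the data layout (for each character position, a list of up to three `(character, certainty)` pairs), and the evaluation value (the sum of the certainties of the candidates that agree with the word) are hypothetical, since the text does not specify how the evaluation value is computed.

```python
def best_word(candidates, words):
    """Pick the dictionary word best supported by per-character candidates.

    candidates: one list per character position, each holding
                (character, certainty) pairs, e.g. the top three results.
    words:      iterable of dictionary words.
    Returns (word, score), or (None, -inf) if no word has a matching length.
    """
    best, best_score = None, float("-inf")
    for word in words:
        if len(word) != len(candidates):
            continue
        # Add the certainty of the candidate that matches each letter, or 0.0
        # when the letter is not among the candidates for that position.
        score = sum(dict(pos).get(ch, 0.0) for pos, ch in zip(candidates, word))
        if score > best_score:
            best, best_score = word, score
    return best, best_score

# Hypothetical recognition output: the first-ranked letters spell "eat",
# which is not in the word list, but "c" is among the top three for position 1.
cands = [
    [("e", 0.5), ("c", 0.4), ("o", 0.1)],
    [("a", 0.9), ("u", 0.05), ("n", 0.05)],
    [("t", 0.8), ("f", 0.1), ("l", 0.1)],
]
word, score = best_word(cands, ["cat", "car", "cot", "dog"])
```

Here `word` would be `"cat"`, because it is the dictionary word whose letters collect the largest total certainty.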

[0058] Finally, FIG. 16 shows the overall processing flow that combines the basic character recognition processing of this example described above with the deformation processing for handling multiple typefaces and distortion of the input character data.

[0059] According to this embodiment, contour data obtained by extracting the contours of character image data, such as image data input from a digital still camera, is normalized (the size of the input character image is enlarged or reduced) and collated with dictionary data obtained by extracting the contours of standard characters. This reduces misrecognition and enables highly accurate character recognition even for poor-quality character images that have low resolution and uneven luminance, and for small character images. Furthermore, when the input character image is distorted, character recognition is performed after the image is deformed to correct the distortion, so highly accurate character recognition is possible even for distorted character image data.

[0060] Furthermore, when recognizing character images in many typefaces, the character image is first deformed into a character image in a recognizable typeface and the character recognition described above is then performed. Character images in many typefaces can therefore be recognized with high accuracy by preparing dictionaries for only one typeface, or two or three typefaces. Consequently, even a device that cannot be equipped with a large-capacity storage device, such as a portable terminal, can recognize many kinds of character images with high accuracy.

【００６１】[0061]

[Effects of the Invention] As described in detail above, the character recognition method of the present invention can recognize characters with high accuracy even from character images input by a digital still camera or the like that are small or have uneven luminance or distortion, and can accurately recognize character images in many typefaces without using many dictionaries corresponding to those typefaces.

[Brief description of the drawings]

FIG. 1 is a block diagram showing an embodiment of a character recognition device to which the character recognition method of the present invention is applied.

FIG. 2 is a flowchart showing the process of creating the dictionary data registered in the storage device shown in FIG. 1.

FIG. 3 is a flowchart showing the character recognition procedure performed by the recognition processing device shown in FIG. 1.

FIG. 4 is a diagram showing an example of a luminance distribution obtained by extracting luminance contour lines from input image data.

FIG. 5 is a diagram showing a method of creating a contour from the luminance distribution shown in FIG. 4.

FIG. 6 is a diagram explaining details of the contour creation method shown in FIG. 5.

FIG. 7 is a diagram showing an example of an input image, its contour extraction result, and the smoothed result.

FIG. 8 is a diagram explaining a method of normalizing input data.

FIG. 9 is a diagram explaining how input data is scanned.

FIG. 10 is a diagram showing how the distance between corresponding points of input image data (contour data) and dictionary data is obtained.

FIG. 11 is a diagram explaining input data for which bitmap expansion is suitable, its contour extraction result, and the associated problems.

FIG. 12 is a diagram showing an example of bitmap expansion of input image data (contour data) and dictionary data.

FIG. 13 is a diagram explaining a method of deforming input image data (contour data) according to typeface.

FIG. 14 is a diagram explaining the deformations used in FIG. 13.

FIG. 15 is a diagram explaining a method of deforming distorted input image data (contour data).

FIG. 16 is a flowchart integrating the flow of character recognition and the various related processes performed by the device shown in FIG. 1.

[Explanation of symbols]

1 Image input device
2 Recognition processing device
3 Input device
4 Storage device
5 Image display device
6 Output device

Claims

[Claims]

1. A character recognition method for recognizing characters in character image data, comprising the steps of: extracting luminance contours of the character image data to obtain contour data of the character image; and collating the contour data with dictionary data to obtain a character recognition result.

2. The character recognition method according to claim 1, wherein the contour data is deformed when the contour data is compared with dictionary data.

3. The character recognition method according to claim 2, wherein the degree of deformation of the contour data is changed according to positions of a plurality of character images corresponding to the contour data.

4. The character recognition method according to claim 2, wherein the deformation is performed by applying a coordinate transformation to the contour data.

5. The character recognition method according to claim 1, wherein, in the collation, the contour data is normalized, distances between corresponding points of the normalized contour data and the dictionary data are then obtained, and a character recognition result is obtained based on the sum of the distances.

6. The character recognition method according to claim 1, wherein, in the collation, the contour data is normalized, the normalized contour data and the dictionary data are then each expanded into bitmap data, and a character recognition result is obtained based on the degree of coincidence between the two sets of bitmap data.

7. The character recognition method according to claim 1, wherein the dictionary data is contour data created by extracting luminance contours from standard characters.

8. The character recognition method according to claim 1, wherein the dictionary data is an outline font of a standard character.

9. The character recognition method according to claim 1, wherein the character recognition result is further collated with a word dictionary to obtain a final character recognition result.