JPH05143776A

JPH05143776A - Character recognizing device

Info

Publication number: JPH05143776A
Application number: JP3304400A
Authority: JP
Inventors: Ayumi Tachibana (橘 亜由美)
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1991-11-20
Filing date: 1991-11-20
Publication date: 1993-06-11

Abstract

PURPOSE: To provide a character recognition device that segments and recognizes characters accurately and in a short time even when the character width varies because special characters such as alphabetic characters are mixed in with general characters of uniform width such as kanji (Chinese characters) and kana (Japanese syllabary). CONSTITUTION: A circumscribed rectangle extraction part 2 extracts the circumscribed rectangles of black-pixel connected components from an image input from an input part 1. A basic rectangle creation part creates basic rectangles by merging those extracted circumscribed rectangles that overlap in the vertical direction. Based on the shape and position of the basic rectangles, a character string discriminating part 6 determines whether a character string is a general or a special character string. For a general character string, basic rectangles are merged on the basis of the estimated standard character width and then recognized. For a special character string, the basic rectangles are recognized individually without merging. The repetition of segmentation and recognition is thereby kept to a minimum, which improves segmentation accuracy and shortens processing time.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device that reads a document as image data from an image reading device such as a scanner, cuts out character patterns one character at a time, and recognizes them.

[0002]

[Prior Art] In general, characters in Japanese documents, such as kanji and kana, have uniform widths. Exploiting this, conventional devices estimated a standard character width from the input character image data, used the estimated width to cut out character patterns one character at a time by means of circumscribed rectangles or the like, and recognized the character pattern in each cut-out circumscribed rectangle. When the recognition result was rejected because no matching character was found, the character was cut out again with a different position or unit and recognized again, and this was repeated until a recognition result was accepted.

[0003]

[Problems to Be Solved by the Invention] However, in a document that mixes in characters other than kana and kanji, such as alphanumeric characters, the character width varies, so cut-out and recognition must be repeated many times, which causes cut-out errors and increased processing time.

[0004] In view of the above problem, the present invention aims to provide a character recognition device that performs high-speed character recognition of character strings in which non-Japanese characters are mixed.

[0005]

[Means for Solving the Problems] To achieve the above object, a character recognition device that cuts out input character image data in predetermined units and recognizes it comprises: circumscribed rectangle extraction means for extracting a circumscribed rectangle for each black-pixel connected component of the character image data; basic rectangle creation means for merging the extracted circumscribed rectangles that overlap in the vertical direction to create basic rectangles; and character string determination means for determining, from the shape and position of the basic rectangles, whether a character string is a general character string or a special character string made up of characters other than kana and kanji, such as alphabetic characters. Cut-out and recognition are performed according to the character string type determined by the character string determination means.

[0006]

[Operation] The character recognition device of the present invention configured as above extracts circumscribed rectangles from the character image data in the smallest units of connected black pixels, merges neighboring circumscribed rectangles into basic rectangles, and uses the character string determination means to sort character strings into general character strings and special character strings other than kana and kanji on the basis of the size and position of the basic rectangles, such as their width and location. The cut-out range of each character is then determined, for example by merging basic rectangles, according to the nature of the character string thus sorted, and character recognition is performed. Repetition of character recognition is therefore kept to a minimum, and improved cut-out accuracy and shorter processing time can be achieved.

[0007]

[Embodiment] FIG. 1 is an overall block diagram of a character recognition device according to an embodiment of the present invention. Its components are as follows: 1 is an input unit that inputs image data from an image reading device such as a scanner; 2 is a circumscribed rectangle extraction unit, serving as the circumscribed rectangle extraction means, that extracts the circumscribed rectangles of black-pixel connected components from the input image data; 3 is a basic rectangle creation unit, serving as the basic rectangle creation means, that merges circumscribed rectangles overlapping in the vertical direction to create basic rectangles; 4 is a basic rectangle integration unit that merges basic rectangles and cuts out character patterns; 5 is a character recognition unit that recognizes the cut-out character patterns; 6 is a character string determination unit that, in the processing from the basic rectangle integration unit 4 to the character recognition unit 5, determines from the width and position of the basic rectangles whether a character string is a general or a special character string; and 7 is an output unit that outputs the recognized character codes obtained by the character recognition unit 5.

[0008] The interrelated operation of the components of the character recognition device of this embodiment, configured as above, is described below with reference to the overall flowchart shown in FIG. 2. In step s1, image data is input from the image reading device. In step s2, the circumscribed rectangle extraction unit 2 extracts the circumscribed rectangles of the black-pixel connected components of the input image data. For the character string image data shown in FIG. 6a, circumscribed rectangles such as those in FIG. 6b are extracted. Here each black-pixel connected component has one circumscribed rectangle; for example, the character "べ" has separate circumscribed rectangles for the "へ" part and for the dakuten (voicing mark).
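Step s2 can be illustrated with a minimal sketch, assuming the image is a list of rows in which 1 marks a black pixel. The function name `bounding_boxes`, the (left, top, right, bottom) tuple layout, and the use of 8-connectivity are assumptions for illustration; the text does not state which connectivity the extraction unit 2 uses.

```python
from collections import deque


def bounding_boxes(image):
    """Return one (left, top, right, bottom) box per black-pixel connected component.

    image: list of equal-length rows, 1 = black pixel, 0 = white pixel.
    8-connectivity is an assumption; the source does not specify it.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over one connected component.
            seen[y][x] = True
            queue = deque([(x, y)])
            left = right = x
            top = bottom = y
            while queue:
                cx, cy = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((left, top, right, bottom))
    return boxes
```

A character such as "べ" would yield two boxes here, one for the body and one for the voicing mark, because the two parts are not connected.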

[0009] In step s3, a character line is cut out on the basis of the extracted circumscribed rectangles. In step s4, within the cut-out character line, the basic rectangle creation unit 3 merges circumscribed rectangles that overlap in the vertical direction. A merged circumscribed rectangle is called a basic rectangle. For the circumscribed rectangles shown in FIG. 6b, basic rectangles such as those in FIG. 6c are created. In the example above, a single basic rectangle is created for the character "べ". The basic rectangles are labeled with the basic rectangle codes a to z and A to E.
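The merging in step s4 can be sketched as an interval merge along the horizontal axis. This is a minimal sketch under assumptions: "overlapping in the vertical direction" is read as two rectangles sharing at least one pixel column, and `make_basic_rectangles` and the (left, top, right, bottom) layout are hypothetical names, not taken from the source.

```python
def make_basic_rectangles(boxes):
    """Merge circumscribed rectangles whose horizontal extents overlap.

    boxes: list of (left, top, right, bottom) with inclusive coordinates.
    Returns the merged basic rectangles ordered from left to right.
    """
    merged = []
    for left, top, right, bottom in sorted(boxes):
        if merged and left <= merged[-1][2]:
            # Shares at least one column with the previous rectangle: grow it.
            m_left, m_top, m_right, m_bottom = merged[-1]
            merged[-1] = (m_left, min(m_top, top),
                          max(m_right, right), max(m_bottom, bottom))
        else:
            merged.append((left, top, right, bottom))
    return merged
```

Under this reading, the body and the voicing mark of "べ" merge into one basic rectangle as long as their column ranges overlap.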

[0010] In step s5, the standard character width of the cut-out character line is estimated. The maximum height of the basic rectangles in the character line is taken as the standard character width. This is because, in Japanese documents, most characters generally have a height-to-width ratio of 1:1.
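The estimate in step s5 reduces to a single maximum. In the sketch below, the function name and the inclusive (left, top, right, bottom) layout are assumptions; the height is computed as bottom - top + 1, in line with the way the line height is defined in step s13.

```python
def estimate_standard_width(basic_rects):
    """Return the maximum basic-rectangle height, used as the standard character width.

    basic_rects: list of (left, top, right, bottom) with inclusive coordinates.
    """
    if not basic_rects:
        raise ValueError("character line contains no basic rectangles")
    return max(bottom - top + 1 for _, top, _, bottom in basic_rects)
```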

[0011] In step s6, the character string determination unit 6 determines, for the basic rectangles in the character line, whether each belongs to a general character string or an English character string. The determination is described in detail later. In step s7, it is checked whether the basic rectangle being processed belongs to a general or an English character string; processing moves to step s8 for a general character string and to step s9 for an English character string. In step s8, general characters are cut out and recognized; in step s9, English characters are cut out and recognized. Both are described in detail later.

[0012] In step s10, it is determined whether all basic rectangles in the cut-out character line have been processed. If so, processing moves to step s11; if not, it returns to step s7 and repeats.

[0013] In step s11, it is determined whether all character lines in the image data have been cut out. If so, processing ends; if not, it returns to step s3 and repeats.

[0014] Next, the character string determination processing of step s6 is described with reference to the flowchart shown in FIG. 3.

[0015] In step s12, the line top coordinate 1up is set to the topmost coordinate of the basic rectangles in the character line, and the line bottom coordinate 1low is set to the bottommost coordinate. In step s13, the character line height 1h is set to 1low - 1up + 1. In step s14, the top coordinate up of English characters such as a, c, and e is set to 1up + 1h × T1 (here T1 = 0.4), and the bottom coordinate low is set to 1low.

[0016] In step s15, the English character width w is set to the standard character width × T2 (here T2 = 0.6). If the top and bottom coordinates of the basic rectangles in FIG. 6 are as shown in FIG. 7, then

1up = 97
1low = 163
1h = 163 - 97 + 1 = 67
up = 97 + 67 × 0.4 = 124 (123.8 rounded)
low = 163
w = 63 × 0.6 = 38 (37.8 rounded; standard character width = 63)
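The arithmetic of steps s12 to s15 can be checked with a short sketch. The names `line_thresholds` and `round_half_up` are hypothetical, and rounding to the nearest integer is an assumption chosen because it reproduces the worked values 124 and 38 given in the text; the source does not state the rounding rule.

```python
import math


def round_half_up(value):
    """Round to the nearest integer, halves upward (assumed rounding rule)."""
    return math.floor(value + 0.5)


def line_thresholds(line_up, line_low, standard_width, t1=0.4, t2=0.6):
    """Compute the per-line values of steps s13 to s15.

    line_up, line_low: top and bottom coordinates of the character line (s12).
    Returns the line height, the English top coordinate 'up', the bottom
    coordinate 'low', and the English character width 'w'.
    """
    line_height = line_low - line_up + 1                     # s13
    return {
        "line_height": line_height,
        "up": round_half_up(line_up + line_height * t1),     # s14
        "low": line_low,                                     # s14
        "w": round_half_up(standard_width * t2),             # s15
    }
```

Calling `line_thresholds(97, 163, 63)` gives a line height of 67, up = 124, low = 163, and w = 38, matching the example values.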

[0017] In step s16, if the width of the basic rectangle being processed is w or less and its distance to the next basic rectangle is w × T3 (here T3 = 0.4) or less, processing moves to step s17; otherwise it moves to step s19. If the widths of the basic rectangles in FIG. 6c and their distances to the next basic rectangle are as shown in FIG. 7, the basic rectangles b, j, k, l, m, n, o, p, q, s, w, x, y, z, and A, whose width is 38 or less and whose distance is 15 or less, move to step s17.
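The test in step s16 can be written as a small predicate. This is a sketch under assumptions: `passes_s16` is a hypothetical name, the gap limit is rounded to the nearest integer so that w = 38 gives the limit of 15 stated in the text, and a rectangle with no successor (gap `None`) is treated as failing the test, which the source does not specify.

```python
import math


def passes_s16(width, gap_to_next, w, t3=0.4):
    """Return True if a basic rectangle is narrow and close to its right neighbour.

    width: width of the basic rectangle being processed.
    gap_to_next: distance to the next basic rectangle, or None for the last one.
    w: English character width from step s15.
    """
    if gap_to_next is None:
        # Assumption: the last rectangle of a line has no neighbour to be close to.
        return False
    max_gap = math.floor(w * t3 + 0.5)  # 38 * 0.4 = 15.2, taken as 15
    return width <= w and gap_to_next <= max_gap
```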

[0018] In step s17, if the top coordinate of the basic rectangle lies in the range up - 1h × T3 (here T3 = 0.2) to up, and its bottom coordinate lies in the range low - 1h × T4 (here T4 = 0.2) to low, processing moves to step s18. Otherwise no processing is performed and the basic rectangle remains undecided. Among the basic rectangles in FIG. 6c, the basic rectangles j, l, n, and p, whose top coordinates shown in FIG. 7 lie in the range 111 to 124 and whose bottom coordinates lie in the range 150 to 163, move to step s18.
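The position test of step s17 can be sketched as follows. The function name `passes_s17` and the parameter names `t_top` and `t_bottom` are hypothetical (the text uses T3 and T4, both 0.2 here), and the range limits are rounded to the nearest integer so that up = 124, low = 163, and a line height of 67 give the ranges 111 to 124 and 150 to 163 quoted in the text.

```python
import math


def passes_s17(top, bottom, up, low, line_height, t_top=0.2, t_bottom=0.2):
    """Return True if a rectangle sits in the band of lowercase letters such as a, c, e.

    top, bottom: top and bottom coordinates of the basic rectangle.
    up, low: English top and bottom coordinates from step s14.
    line_height: character line height from step s13.
    """
    top_min = math.floor(up - line_height * t_top + 0.5)         # 124 - 13.4 -> 111
    bottom_min = math.floor(low - line_height * t_bottom + 0.5)  # 163 - 13.4 -> 150
    return top_min <= top <= up and bottom_min <= bottom <= low
```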

[0019] In step s18, the basic rectangles whose character string type is undecided are determined to be an English character string. For example, for basic rectangle j in FIG. 6c the undecided basic rectangle is j, and for basic rectangle l the undecided basic rectangles are k and l; each is determined to be an English character string.

[0020] In step s19, the basic rectangle being processed is determined to be a general character string. In step s20, if the character string type of the most recently decided basic rectangle is an English character string, processing moves to step s21; if it is a general character string, processing moves to step s22. For example, for basic rectangle c in FIG. 6c, b is undecided, so the most recently decided basic rectangle is a, which is a general character string; for basic rectangle r, q is undecided, so the most recently decided basic rectangle is p, which is an English character string.

[0021] In step s21, the basic rectangles whose character string type is undecided are determined to be an English character string. For basic rectangle r, the undecided basic rectangle is q, which is determined to be an English character string. In step s22, the basic rectangles whose character string type is undecided are determined to be a general character string. For basic rectangle c, the undecided basic rectangle is b, which is determined to be a general character string.

[0022] In step s23, it is determined whether all basic rectangles in the cut-out character line have been processed. If so, processing ends; if not, it returns to step s16 and repeats. As a result of the above processing, the basic rectangles j, k, l, m, n, o, p, and q are determined to be an English character string.
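The loop of steps s16 to s23 can be sketched as a single pass that keeps a list of undecided rectangles. This is an illustrative sketch, not the patented implementation: `classify_line` and the label strings are hypothetical, each rectangle is represented only by two booleans (the outcomes of the s16 and s17 tests), and two cases the text leaves open are filled by assumption: before anything has been decided the previous type counts as general, and rectangles still undecided at the end of the line take the most recently decided type.

```python
GENERAL = "general"
ENGLISH = "english"


def classify_line(flags):
    """Label each basic rectangle of a line as GENERAL or ENGLISH.

    flags: list of (passes_s16, passes_s17) pairs, left to right.
    """
    labels = [None] * len(flags)
    pending = []             # indices of undecided rectangles
    last_decided = GENERAL   # assumption: nothing decided yet counts as general
    for index, (narrow_and_close, in_lowercase_band) in enumerate(flags):
        if narrow_and_close:
            pending.append(index)
            if in_lowercase_band:
                # s18: all undecided rectangles, including this one, become English.
                for k in pending:
                    labels[k] = ENGLISH
                pending.clear()
                last_decided = ENGLISH
            # Otherwise the rectangle stays undecided (s17 fails).
        else:
            # s20 to s22: undecided rectangles take the previously decided type.
            for k in pending:
                labels[k] = last_decided
            pending.clear()
            labels[index] = GENERAL  # s19
            last_decided = GENERAL
    for k in pending:        # assumption: flush leftovers at the end of the line
        labels[k] = last_decided
    return labels
```

With flags patterned on rectangles a, b, c, j, k, l, q, r of the example (b, k, and q pass only s16; j and l pass both tests), the result labels b as general and j, k, l, q as English, which agrees with paragraphs [0019] to [0021].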

[0023] Next, the general character cut-out and recognition processing of step s8 is described with reference to the flowchart shown in FIG. 4. In step s24, the number n of basic rectangles to be merged is initialized to 0. In step s25, if the merged width of n + 1 basic rectangles is smaller than the estimated standard character width, processing moves to step s26; otherwise it moves to step s27. In step s26, n is set to n + 1.

[0024] In step s27, if n is 1 or more, processing moves to step s28; otherwise it ends. In step s28, n basic rectangles are merged to form the cut-out character pattern. In step s29, the cut-out character pattern is recognized. In step s30, if the recognition result is accepted, processing ends; otherwise it moves to step s31. In step s31, n is set to n - 1, and processing returns to step s27 and repeats. For basic rectangle b in FIG. 6c, two basic rectangles (b to c) are merged and recognized. For basic rectangle z, two basic rectangles (z to A) are merged, the recognition result is not accepted, and recognition is performed again with one basic rectangle (z alone).
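Steps s24 to s31 grow the merge count while the merged width stays below the standard character width, then shrink it on each rejection. The sketch below follows the steps as written, under assumptions: `segment_general` is a hypothetical name, `recognize` is a hypothetical callback that returns a character or `None` when the result is rejected, and the growth in step s25 is assumed to stop at the end of the rectangle list.

```python
def segment_general(rects, start, standard_width, recognize):
    """Cut out and recognize one general character starting at rects[start].

    rects: basic rectangles as (left, top, right, bottom), left to right.
    recognize: hypothetical callback, merged rectangle -> character or None.
    Returns (number of rectangles used, result), or (0, None) if nothing is accepted.
    """
    def merged_width(count):
        return rects[start + count - 1][2] - rects[start][0] + 1

    n = 0                                                        # s24
    while start + n < len(rects) and merged_width(n + 1) < standard_width:
        n += 1                                                   # s25, s26
    while n >= 1:                                                # s27
        group = rects[start:start + n]
        merged = (min(r[0] for r in group), min(r[1] for r in group),
                  max(r[2] for r in group), max(r[3] for r in group))  # s28
        result = recognize(merged)                               # s29
        if result is not None:                                   # s30
            return n, result
        n -= 1                                                   # s31
    # As the steps are written, n stays 0 when even the first rectangle is
    # not narrower than the standard width, and nothing is cut out.
    return 0, None
```

The rectangle z case in the text corresponds to the first call with n = 2 being rejected and the retry with n = 1 being accepted.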

[0025] Next, the English character cut-out and recognition processing of step s9 is described with reference to the flowchart shown in FIG. 5. In step s32, the number n of basic rectangles to be merged is initialized to 1. In step s33, if the merged width of n basic rectangles is equal to or less than the standard character width, processing moves to step s34; otherwise it ends. In step s34, n basic rectangles are merged to form the cut-out character pattern. In step s35, the cut-out character pattern is recognized. In step s36, if the recognition result is accepted, processing ends; otherwise it moves to step s37. In step s37, n is set to n + 1, and processing returns to step s33 and repeats. For basic rectangle j in FIG. 6c, recognition is performed with one basic rectangle (j alone).
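Steps s32 to s37 work in the opposite direction from the general case: they start with a single basic rectangle and widen the merge only after a rejection. A minimal sketch, assuming a hypothetical `recognize` callback that returns `None` on rejection and assuming the loop also stops when the rectangles run out (the name `segment_english` is likewise hypothetical):

```python
def segment_english(rects, start, standard_width, recognize):
    """Cut out and recognize one English character starting at rects[start].

    rects: basic rectangles as (left, top, right, bottom), left to right.
    recognize: hypothetical callback, merged rectangle -> character or None.
    Returns (number of rectangles used, result), or (0, None) if nothing is accepted.
    """
    n = 1                                                        # s32
    while start + n <= len(rects):
        group = rects[start:start + n]
        merged = (min(r[0] for r in group), min(r[1] for r in group),
                  max(r[2] for r in group), max(r[3] for r in group))
        if merged[2] - merged[0] + 1 > standard_width:           # s33
            break
        result = recognize(merged)                               # s34, s35
        if result is not None:                                   # s36
            return n, result
        n += 1                                                   # s37
    return 0, None
```

Starting from one rectangle suits English strings, where most basic rectangles are already complete characters, so the first recognition attempt usually succeeds, as with rectangle j in the text.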

[0026] In this embodiment the character string determination distinguishes general character strings from English character strings, but it goes without saying that the same applies when English character strings are treated as special character strings other than kana and kanji.

[0027]

[Effects of the Invention] As is clear from the above description, the character recognition device of the present invention is provided with circumscribed rectangle extraction means that extracts the circumscribed rectangles of black-pixel connected components, basic rectangle creation means that merges vertically overlapping circumscribed rectangles to create basic rectangles, and character string determination means that distinguishes general character strings from special character strings on the basis of the shape and position of the basic rectangles. Because the cut-out range of the next character is determined according to the character string type, it is an excellent character recognition device that keeps repetition of cut-out and recognition to a minimum and achieves improved cut-out accuracy and shorter processing time.

[Brief description of drawings]

FIG. 1 is a block diagram showing the overall configuration of a character recognition device according to an embodiment of the present invention.

FIG. 2 is a flowchart showing the overall operation of the character recognition device of the embodiment.

FIG. 3 is a flowchart showing the operation of the character string determination part of the embodiment.

FIG. 4 is a flowchart showing the operation of the general character cut-out and recognition part of the embodiment.

FIG. 5 is a flowchart showing the operation of the English character cut-out and recognition part of the embodiment.

FIG. 6a to FIG. 6c are pattern diagrams showing an example of character image data and the means of cutting it out, for explaining the operation of the character recognition device of the embodiment.

FIG. 7 is an explanatory diagram showing the properties of the basic rectangles created by the character recognition device of the embodiment.

[Explanation of symbols]

1 input unit
2 circumscribed rectangle extraction unit
3 basic rectangle creation unit
4 basic rectangle integration unit
5 character recognition unit
6 character string determination unit
7 output unit

Claims

[Claims]

1. A character recognition device comprising: circumscribed rectangle extraction means for extracting the circumscribed rectangle of each black-pixel connected component from an input image; basic rectangle creation means for merging, among the circumscribed rectangles extracted by the circumscribed rectangle extraction means, those that overlap in the vertical direction to create basic rectangles; character string determination means for determining, from the shape and position of the basic rectangles, whether a character string is a general character string or a special character string; character pattern cut-out means for determining the range over which basic rectangles are merged according to the character string determined by the character string determination means and for cutting out a character pattern;