JPH03282983A

JPH03282983A - Method for extracting character

Info

Publication number: JPH03282983A
Application number: JP2084391A
Authority: JP
Inventors: Hiroshi Kameyama; 博史亀山; Shoji Miki; 三木　章司
Original assignee: Glory Ltd
Current assignee: Glory Ltd
Priority date: 1990-03-30
Filing date: 1990-03-30
Publication date: 1991-12-13
Anticipated expiration: 2010-07-31
Also published as: JPH0769934B2

Abstract

PURPOSE: To extract characters reliably even when a bar is not straight in the horizontal direction, by detecting end points from character information and detecting the bar based on the ratio of the shortest distance between end points to the effective distance. CONSTITUTION: First, blocks that have at least a prescribed number of connected dots, and a height or dot count larger than a prescribed value, are extracted from the character information of a check. Each of the resulting blocks BL1 to BL7 is divided into a 3×3 matrix of areas, and the numbers of vertical, diagonal and horizontal masks are obtained. The confidence that a bar is included is then calculated for each block. After the confidence has been obtained for all blocks, the block with the highest confidence is selected and the end points in that block are extracted. The effective path length and the straight-line distance between each pair of end points are then obtained, and the bar is detected based on the ratio of the straight-line distance to the effective path length. The individual characters can thus be extracted reliably.

Description

[Detailed Description of the Invention] Purpose of the Invention (Industrial Field of Application): This invention relates to a character extraction method that detects and removes a bar from character information containing a handwritten bar, such as that on a check, so that only the characters (numerals) are recognized.

(Prior Art) Conventionally, the method shown in FIGS. 14(A) and (B) has been known for detecting a bar in character information that contains a bar when performing character recognition.

In this method, as shown in FIG. 14(A), a straight horizontal bar 1 is drawn in advance as a reference line, and characters 2 are written above it. To recognize the characters 2 written above bar 1, the dots in each horizontal row of the character group are first counted, row by row in the vertical direction, to build a histogram of dot counts along the vertical axis as shown in FIG. 14(B). The part of the histogram with an extremely large count is judged to be the reference bar 1, and the characters 2 above it are then identified relative to this reference line.
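The conventional histogram method described above can be sketched as follows. This is a minimal illustration, assuming a binary image stored as a list of rows (1 = black dot); the function name `find_reference_bar` and the `ratio` threshold are hypothetical, since the text only says the count is "extremely large".

```python
def find_reference_bar(image, ratio=0.8):
    """Return the row indices whose dot count is at least `ratio` of the image width."""
    width = len(image[0])
    # Histogram along the vertical axis: one dot count per horizontal row.
    histogram = [sum(row) for row in image]
    return [y for y, count in enumerate(histogram) if count >= ratio * width]


image = [
    [0, 1, 0, 0, 1, 0],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1],  # pre-printed horizontal bar
    [0, 0, 0, 0, 0, 0],
]
slanted = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1],  # a bar that is not horizontal
]
bar_rows = find_reference_bar(image)
slanted_rows = find_reference_bar(slanted)
```

The slanted example spreads its dots over several rows, so no row stands out. This is the failure mode that motivates the invention.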

(Problem to Be Solved by the Invention) In the bar detection method above, a straight bar is drawn horizontally in advance as the reference line, so it can always be detected from a histogram of dot counts along the vertical direction. A problem arises, however, when no straight bar is provided in advance. For example, when bar 3 itself is handwritten as shown in FIG. 15(A), bar 3 is not necessarily horizontal, and even if a histogram of dot counts is created, the characters cannot be distinguished from the bar, so the bar cannot be identified.

This invention was made in view of the circumstances described above. Its object is to provide a method that reliably detects a bar in character information containing the bar and extracts the characters, even when the bar is not a straight horizontal line and even when the bar itself is handwritten.

Structure of the Invention (Means for Solving the Problems): The present invention relates to a character extraction method that detects a bar in character information containing the bar and extracts the characters by separating the bar. The above object is achieved by detecting end points from the character information, detecting the straight-line distance between the end points, obtaining the number of dots connecting the end points, detecting the bar based on the ratio of the shortest distance between the end points to the effective distance, and separating the detected bar to extract the characters.

(Operation) This invention reliably distinguishes handwritten characters from a bar that is either pre-printed for character entry or handwritten, removes the bar from the target of character recognition, and extracts only the characters. To detect the bar, a bar confidence is calculated with six formulas, one for each bar type defined by its position relative to the characters (underbar, middle bar, upper bar, diagonal bar, diagonal underbar, and diagonal upper bar), and the bar with the highest confidence is detected.

(Embodiment) An embodiment of the present invention is described below with reference to the drawings. This embodiment is a method for automatically detecting the cent bar, which is written to separate the dollar amount clearly from the cent amount, when recognizing an amount handwritten on a U.S. check such as those shown in FIG. 2 or FIGS. 11(A) to (F). Detecting the cent bar helps in recognizing the dollar amount and the cent amount.

FIG. 1 is a flowchart showing the operation of this invention. The operation consists broadly of detecting the region of the character information that contains a bar, detecting the bar within that region, and recognizing the characters from the positional relationship between the detected bar and each character. It is described below following the flowchart of FIG. 1, using the check shown in FIG. 2 as a concrete example.

First, blocks are extracted from the character information of the check shown in FIG. 2 (step S1). Only groups of four or more connected dots are considered; smaller groups are discarded as not constituting character information. Let EF1 be the average height of the groups with four or more connected dots, and let EF2 be the average height of those groups whose height is at least EF1 × 0.9. A group is extracted as a block if its height is greater than EF2 × 1/3 or its dot count is greater than EF2 × 1/2. Extracting blocks in this way from the character information of FIG. 2 yields the seven blocks shown in FIG. 3, which are numbered BL1 to BL7 according to their position from the left. Block extraction may also be performed by a method such as that disclosed in Japanese Patent Laid-Open No. 1-233585.
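The block selection rule of step S1 can be sketched as follows. This is an illustration only, assuming each connected group has already been measured and is given as a dictionary with a `height` and a `dots` count; the function name `select_blocks` and this data layout are hypothetical, and the 0.9 factor follows a reading of a partly garbled passage.

```python
def select_blocks(components):
    """Keep the connected groups that qualify as blocks under the EF1/EF2 rule."""
    # Groups with fewer than four connected dots are not character information.
    candidates = [c for c in components if c["dots"] >= 4]
    if not candidates:
        return []
    ef1 = sum(c["height"] for c in candidates) / len(candidates)
    # EF2 averages only the taller groups (height at least 0.9 * EF1).
    tall = [c for c in candidates if c["height"] >= 0.9 * ef1]
    ef2 = sum(c["height"] for c in tall) / len(tall)
    return [c for c in candidates
            if c["height"] > ef2 / 3 or c["dots"] > ef2 / 2]


components = [
    {"height": 30, "dots": 200},  # numeral
    {"height": 28, "dots": 180},  # numeral
    {"height": 2, "dots": 3},     # noise: fewer than four dots
    {"height": 5, "dots": 6},     # small speck: fails both thresholds
    {"height": 4, "dots": 40},    # flat stroke: kept because of its dot count
]
blocks = select_blocks(components)
```

The dot-count branch is what lets a low, wide stroke such as a bar survive even though its height is small.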

When block extraction is complete, the process moves on to calculating, for each block, the confidence that the block contains a bar. First, each of the blocks BL1 to BL7 is divided into nine areas arranged as a 3×3 matrix, as shown in FIG. 5 (step S2). For block BL4, for example, the nine areas Z(i,j) shown in FIG. 4 are obtained, and for each area Z(i,j) shown in FIG. 5 the number of vertical masks ZV(i,j), the number of diagonal masks ZS(i,j), and the number of horizontal masks ZH(i,j) are obtained (step S3).
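The division of step S2 can be sketched as below. It is a minimal illustration, assuming the block is a list of pixel rows and that in Z(i,j) the index i is the column and j is the row (the text places Z(0,2), Z(1,2), Z(2,2) in the lower part of the block). The function name `split_into_areas` and the handling of sizes not divisible by three are assumptions.

```python
def split_into_areas(block, n=3):
    """Return {(i, j): sub-image}, where i is the column index and j the row index."""
    height, width = len(block), len(block[0])
    # Integer boundaries so that every pixel falls in exactly one area.
    xs = [width * k // n for k in range(n + 1)]
    ys = [height * k // n for k in range(n + 1)]
    return {
        (i, j): [row[xs[i]:xs[i + 1]] for row in block[ys[j]:ys[j + 1]]]
        for i in range(n)
        for j in range(n)
    }


# A 6x6 test block whose pixel value encodes its position (x + 10 * y).
block = [[x + 10 * y for x in range(6)] for y in range(6)]
areas = split_into_areas(block)
lower_left = areas[(0, 2)]
```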

The block is divided into nine 3×3 areas so that, as described later, the type of bar can be determined along with its presence. From the probability information on where the bar lies within the block, the bar can be classified as follows. If it lies in the lower areas of the block (Z(0,2), Z(1,2), Z(2,2)), it is an underbar. If it lies in the upper areas (Z(0,0), Z(1,0), Z(2,0)), it is an upper bar. If it lies in the middle areas (Z(0,1), Z(1,1), Z(2,1)), it is a middle bar. If it lies in areas that cut diagonally across the block, (Z(0,0), Z(1,1), Z(2,2)) or (Z(2,0), Z(1,1), Z(0,2)), it is a diagonal bar, diagonal underbar, or diagonal upper bar. The areas can thus be used to determine which type of bar the block contains: underbar, middle bar, upper bar, diagonal bar, diagonal underbar, or diagonal upper bar.

Next, the masks are described. A mask consists of nine dots in a 3×3 arrangement. A dot-mask pattern that is likely to form a vertical component of the character information is called a vertical mask, a pattern likely to form a horizontal component is called a horizontal mask, and a pattern likely to form a diagonal component is called a diagonal mask.

There are seven vertical masks, shown in FIGS. 6(A) to (G); two diagonal masks, shown in FIGS. 7(A) and (B); and seven horizontal masks, shown in FIGS. 8(A) to (G). Although these masks are 3×3, any matrix of 3×3 or larger may be used.

As described above, the number of vertical masks ZV(i,j) is obtained by scanning each vertical mask over each area Z(i,j), the number of diagonal masks ZS(i,j) by scanning each diagonal mask over each area Z(i,j), and the number of horizontal masks ZH(i,j) by scanning each horizontal mask over each area Z(i,j). The confidence that a bar is included is then calculated for each block (step S4) as follows.
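Mask counting can be sketched as below. The actual mask patterns appear only in FIGS. 6 to 8, which are not reproduced here, so the three pattern lists in this sketch are hypothetical stand-ins, and exact 3×3 window matching is an assumed matching rule rather than the patent's stated one.

```python
# Hypothetical stand-in patterns (the real ones are in FIGS. 6 to 8).
VERTICAL = [((0, 1, 0), (0, 1, 0), (0, 1, 0))]
HORIZONTAL = [((0, 0, 0), (1, 1, 1), (0, 0, 0))]
DIAGONAL = [((0, 0, 1), (0, 1, 0), (1, 0, 0)),
            ((1, 0, 0), (0, 1, 0), (0, 0, 1))]


def count_masks(area, masks):
    """Count the 3x3 window positions in `area` that equal any pattern in `masks`."""
    height, width = len(area), len(area[0])
    count = 0
    for y in range(height - 2):
        for x in range(width - 2):
            window = tuple(tuple(area[y + dy][x:x + 3]) for dy in range(3))
            if window in masks:
                count += 1
    return count


# One area containing a horizontal stroke.
area = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
zh = count_masks(area, HORIZONTAL)
zv = count_masks(area, VERTICAL)
zs = count_masks(area, DIAGONAL)
```

Running this per area Z(i,j) with each mask family would give the ZH(i,j), ZV(i,j), and ZS(i,j) counts used in the confidence formulas.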

The underbar confidence α is calculated after the underline has been extracted as follows. First,

$$\mathrm{ud\text{-}hline}[0] = ZH(0,2) + ZH(1,2) + ZH(2,2) - \frac{ZS(0,2) + ZS(1,2) + ZS(2,2)}{2} - \frac{ZV(0,2) + ZV(1,2) + ZV(2,2)}{2} \tag{1}$$

$$\mathrm{ud\text{-}hline}[1] = ZH(0,2) + ZH(1,2) + ZH(2,1) - \frac{ZS(0,2) + ZS(1,2) + ZS(2,1)}{2} - \frac{ZV(0,2) + ZV(1,2) + ZV(2,1)}{2} \tag{2}$$

$$\mathrm{ud\text{-}hline}[2] = ZH(0,1) + ZH(1,2) + ZH(2,2) - \frac{ZS(0,1) + ZS(1,2) + ZS(2,2)}{2} - \frac{ZV(0,1) + ZV(1,2) + ZV(2,2)}{2} \tag{3}$$

are obtained. Using the underline extraction data ud-hline[0], ud-hline[1], and ud-hline[2] obtained in this way, the underbar confidence α is determined by the following formula.

$$\alpha = k_1 \cdot \frac{\mathrm{MAX}\bigl[\mathrm{ud\text{-}hline}[0],\ \mathrm{ud\text{-}hline}[1],\ \mathrm{ud\text{-}hline}[2]\bigr]}{\text{total number of black pixels in one block}} \tag{4}$$

where 0 ≤ α ≤ 1 and k1 is a constant.

The upper-bar confidence β is obtained by first extracting the upper line with

$$\mathrm{ud\text{-}hline}[0] = ZH(0,0) + ZH(1,0) + ZH(2,0) - \frac{ZS(0,0) + ZS(1,0) + ZS(2,0)}{2} - \frac{ZV(0,0) + ZV(1,0) + ZV(2,0)}{2} \tag{5}$$

$$\mathrm{ud\text{-}hline}[1] = ZH(0,1) + ZH(1,0) + ZH(2,0) - \frac{ZS(0,1) + ZS(1,0) + ZS(2,0)}{2} - \frac{ZV(0,1) + ZV(1,0) + ZV(2,0)}{2} \tag{6}$$

$$\mathrm{ud\text{-}hline}[2] = ZH(0,0) + ZH(1,0) + ZH(2,1) - \frac{ZS(0,0) + ZS(1,0) + ZS(2,1)}{2} - \frac{ZV(0,0) + ZV(1,0) + ZV(2,1)}{2} \tag{7}$$

and then applying the following formula.

$$\beta = k_2 \cdot \frac{\mathrm{MAX}\bigl[\mathrm{ud\text{-}hline}[0],\ \mathrm{ud\text{-}hline}[1],\ \mathrm{ud\text{-}hline}[2]\bigr]}{\text{total number of black pixels in one block}} \tag{8}$$

where 0 ≤ β ≤ 1 and k2 is a constant.

The middle-bar confidence γ is obtained by first calculating

$$\mathrm{ud\text{-}hline}[0] = ZH(0,1) + ZH(1,1) + ZH(2,1)$$

and then

$$\gamma = k_3 \cdot \frac{\mathrm{ud\text{-}hline}[0]}{\text{number of pixels across the width of one block}} \tag{9}$$

where 0 ≤ γ ≤ 1 and k3 is a constant.

Further, the diagonal-bar confidence δ1, the diagonal-underbar confidence δ2, and the diagonal-upper-bar confidence δ3 are obtained as follows. First, for the diagonal-bar confidence δ1,

$$
\begin{aligned}
\mathrm{s\text{-}bar}[0] ={}& -3\,ZV(0,0) - 1\,ZV(1,0) + 1\,ZV(2,0) \\
& - 1\,ZV(0,1) + 1\,ZV(1,1) - 1\,ZV(2,1) \\
& + 1\,ZV(0,2) - 1\,ZV(1,2) - 3\,ZV(2,2) \\
& - 3\,ZS(0,0) - 1\,ZS(1,0) + 1\,ZS(2,0) \\
& - 1\,ZS(0,1) + 1\,ZS(1,1) - 1\,ZS(2,1) \\
& + 1\,ZS(0,2) - 1\,ZS(1,2) - 3\,ZS(2,2) \\
& - 3\,ZH(0,0) - 1\,ZH(1,0) + 1\,ZH(2,0) \\
& - 1\,ZH(0,1) + 1\,ZH(1,1) - 1\,ZH(2,1) \\
& + 1\,ZH(0,2) - 1\,ZH(1,2) - 3\,ZH(2,2)
\end{aligned}
\tag{10}
$$

is obtained, after which the confidence δ1 is calculated by the following formula.

$$\delta_1 = k_4 \cdot \frac{\mathrm{s\text{-}bar}[0]}{\text{number of pixels across the width of one block}} \tag{11}$$

where 0 ≤ δ1 ≤ 1 and k4 is a constant.

For the diagonal-underbar confidence δ2,

$$
\begin{aligned}
\mathrm{ud\text{-}bar}[0] ={}& + 0\,ZV(0,0) + 0\,ZV(1,0) + 1\,ZV(2,0) \\
& + 0\,ZV(0,1) + 1\,ZV(1,1) - 1\,ZV(2,1) \\
& + 1\,ZV(0,2) - 1\,ZV(1,2) - 3\,ZV(2,2) \\
& + 0\,ZS(0,0) + 0\,ZS(1,0) + 1\,ZS(2,0) \\
& + 0\,ZS(0,1) + 1\,ZS(1,1) - 1\,ZS(2,1) \\
& + 1\,ZS(0,2) - 1\,ZS(1,2) - 3\,ZS(2,2) \\
& + 0\,ZH(0,0) + 0\,ZH(1,0) + 1\,ZH(2,0) \\
& + 0\,ZH(0,1) + 1\,ZH(1,1) - 1\,ZH(2,1) \\
& + 1\,ZH(0,2) - 1\,ZH(1,2) - 3\,ZH(2,2)
\end{aligned}
\tag{12}
$$

is obtained, after which the confidence δ2 is calculated by the following formula.

$$\delta_2 = k_5 \cdot \frac{\mathrm{ud\text{-}bar}[0]}{\text{number of pixels across the width of one block}} \tag{13}$$

where 0 ≤ δ2 ≤ 1 and k5 is a constant.

For the diagonal-upper-bar confidence δ3,

$$
\begin{aligned}
\mathrm{up\text{-}bar}[0] ={}& -3\,ZV(0,0) - 1\,ZV(1,0) + 1\,ZV(2,0) \\
& - 1\,ZV(0,1) + 1\,ZV(1,1) + 0\,ZV(2,1) \\
& + 1\,ZV(0,2) + 0\,ZV(1,2) + 0\,ZV(2,2) \\
& - 3\,ZS(0,0) - 1\,ZS(1,0) + 1\,ZS(2,0) \\
& - 1\,ZS(0,1) + 1\,ZS(1,1) + 0\,ZS(2,1) \\
& + 1\,ZS(0,2) + 0\,ZS(1,2) + 0\,ZS(2,2) \\
& - 3\,ZH(0,0) - 1\,ZH(1,0) + 1\,ZH(2,0) \\
& - 1\,ZH(0,1) + 1\,ZH(1,1) + 0\,ZH(2,1) \\
& + 1\,ZH(0,2) + 0\,ZH(1,2) + 0\,ZH(2,2)
\end{aligned}
\tag{14}
$$

is obtained, after which the confidence δ3 is calculated by the following formula.

$$\delta_3 = k_6 \cdot \frac{\mathrm{up\text{-}bar}[0]}{\text{number of black pixels across the width of one block}} \tag{15}$$

where 0 ≤ δ3 ≤ 1 and k6 is a constant.

It is then determined whether the bar confidences described above have been calculated for all blocks (step S5), and the above operation is repeated until the calculation is complete for every block.
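The six confidence formulas can be sketched together as follows. This is an illustration under several assumptions: the mask counts are given as dictionaries keyed by (i, j); the constants k1 to k6, whose values the text does not give, default to 1; the bounds 0 to 1 are enforced by clamping; and all three diagonal scores are divided by the block width, although equation (15) as printed refers to black pixels. All function and variable names are hypothetical.

```python
CELLS = [(i, j) for j in range(3) for i in range(3)]  # (column, row), row by row

# Area triples used in equations (1)-(3), (5)-(7), and (9).
UNDER = [[(0, 2), (1, 2), (2, 2)], [(0, 2), (1, 2), (2, 1)], [(0, 1), (1, 2), (2, 2)]]
UPPER = [[(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 0), (2, 0)], [(0, 0), (1, 0), (2, 1)]]
MIDDLE = [(0, 1), (1, 1), (2, 1)]

# Weights of equations (10), (12), (14), listed for j = 0, 1, 2.
W_DIAG = dict(zip(CELLS, [-3, -1, 1, -1, 1, -1, 1, -1, -3]))
W_DIAG_UNDER = dict(zip(CELLS, [0, 0, 1, 0, 1, -1, 1, -1, -3]))
W_DIAG_UPPER = dict(zip(CELLS, [-3, -1, 1, -1, 1, 0, 1, 0, 0]))


def clamp(value):
    return max(0.0, min(1.0, value))


def line_score(zh, zs, zv, cells):
    # Horizontal masks count fully; diagonal and vertical masks count against.
    return sum(zh[c] - zs[c] / 2 - zv[c] / 2 for c in cells)


def weighted_score(zh, zs, zv, weights):
    # The same weight applies to all three mask counts of an area.
    return sum(w * (zv[c] + zs[c] + zh[c]) for c, w in weights.items())


def bar_confidences(zh, zs, zv, black_pixels, width, k=(1.0,) * 6):
    return {
        "under": clamp(k[0] * max(line_score(zh, zs, zv, c) for c in UNDER) / black_pixels),
        "upper": clamp(k[1] * max(line_score(zh, zs, zv, c) for c in UPPER) / black_pixels),
        "middle": clamp(k[2] * sum(zh[c] for c in MIDDLE) / width),
        "diagonal": clamp(k[3] * weighted_score(zh, zs, zv, W_DIAG) / width),
        "diagonal_under": clamp(k[4] * weighted_score(zh, zs, zv, W_DIAG_UNDER) / width),
        "diagonal_upper": clamp(k[5] * weighted_score(zh, zs, zv, W_DIAG_UPPER) / width),
    }


zero = dict.fromkeys(CELLS, 0)
zh = {**zero, (0, 2): 10, (1, 2): 10, (2, 2): 10}  # horizontal masks along the bottom row
scores = bar_confidences(zh, zero, zero, black_pixels=60, width=36)
best = max(scores, key=scores.get)
```

In this example the horizontal masks are concentrated in the lower row, so the underbar score dominates the other five.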

Once the confidences α, β, γ, δ1, δ2, and δ3 have been obtained for all blocks, the block with the highest confidence is selected (step S10) and the end points within that block are extracted (step S11). Only pairs of end points whose horizontal separation is at least 1/2 of the block width are treated as valid; extremely short ones are excluded. For block BL4 of FIG. 4, as shown in FIG. 9, four pairs of end points are extracted, for example points a (x1, y1) and b (x2, y2), a and c, a and d, and a and e, and the effective path length P_k (k = 1 to n, where n is the number of end-point pairs) is obtained for each pair (step S12). The effective path length P_k is the distance (number of dots) along the connected line between the two end points, such as a and b. When there are multiple paths between the end points, the shortest path length is taken as the effective path length. For the block of FIG. 12, for example, there are two paths for end points a and b, shown in FIGS. 13(A) and (B), and two paths for end points a and c, shown in FIGS. 13(C) and (D); the shorter paths a-イ-ロ-b (FIG. 13(A)) and a-イ-ハ-c (FIG. 13(C)) are adopted. The straight-line distance D_k between each pair of end points is then obtained by

$$D_k = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \tag{16}$$

(step S13), and the straightness S_k of each path is obtained by

$$S_k = \frac{D_k}{P_k} \tag{17}$$

(step S14).
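End-point extraction and the horizontal-separation filter of step S11 can be sketched as below. The text does not say how end points are found, so this sketch assumes the strokes have been thinned to one pixel wide and treats a black pixel with exactly one black 8-neighbour as an end point. The function names are hypothetical.

```python
from itertools import combinations


def find_endpoints(block):
    """Return (x, y) of black pixels that have exactly one black 8-neighbour."""
    height, width = len(block), len(block[0])
    points = []
    for y in range(height):
        for x in range(width):
            if not block[y][x]:
                continue
            # Sum the 3x3 neighbourhood, then subtract the pixel itself.
            neighbours = sum(
                block[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ) - 1
            if neighbours == 1:
                points.append((x, y))
    return points


def valid_pairs(points, block_width):
    """Keep end-point pairs separated horizontally by at least half the block width."""
    return [(a, b) for a, b in combinations(points, 2)
            if abs(a[0] - b[0]) >= block_width / 2]


# A thin slanted stroke running from lower left to upper right.
block = [
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
points = find_endpoints(block)
pairs = valid_pairs(points, block_width=6)
```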

The above operation is repeated until the straightness S_k has been obtained for all paths (step S15). Once S_k has been obtained for all paths, the path with the greatest straightness is hypothesized to be the cent bar (step S20).
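The straightness measure can be sketched as follows. The sketch assumes that the effective path length counts the pixels on the shortest 8-connected path through black pixels, including both end points, found here by breadth-first search; the text only says "number of dots", so the exact counting convention and the function names are assumptions.

```python
from collections import deque
from math import hypot


def effective_path_length(block, start, goal):
    """Number of dots on the shortest 8-connected black path, or None if unconnected."""
    height, width = len(block), len(block[0])
    dots = {start: 1}  # dots on the shortest path from start to each visited pixel
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        if (x, y) == goal:
            return dots[goal]
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if ((dx or dy) and 0 <= nx < width and 0 <= ny < height
                        and block[ny][nx] and (nx, ny) not in dots):
                    dots[(nx, ny)] = dots[(x, y)] + 1
                    queue.append((nx, ny))
    return None


def straightness(block, a, b):
    """S = D / P: straight-line distance over effective path length."""
    p = effective_path_length(block, a, b)
    if p is None:
        return 0.0
    return hypot(b[0] - a[0], b[1] - a[1]) / p


line = [[1, 1, 1, 1, 1]]
bent = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
s_line = straightness(line, (0, 0), (4, 0))
s_bent = straightness(bent, (0, 0), (4, 0))
```

Both examples have the same end points, but the bent stroke needs a longer path, so its straightness is lower and the straight stroke would be preferred as the bar hypothesis.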

It is then determined whether the hypothesized cent bar contains a branch shared with a numeral (step S21). If there is a shared branch, the hypothesized cent bar is removed except for the interpolated branch (step S23). This handles the case where the bar and a numeral overlap and touch. If there is no branch shared with a numeral, the hypothesized bar is removed from the block (step S22). FIG. 10(B) shows an example in which, for the block of FIG. 10(A), everything in the hypothesized cent bar other than the interpolated branch is removed, and FIG. 10(C) shows an example in which the hypothesized cent bar is removed.

The numeral portion is then segmented by the method described in Japanese Patent Laid-Open No. 1-121988 (step S24), the numerals are recognized and verified (step S25), and it is determined whether recognition and verification succeeded (step S26). If they did not, the hypothesized path is removed from the hypothesis candidates (step S27) and the process returns to step S20. If this is the second or later time a candidate has been removed, error processing is performed (step S28).
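The hypothesize-and-verify loop of steps S20 to S28 can be sketched as below. The callbacks `straightness_of`, `remove_bar`, and `recognize` are hypothetical placeholders for the straightness measure, the bar removal of steps S21 to S23, and the segmentation, recognition, and verification of steps S24 to S26; here `recognize` is assumed to return `None` when verification fails.

```python
def extract_amount(paths, straightness_of, remove_bar, recognize, max_rejections=2):
    """Try the straightest remaining path as the cent bar until recognition verifies."""
    candidates = list(paths)
    rejections = 0
    while candidates:
        bar = max(candidates, key=straightness_of)  # step S20: hypothesize
        digits = remove_bar(bar)                    # steps S21-S23: remove the bar
        result = recognize(digits)                  # steps S24-S26: recognize, verify
        if result is not None:
            return bar, result
        candidates.remove(bar)                      # step S27: drop the hypothesis
        rejections += 1
        if rejections >= max_rejections:            # step S28: error processing
            break
    raise ValueError("cent bar could not be determined")


# Stub example: the straightest path p1 is wrong, the second choice p2 verifies.
scores = {"p1": 0.9, "p2": 0.8, "p3": 0.7}
result = extract_amount(
    list(scores),
    scores.get,
    remove_bar=lambda bar: bar,
    recognize=lambda digits: "12.34" if digits == "p2" else None,
)
```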

Although the above embodiment has been described for dollars, it applies equally to other currencies such as yen and pounds. It can also be applied to extracting reference lines such as those shown in FIGS. 14(A) and 15(A).

Effects of the Invention: As described above, the character extraction method of this invention detects end points from character information containing a bar, detects the bar based on the ratio of the shortest distance between end points to the effective distance, and removes the bar from the block before recognizing the characters. Reliable character recognition is therefore possible even for characters that touch the bar. According to this invention, even characters such as those shown in FIGS. 11(A) to (F) can be recognized.

[Brief Description of the Drawings]

FIG. 1 is a flowchart showing an example of the operation of this invention. FIG. 2 shows an example of handwritten characters. FIGS. 3 and 4 illustrate the block extraction process. FIG. 5 shows the divided areas. FIGS. 6(A) to (G) show examples of vertical masks. FIGS. 7(A) and (B) show examples of diagonal masks. FIGS. 8(A) to (G) show examples of horizontal masks. FIG. 9 illustrates the effective path between end points, the shortest distance, and the straightness. FIG. 10 illustrates an example of hypothesis-based processing. FIGS. 11(A) to (F) show examples of handwritten characters. FIGS. 12 and 13(A) to (D) illustrate the path length between end points. FIGS. 14(A), (B) and 15(A), (B) illustrate conventional character extraction. BL1 to BL7: blocks; a, b, c, d, e: end points.

Claims

[Claims] 1. A character extraction method that detects a bar in character information containing the bar and extracts characters by separating the bar, characterized by: detecting end points from the character information; detecting the straight-line distance between the end points; obtaining the number of dots connecting the end points; detecting the bar based on the ratio of the shortest distance between the end points to the effective distance; and separating the detected bar to extract the characters. 2. A character extraction method that, when recognizing characters in character information containing a bar, detects the bar from the character information and extracts only the characters, characterized by: extracting blocks from the character information, each block being one continuous character group; dividing each extracted block into a plurality of areas Z(i,j) (i = 0 to m, j = 0 to n); obtaining the number of masks present in each area Z(i,j); calculating for each block, from the number of masks in each area Z(i,j), the confidence that the block contains a bar; obtaining the end points in the block with a high confidence of containing the bar; obtaining the shortest distance between the detected end points and the effective distance, which is the number of dots of the line segment between the end points; and detecting the bar based on the ratio of the shortest distance between the end points to the effective distance, thereby extracting the characters.