JPS60132281A

JPS60132281A - Character separating device

Info

Publication number: JPS60132281A
Application number: JP58240335A
Authority: JP
Inventors: Yoshitake Tsuji; 辻　善丈; Hiroshi Asai; 淺井　紘
Original assignee: NEC Corp; Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1983-12-20
Filing date: 1983-12-20
Publication date: 1985-07-15
Also published as: JPH0368431B2

Abstract

PURPOSE: To determine character separation positions easily and with high accuracy by using dynamic programming to compute the sequence of separation candidate positions that minimizes the variance of the distances between candidate positions in successive separation candidate sections, together with the deviation of the average distance from the character pitch. CONSTITUTION: Allowable sections, in which a character separation position may be set, are determined using the character pitch P and thresholds T1 and T2. Then, using the character pitch P and a threshold T3, separation candidate sections k (k >= 0) are set successively, and for the separation candidate positions x(k, i_k) (i_k >= 1) in each separation candidate section k, the distance d(k, k+1; i_k, i_{k+1}) between candidate positions is calculated. An evaluation measure U is then calculated, consisting of the variance sigma_d^2 of the distances d(k, k+1; i_k, i_{k+1}) and the squared error (mu_d - P)^2 between their average value mu_d and the character pitch P. Finally, the sequence of separation candidate positions x(k, i_k) (k >= 0) that minimizes this evaluation measure is derived, and the character separation positions are determined from it.

Description

DETAILED DESCRIPTION OF THE INVENTION: The present invention relates to a character separation device that separates a character string image written on paper into individual characters.

When a device that optically reads printed characters (hereinafter, OCR) recognizes a series of characters, each character must be separated and sent individually to the character recognition unit. When an OCR reads mail or large volumes of documents, print quality and printing style vary widely, and all of these variations must be handled as reading targets.

In such cases, adjacent characters may touch in the character string image, or a single character may split into two or more parts, and a character separation method that handles these situations efficiently is required. Conventionally, when the input includes data for which the constraints on the reading target have been relaxed in this way, character separation methods have been extended with functions thought to be effective for each individual case. However, adding case-specific functions in this ad hoc way either lowers the accuracy of character separation or makes it necessary to develop a separate character separation device, with different functions, for each kind of target.

SUMMARY OF THE INVENTION: The object of the present invention is therefore to solve these problems by providing a character separation device that sets separation candidate sections successively using the character pitch and blank information, and uses dynamic programming to compute the sequence of separation candidate positions that minimizes the variance of the distances between candidate positions in successive separation candidate sections together with the deviation of their average distance from the character pitch. Such a device can determine character separation positions easily and accurately, without any special additional function, even when characters touch or a single character splits into two or more parts.

The present invention is described in detail below with reference to the drawings.

FIGS. 1(a), 1(b), and 1(c) illustrate, using an example partial character string image, one way of setting the separation candidate sections for character separation in the present invention.

When separating characters in reading targets with varied print quality and printing styles, the character pitch must be detected first. One suitable character pitch detection device is the one described in the specification of Japanese Patent Application No. Sho 58-160763, "Character Pitch Detection Device", by the same applicant (hereinafter, Cited Example 1); such a device can detect the character pitch from a series of character line images.

FIG. 1(a) shows, as hatched areas, a character string image that includes touching characters and a character split in two; P in the figure denotes the character pitch. Projecting the partial character string image of FIG. 1(a) vertically gives the projection profile shown in FIG. 1(b), which can be divided into black regions (hatched in the figure) and white regions. The character pitch P may be obtained from the character string image of FIG. 1(a) with, for example, the character pitch detection device of Cited Example 1, or, if it is known in advance, the known value may be used.

To separate the touching characters (amu) and the split character (h) in FIG. 1(a) correctly into single characters, the position at which separation of those characters starts must be predicted correctly. For example, in the touching character image of the figure, "a" is slightly narrower than "m", so the correct cut-out start position lies slightly to the left of the left edge of the touching character image (amu). Conventionally, such positional offsets of touching characters have been corrected by, for example, referring to the character image. Depending on the degree of contact, however, the separation position sometimes cannot be determined correctly even though referring to the image costs processing time.

In the method of the present invention, the positions and sizes of the white regions (hereinafter, blanks) and black regions (hereinafter, character blocks) are first extracted from the character line image of FIG. 1(a), using for example the projection profile of FIG. 1(b). For instance, the character block widths Vi (i = 1, ..., 4) and blank sizes Wi (i = 1, ..., 4) shown in FIG. 1(b), together with their positions, are obtained by known techniques. Next, using the previously obtained character pitch P and thresholds T1 and T2, the sections in which a character separation position may be set (hereinafter, allowable sections) are determined, for example by the following conditions (1) and (2).

Condition (1): A blank section is an allowable section.

Condition (2): For any character block whose width Vi satisfies Vi > P + T1, the section that remains after removing the black region within T2 of each end of the block is an allowable section.
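As an illustration only, and not part of the specification, the extraction of character blocks and blanks from a vertical projection and the application of conditions (1) and (2) might be sketched in Python as follows. The function names, the half-open (start, end) representation of a section, integer column positions, and the extra guard Vi > 2·T2 (which avoids an empty section) are all assumptions made for this sketch.

```python
from typing import List, Tuple

Run = Tuple[int, int]      # (start column, width)
Section = Tuple[int, int]  # half-open interval (start, end)


def runs_from_projection(projection: List[int]) -> Tuple[List[Run], List[Run]]:
    """Split a vertical projection into character blocks and blanks.

    A column counts as black when its projection value is > 0.
    """
    blocks: List[Run] = []
    blanks: List[Run] = []
    start = 0
    for x in range(1, len(projection) + 1):
        # Close the current run at the end of the profile or on a colour change.
        if x == len(projection) or (projection[x] > 0) != (projection[start] > 0):
            run = (start, x - start)
            (blocks if projection[start] > 0 else blanks).append(run)
            start = x
    return blocks, blanks


def allowable_sections(blocks: List[Run], blanks: List[Run],
                       P: int, T1: int, T2: int) -> List[Section]:
    """Apply conditions (1) and (2) and return the sections sorted by start."""
    # Condition (1): every blank is an allowable section.
    sections = [(s, s + w) for s, w in blanks]
    for s, w in blocks:
        # Condition (2): a block wider than P + T1, minus T2 at each end.
        # The check w > 2 * T2 is an assumption to keep the section non-empty.
        if w > P + T1 and w > 2 * T2:
            sections.append((s + T2, s + w - T2))
    return sorted(sections)
```

For example, with P = 20, T1 = 4, and T2 = 5, a block of width 50 starting at column 10 contributes the section (15, 55), while a block of width 3 contributes nothing.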

For the character string image of FIG. 1(a), the allowable sections satisfying conditions (1) and (2) are obtained as the sections labeled A1, A2, A3, A4, and A5 in FIG. 1(c). The thresholds T1 and T2 may be given as functions of the character pitch P. T1 may also be set on the basis of, for example, the estimation error incurred when the character pitch P is estimated as in Cited Example 1.

Character separation positions are then determined as follows. Within the allowable sections, separation candidate sections k (k ≥ 0) are set successively using the character pitch P and a threshold T3, as shown in FIG. 1(c). For the separation candidate positions x(k, i_k) in each section k (where i_k ≥ 1 is the relative index within section k), the distances d(k, k+1; i_k, i_{k+1}) between candidate positions are calculated. An evaluation measure U is computed from the variance σ_d² of these distances and the squared error (μ_d - P)² between their mean μ_d and the character pitch P, and the sequence of candidate positions x(k, i_k) (k ≥ 0) that minimizes U is obtained. An example of how the separation candidate sections k are set is described below with reference to FIG. 1(c).

The black dots in FIG. 1(c) mark the separation candidate positions x(k, i_k) in each separation candidate section k (k = 0, 1, 2, 3, 4). They are set successively as the positions within the allowable sections that satisfy equation (1) below.

$$|d(k, k+1; i_k, i_{k+1}) - P| \le T_3 \qquad (1)$$

In equation (1), the distance d(k, k+1; i_k, i_{k+1}) is the distance x(k+1, i_{k+1}) - x(k, i_k) between candidate position x(k+1, i_{k+1}) in section k+1 and candidate position x(k, i_k) in section k. For example, the distance d(1, 2; 1, 2) between candidate positions x(1, 1) and x(2, 2) in FIG. 1(c) satisfies |P - d(1, 2; 1, 2)| ≤ T3. Six positions, shown as white dots in FIG. 1(c), satisfy equation (1) with respect to the candidate position x(0, 1) of separation candidate section 0, but only two of them lie inside an allowable section, as the black dots in FIG. 1(c) show. The candidate positions of separation candidate section 1 are therefore the two positions x(1, 1) and x(1, 2).
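The candidate-setting step described above can be sketched as follows, purely as an illustration. The function name `next_candidates`, integer pixel positions, and half-open (start, end) allowable sections are assumptions of this sketch, not details given in the specification.

```python
from typing import List, Tuple


def next_candidates(current: List[int], sections: List[Tuple[int, int]],
                    P: int, T3: int) -> List[int]:
    """Candidate positions for section k+1 given those of section k.

    A position x_next qualifies when |(x_next - x) - P| <= T3 for some
    current candidate x (equation (1)) and x_next lies in an allowable
    section, given here as half-open (start, end) intervals.
    """
    found = set()
    for x in current:
        # All integer positions whose distance from x is within T3 of P.
        for pos in range(x + P - T3, x + P + T3 + 1):
            if any(a <= pos < b for a, b in sections):
                found.add(pos)
    return sorted(found)
```

With a single current candidate at 0, P = 20, T3 = 3, and allowable sections (19, 21) and (30, 40), only positions 19 and 20 survive the restriction to allowable sections.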

Like T1, the threshold T3 can be given as a function of the character pitch P. When the optimal separation candidate positions are sought with the evaluation measure U, the separation candidate section that serves as the terminal state may, for example, be restricted to the blank at the end of the character line, or may be set within any allowable section whose blank size Wi satisfies Wi > T4·P, where T4 is a threshold. With the latter rule, region E in FIG. 1(c), for example, is detected as an allowable section for the terminal state. In addition, when the next separation candidate section k+1 is being set from section k by equation (1) during the calculation of U, section k can be taken as the terminal-state section if none of the positions that satisfy equation (1) lies in an allowable section. The separation candidate section that serves as the start state, in turn, can be set from the position of a terminal-state section that has already been detected.
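The second terminal-state rule, Wi > T4·P, amounts to a simple filter over the blanks. A minimal sketch, assuming blanks are given as (start, width) pairs and using the hypothetical name `terminal_blanks`:

```python
from typing import List, Tuple


def terminal_blanks(blanks: List[Tuple[int, int]], P: float, T4: float) -> List[Tuple[int, int]]:
    """Return the blanks (start, width) wide enough to hold a terminal state."""
    # A blank qualifies when its width exceeds T4 times the character pitch.
    return [(start, width) for start, width in blanks if width > T4 * P]
```

With P = 20 and T4 = 2.0, a blank must be wider than 40 columns to qualify.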

It goes without saying that the separation candidate sections may also be set under conditions other than those described above, on the basis of the character pitch P, the character block widths Vi, and the blank sizes Wi.

FIG. 2 illustrates the principle by which the optimal character separation positions are extracted in the present invention.

In the figure, the black dots give the values of the separation candidate positions x(k, i_k) of the separation candidate sections k (k = 0, ..., 4) shown in FIG. 1(c), for a character pitch P of 20. To keep the explanation simple, the character string image to be separated is taken to extend from separation candidate section 0 to separation candidate section 4 of FIG. 1(c).

The notation is defined first. The symbol μ_d(r, n; i_r, i_n) (where r ≤ n) denotes the mean of the n - r distances d(r, r+1; i_r, i_{r+1}), d(r+1, r+2; i_{r+1}, i_{r+2}), ..., d(n-1, n; i_{n-1}, i_n) obtained from n - r + 1 candidate positions x(r, i_r), x(r+1, i_{r+1}), ..., x(n, i_n), one chosen arbitrarily in each separation candidate section from x(r, i_r) in section r to x(n, i_n) in section n. The symbol σ_d²(r, n; i_r, i_n) (where r ≤ n) denotes the variance of those same n - r distances about the mean μ_d(r, n; i_r, i_n).

The character separation positions from the start-state section r (r = 0 in FIG. 2) to the terminal-state section n (n = 4 in FIG. 2) are obtained by finding the candidate positions x(r, i_r), x(r+1, i_{r+1}), ..., x(n, i_n) that minimize the evaluation measure U of equation (2).

$$U(r, n) = \beta\,\sigma_d^2(r, n; i_r, i_n) + (1 - \beta)\,\bigl(\mu_d(r, n; i_r, i_n) - P\bigr)^2 \qquad (2)$$

The weighting coefficient β in equation (2) satisfies 0 ≤ β ≤ 1.
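For a complete sequence of chosen positions, equation (2) can be evaluated directly. The sketch below is illustrative only. It assumes that the variance is the population variance (divided by the number of distances), and the function name `evaluation_measure` is hypothetical.

```python
from typing import Sequence


def evaluation_measure(positions: Sequence[float], P: float, beta: float = 0.5) -> float:
    """Evaluate U of equation (2) for one sequence of separation positions."""
    if len(positions) < 2:
        raise ValueError("at least two positions are needed to form a distance")
    # Distances between consecutive separation positions.
    distances = [b - a for a, b in zip(positions, positions[1:])]
    mean = sum(distances) / len(distances)
    # Population variance of the distances (an assumption of this sketch).
    variance = sum((d - mean) ** 2 for d in distances) / len(distances)
    return beta * variance + (1 - beta) * (mean - P) ** 2
```

A perfectly regular sequence such as 0, 20, 40, 60 with P = 20 gives U = 0. Irregular spacing raises the variance term, and a mean spacing different from P raises the second term.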

The candidate positions that minimize equation (2) can be found in practice by dynamic programming, as described below, without consuming much memory. For any candidate position x(k+1, i_{k+1}) in separation candidate section k+1, consider the candidate positions x(k, i_k) (i_k = 1, 2, ..., h_k, with h_k ≥ 1) of the preceding section k, where section k = 0 is the separation candidate section containing the start state. The optimal predecessor x(k, i_k), that is, the one through which the path to x(k+1, i_{k+1}) minimizes equation (2), can be found with the recurrence relations given below.

First, the distances d(k, k+1; i_k, i_{k+1}) (i_k = 1, 2, ..., h_k) are obtained, and equations (3-1), (3-2), and (3-3) below are evaluated.

$$\mu_d(0, k+1; i_0, i_{k+1}) = \frac{1}{k+1}\Bigl(k\,\mu_d^{*}(0, k; i_0, i_k) + d(k, k+1; i_k, i_{k+1})\Bigr) \qquad (3\text{-}1)$$

$$D(k+1) = D^{*}(k) + d^{2}(k, k+1; i_k, i_{k+1}) \qquad (3\text{-}2)$$

$$U(0, k+1) = \beta\left(\frac{D(k+1)}{k+1} - \mu_d^{2}(0, k+1; i_0, i_{k+1})\right) + (1 - \beta)\bigl(\mu_d(0, k+1; i_0, i_{k+1}) - P\bigr)^{2} \qquad (3\text{-}3)$$

Among the h_k candidate positions x(k, 1), ..., x(k, h_k) of separation candidate section k, the position x(k, i_k) that minimizes the evaluation measure U(0, k+1) of equation (3-3) is the optimal predecessor of the given candidate position x(k+1, i_{k+1}) in separation candidate section k+1.
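One step of the recurrences (3-1) to (3-3) can be written as a small update function. This is a sketch under stated assumptions: the name `extend` and the argument layout are hypothetical, `k` is the number of distances accumulated so far, and the first term of (3-3) is taken to be the mean of squares minus the squared mean.

```python
from typing import Tuple


def extend(k: int, mean_k: float, D_k: float, d: float,
           P: float, beta: float) -> Tuple[float, float, float]:
    """Add one distance d to a path that already holds k distances.

    mean_k and D_k are the stored mean and the stored sum of squared
    distances of the path so far; returns the updated (mean, D, U).
    """
    mean = (k * mean_k + d) / (k + 1)                        # equation (3-1)
    D = D_k + d * d                                          # equation (3-2)
    variance = D / (k + 1) - mean * mean                     # first term of (3-3)
    U = beta * variance + (1 - beta) * (mean - P) ** 2       # equation (3-3)
    return mean, D, U
```

Starting from the initial state (k = 0, mean 0, D 0) with P = 20 and β = 0.5, a first distance of 20 gives (20.0, 400.0, 0.0), and a second distance of 22 then gives a mean of 21.0, D = 884.0, and U = 1.0. Only the mean and the sum of squares need to be stored per candidate, which is why the update runs in constant memory per position.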

For each candidate position x(0, i_0) in the start-state separation candidate section 0 (i_0 = 1 in FIG. 2), the optimal mean of equation (3-1) is initialized to μ_d*(0, 0; i_0, i_0) = 0, and the cumulative sum of squared optimal distances of equation (3-2) is initialized to D*(0) = 0. If each candidate position x(k, i_k) of section k stores the optimal mean μ_d*(0, k; i_0, i_k) and the cumulative sum D*(k) of the squares of the optimal distances d(k-1, k; i_{k-1}, i_k), the optimal predecessor x(k, i_k) in section k can be found for every candidate position x(k+1, i_{k+1}) of the next section k+1. Note that the first term of equation (3-3) is another way of writing the variance σ_d²(0, k+1; i_0, i_{k+1}) of equation (2).

The calculation of equations (3-1) to (3-3) is explained next with reference to FIG. 2. The values in parentheses in the figure are the mean μ_d(0, k; i_0, i_k) and the evaluation measure U(0, k) given by the recurrences (3-1) and (3-3) at each candidate position x(k, i_k) of each separation candidate section k (k = 0, 1, 2, 3, 4), computed as the optimal values over the predecessors x(k-1, i_{k-1}). In this explanation the weighting coefficient β of equation (3-3) is 0.5. The arrows in the figure show the optimal sequences of candidate positions. For example, candidate position x(2, 1) is at position 39, and its distance d(1, 2; 1, 1) from candidate position x(1, 1) is 19.

The mean μ_d(0, 2) at candidate position x(2, 1) along the path through x(1, 1) is therefore, from equation (3-1) and the figure, (1/2)(1 × 20 + 19) = 19.5. Candidate position x(1, 1) stores D*(1) = 20² from equation (3-2) (omitted in the figure), so equation (3-2) gives D(2) = 20² + 19². The evaluation measure U(0, 2) at x(2, 1) along the path through x(1, 1) then follows from equation (3-3) with β = 0.5, μ_d(0, 2) = 19.5, and P = 20, and its value is given as 0.38. The evaluation measure U(0, 2) at x(2, 1) along the path through x(1, 2) is computed in the same way (calculation omitted) and is 1.26. Taking the smaller of the two values of U(0, 2) for x(2, 1), the optimal predecessor in separation candidate section 1 is x(1, 1), and the mean μ_d(0, 2) = 19.5 and the evaluation measure U(0, 2) = 0.38 are selected.

The same operation is repeated with the recurrences (3-1), (3-2), and (3-3), so that the evaluation measure U(0, k) is computed at every candidate position x(k, i_k) (k = 0, 1, 2, 3, 4), as shown in FIG. 2.

Next, as described above, among the candidate positions x(4, 2), x(4, 3), and x(4, 4) in the terminal-state separation candidate section, the position x(4, 2), at which the evaluation measure U(0, 4) is smallest, is selected as the end position of character separation. Tracing the optimal sequence of candidate positions back from the end position x(4, 2) then yields x(4, 2) = 81, x(3, 3) = 60, x(2, 2) = 20, x(1, 1) = 20, and x(0, 1) = 0.
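The forward pass and the backward trace described above can be put together in a short sketch. It is an illustration under assumptions rather than the device itself: `best_sequence` is a hypothetical name, each separation candidate section is given as a list of positions, transitions are restricted by equation (1), and, as in the description, each candidate keeps only its single best predecessor.

```python
from typing import List, Optional, Tuple

# Per-candidate state: (U, mean, D, index of best predecessor)
State = Optional[Tuple[float, float, float, Optional[int]]]


def best_sequence(sections: List[List[int]], P: float, T3: float,
                  beta: float = 0.5) -> List[int]:
    """Choose one position per section by the recurrences (3-1) to (3-3)."""
    # Start state: mean 0 and sum of squares 0 for every candidate of section 0.
    states: List[List[State]] = [[(0.0, 0.0, 0.0, None) for _ in sections[0]]]
    for k in range(len(sections) - 1):
        layer: List[State] = []
        for x_next in sections[k + 1]:
            best: State = None
            for i, x in enumerate(sections[k]):
                prev = states[k][i]
                d = x_next - x
                # Skip unreachable predecessors and moves violating equation (1).
                if prev is None or abs(d - P) > T3:
                    continue
                _, mean_k, D_k, _ = prev
                mean = (k * mean_k + d) / (k + 1)                               # (3-1)
                D = D_k + d * d                                                 # (3-2)
                U = beta * (D / (k + 1) - mean * mean) + (1 - beta) * (mean - P) ** 2  # (3-3)
                if best is None or U < best[0]:
                    best = (U, mean, D, i)
            layer.append(best)
        states.append(layer)
    # Terminal state: the reachable candidate of the last section with minimal U.
    finals = [(s[0], i) for i, s in enumerate(states[-1]) if s is not None]
    if not finals:
        raise ValueError("no admissible sequence reaches the last section")
    _, i = min(finals)
    # Trace the stored predecessors back to section 0.
    path: List[int] = []
    for k in range(len(sections) - 1, -1, -1):
        path.append(sections[k][i])
        i = states[k][i][3]
    return path[::-1]
```

For the made-up sections [[0], [18, 20], [40], [57, 60]] with P = 20 and T3 = 3, the sketch returns [0, 20, 40, 60]. Because each candidate stores only its best predecessor, the work grows with the number of candidate pairs in adjacent sections rather than with the number of complete sequences.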

FIG. 3 is a logical block diagram of a specific embodiment of the present invention. A scanning unit 1 optically scans the character string image on the paper, converts it into an electrical signal, binarizes it, and writes it into a character string image memory 2. A character block extraction unit 3 extracts character blocks one after another from the character string image stored in the character string image memory 2 and stores the position, width, and height of each block in a character block information register 21. Such a character block extraction unit 3 can be realized with known techniques. A character pitch detection unit 4 estimates the character pitch P from the position and width of each character block stored in the character block information register 21, and from the character height, and stores it in a character pitch information register 22. The character pitch detection unit 4 can be realized with the technique described in the specification of Cited Example 1 by the same applicant; if the character pitch P is known in advance, the given value may be used instead. A parameter information register 30 stores the parameters used in the present invention, namely the thresholds T1, T2, T3, T4, and T5 and the weighting coefficient β.

An allowable section extraction unit 5 extracts the allowable sections that satisfy conditions (1) and (2) described with reference to FIG. 1. First, for the blank allowable sections of condition (1), the positions and sizes of the blanks are extracted, by comparison circuits or the like, from the positions and widths Vi of the character blocks stored in the character block information register 21. Next, for the allowable sections inside black regions of condition (2), each character block width Vi is compared with the sum P + T1 of the character pitch P stored in the character pitch information register 22 and the parameter T1 stored in the parameter information register 30. If Vi is larger, the part of the character block that remains after excluding the portion within the parameter T2 (stored in the parameter information register 30) of each end is extracted as an allowable section. The positions and widths of the allowable sections extracted in this way, both the blanks satisfying condition (1) and the sections inside black regions satisfying condition (2), are stored in an allowable section information register 23.

A terminal candidate section extraction unit 6 works through the allowable sections stored in the allowable section information register 23 for the character line image. For each blank allowable section Wi, it computes the product T4·P of the parameter T4 stored in the parameter information register 30 and the character pitch P, compares the product with the blank size Wi, and detects the allowable sections Wi that are larger than T4·P. It then takes the allowable section extending from the start of Wi up to the product T5·P of the parameter T5 and the character pitch P (where T5 ≤ T4), and also the allowable sections within P + T1 of the start of the character block Vi immediately preceding Wi. The logical OR of these two allowable sections is stored, in turn, in a terminal candidate section register 24 as a terminal candidate section.

FIGS. 4(a) and 4(b) show examples of the terminal candidate sections extracted by the terminal candidate section extraction unit 6. In FIG. 4(a), the terminal section is the section labeled T5·P in the figure. In FIG. 4(b), the terminal section is the section marked by the last arrow in the figure, namely the part of the logical OR of T5·P and P + T1 that is a blank allowable section.

A separation candidate section extraction unit 7 extracts, one section after another, the candidate positions x(k, i_k) of the separation candidate sections k described with reference to FIG. 1(c), using the allowable sections and parameters stored in the allowable section information register 23 and the parameter information register 30. The candidate positions x(0, i_0) (i_0 = 1, 2, ..., h_0) of the starting separation candidate section 0, which contains the character separation start position, are assumed to have been determined first by a control unit 10 and stored in an optimum separation position information register 26. The control unit 10 takes them from the positions inside the blank allowable sections that lie within a fixed range, set on the basis of the character pitch P, from the start of the character string image. From the candidate positions x(k, i_k) (i_k = 1, 2, ..., h_k) of a section k (k = 0, 1, 2, ...) that have already been extracted and stored in the optimum separation position information register 26, the separation candidate section extraction unit 7 computes the candidate positions x(k+1, i_{k+1}) that satisfy equation (1) and lie inside an allowable section.

That is, from the first separation candidate position x(k, 1) of separation candidate section k, the position x(k, 1) + P - T3 is calculated using the character pitch P and the parameter T3 stored in the parameter information register 30, and from the last separation candidate position x(k, h_k) of separation candidate section k, the position x(k, h_k) + P + T3 is calculated using the character pitch P and the parameter T3. Of the positions in the interval bounded by these two positions, x(k, 1) + P - T3 and x(k, h_k) + P + T3, those that also belong to the permissible intervals described above (obtained by taking the logical AND) are extracted as the separation candidate positions x(k+1, i_{k+1}) (where i_{k+1} = 1, 2, ..., h_{k+1}) of separation candidate section k+1 and stored in the separation candidate position information register 25.

When the contents of the separation candidate position information register 25 are input to the evaluation scale calculation unit 8, the optimal separation position information register 26 already holds the separation candidate positions of the sections processed so far, from separation candidate section 0 to separation candidate section k: x(0, i_0) (where i_0 = 1, ..., h_0), x(1, i_1) (where i_1 = 1, ..., h_1), ..., x(k, i_k) (where i_k = 1, ..., h_k). In addition, for each separation candidate position x(m, i_m) (where i_m = 1, ..., h_m) of a separation candidate section m (m = 0, ..., k), the register holds the following values calculated by the evaluation scale calculation unit 8: the mean μ_d*(0, m; i_0, i_m) calculated by formula (3-1), the cumulative sum of squared distances D*(k) calculated by formula (3-2), the evaluation scale U(0, m) calculated by formula (3-3), and the optimal separation candidate position x(m-1, i_{m-1}) of the immediately preceding separation candidate section m-1. When the control unit 10 stores the separation candidate positions x(0, i_0) of separation candidate section 0, the mean μ_d*(0, 0; i_0, i_0) and the cumulative sum of squared distances D*(0) stored for each position x(0, i_0) are assumed to be 0.

For each separation candidate position x(k+1, i_{k+1}) transferred sequentially from the separation candidate position information register 25, the evaluation scale calculation unit 8 first calculates the distance d(k, k+1; i_k, i_{k+1}) to each separation candidate position x(k, i_k) (where i_k = 1, 2, ..., h_k) of separation candidate section k stored in the optimal separation position information register 26.
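As a rough illustration of the windowing step described above (bounds x(k, 1) + P - T3 and x(k, h_k) + P + T3, intersected with the permissible intervals), the following Python sketch may help. The function name `next_candidates`, the use of integer pixel columns, and the representation of permissible intervals as inclusive `(start, end)` pairs are assumptions made for this example only; the patent describes register hardware rather than this code.

```python
def next_candidates(prev_candidates, pitch, t3, permissible):
    """Return candidate positions for section k+1 (illustrative sketch).

    prev_candidates: sorted positions x(k, 1) .. x(k, h_k) of section k
    pitch:           estimated character pitch P
    t3:              tolerance parameter T3
    permissible:     list of inclusive (start, end) blank intervals (assumed form)
    """
    lo = prev_candidates[0] + pitch - t3    # x(k, 1) + P - T3
    hi = prev_candidates[-1] + pitch + t3   # x(k, h_k) + P + T3
    out = []
    for start, end in permissible:
        # Logical AND of the window [lo, hi] with each permissible interval.
        a, b = max(start, lo), min(end, hi)
        out.extend(range(a, b + 1))         # empty when a > b
    return out


candidates = next_candidates([10, 12], pitch=20, t3=3,
                             permissible=[(0, 5), (28, 31), (34, 40)])
print(candidates)
```

With these inputs the window is [27, 35], so only the columns of the second interval and the first two columns of the third interval survive.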

Furthermore, using the mean μ_d*(0, k; i_0, i_k), the cumulative sum of squared distances D*(k), and the parameter β stored in the parameter information register 30, the evaluation scale calculation unit 8 evaluates in turn the recurrence of formula (3-1) for the mean μ_d(0, k+1; i_0, i_{k+1}) and the recurrence of formula (3-3) for the evaluation scale U(0, k+1), which weights a variance term in μ_d(0, k+1; i_0, i_{k+1}) by β and the squared deviation (μ_d(0, k+1; i_0, i_{k+1}) - P)^2 by (1 - β). This yields the evaluation scale U(0, k+1) for each separation candidate position x(k, i_k) (where i_k = 1, ..., h_k) of the immediately preceding separation candidate section k.
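The recurrences named above do not survive legibly in this text. One reading that is consistent with the quantities the description names (a running mean over k+1 distances, a cumulative sum of squared distances, and a weight β between a variance term and the squared deviation from the pitch P) is sketched below. It is offered as a plausible reconstruction under those assumptions, not as a quotation of formulas (3-1) to (3-3); the index pairs (i_0, i_k) are omitted for brevity.

```latex
\begin{align}
d &= x(k+1, i_{k+1}) - x(k, i_k) \\
\mu_d(0, k+1) &= \frac{k\,\mu_d^{*}(0, k) + d}{k+1} \\
D(k+1) &= D^{*}(k) + d^{2} \\
U(0, k+1) &= \beta\left(\frac{D(k+1)}{k+1} - \mu_d(0, k+1)^{2}\right)
            + (1-\beta)\left(\mu_d(0, k+1) - P\right)^{2}
\end{align}
```

Under this reading the first term of U is the variance of the distances along the path and the second penalizes a mean spacing that drifts away from the character pitch.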

Next, among the h_k separation candidate positions x(k, i_k), the position for which the evaluation scale U(0, k+1) is smallest is taken as the optimal preceding separation candidate position x(k, i_k) from which x(k+1, i_{k+1}) is reached. The minimum value of the evaluation scale U(0, k+1), together with the mean μ_d*(0, k+1; i_0, i_{k+1}) and the cumulative sum of squared distances D*(k+1) that give this minimum, are stored in the optimal separation position information register 26 along with the separation candidate position x(k+1, i_{k+1}).
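The selection of the best preceding position can be pictured as one step of a dynamic programming pass. The sketch below is a minimal illustration, assuming the evaluation scale is a β-weighted sum of the variance of the distances and the squared deviation of their mean from P; the function name `best_predecessor` and the tuple layout of `prev_states` are hypothetical and chosen only for this example.

```python
def best_predecessor(x_next, prev_states, k, pitch, beta):
    """Pick the best predecessor in section k for one candidate of section k+1.

    prev_states: list of (x, mean, sq_sum) per candidate of section k, where
                 mean and sq_sum are the stored optimal statistics over the
                 k distances on the path leading to x (assumed layout).
    Returns (index, U, new_mean, new_sq_sum) of the minimizing predecessor.
    """
    best = None
    for i, (x, mean, sq_sum) in enumerate(prev_states):
        d = x_next - x                           # distance d(k, k+1; i_k, i_{k+1})
        new_mean = (k * mean + d) / (k + 1)      # running mean over k+1 distances
        new_sq = sq_sum + d * d                  # cumulative sum of squared distances
        variance = new_sq / (k + 1) - new_mean ** 2
        u = beta * variance + (1 - beta) * (new_mean - pitch) ** 2
        if best is None or u < best[1]:          # keep the minimum, as the comparator does
            best = (i, u, new_mean, new_sq)
    return best


# Section 0 candidates carry zeroed statistics, as the description assumes.
states = [(8, 0, 0), (10, 0, 0), (13, 0, 0)]
print(best_predecessor(30, states, k=0, pitch=20, beta=0.5))
```

Here the candidate at 10 wins because its distance to 30 equals the pitch exactly. Note that only the statistics of the best path into each position are kept, which is what makes the stage-by-stage update possible.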

When the evaluation scale calculation unit 8 has performed the above processing for all separation candidate positions x(k+1, i_{k+1}) transferred sequentially from the separation candidate position information register 25, the control unit 10 requests the separation candidate section extraction unit 7 to extract the separation candidate positions x(k+2, i_{k+2}) of the next separation candidate section k+2, and the same operations are repeated. At this point the control unit 10 checks whether a separation candidate position x(k+1, i_{k+1}) of separation candidate section k+1, transferred to the optimal separation position information register 26 by the evaluation scale calculation unit 8, has reached the terminal candidate section stored in the terminal candidate section register 24. If it has not, the control unit 10 only issues the above request to the separation candidate section extraction unit 7.

If, on the other hand, a separation candidate position x(k+1, i_{k+1}) has reached the terminal candidate section, the control unit 10 issues the above request to the separation candidate section extraction unit 7, and the separation candidate positions x(k+2, i_{k+2}) (where i_{k+2} = 1, ..., h_{k+2}) of the next separation candidate section k+2 are evaluated by the evaluation scale calculation unit 8 as described above. The control unit 10 then examines, among the separation candidate positions stored in the optimal separation position information register 26, the several candidates x(n, i_n) lying within the terminal candidate section, and detects the one whose evaluation scale U(0, n) is smallest as the end point of the evaluated section. The sequence of optimal separation candidate positions leading to this end point x(n, i_n) is extracted by tracing back through the optimal separation position information register 26 from x(n, i_n) through x(n-1, i_{n-1}), ..., x(0, i_0), and is stored in the character separation position register 27.
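The end-point selection and backward trace described above can be sketched as follows. This is only an illustration: the names `trace_path`, `positions`, `back` and `scores` are hypothetical, and the sketch treats every candidate of the last section as lying in the terminal candidate section, which simplifies the check the control unit performs.

```python
def trace_path(positions, back, scores):
    """Recover the optimal sequence of separation positions (illustrative).

    positions[k][i]: candidate position x(k, i) of section k
    back[k][i]:      index of the best predecessor in section k-1 (None for k = 0)
    scores[i]:       evaluation scale U(0, n) of candidate i in the last section n
    """
    n = len(positions) - 1
    # End point: the terminal candidate with the smallest evaluation scale.
    j = min(range(len(positions[n])), key=lambda i: scores[i])
    path = []
    for k in range(n, -1, -1):
        path.append(positions[k][j])   # x(n, i_n), x(n-1, i_{n-1}), ..., x(0, i_0)
        j = back[k][j]                 # follow the stored link to the previous section
    return path[::-1]                  # report positions in reading order


positions = [[0, 2], [19, 22, 25], [40, 44]]
back = [[None, None], [0, 1, 1], [0, 1]]
print(trace_path(positions, back, scores=[0.5, 1.0]))
```

The stored links play the role of the connection information kept per candidate, so the trace needs no recomputation of distances.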

Next, within the blank permissible interval extending from the end point x(n, i_n) to the start of the first character block detected after it, the control unit 10 takes the fixed range set on the basis of the character pitch P as separation candidate section 0, the starting section of the next partial character string image to be separated, stores it in the optimal separation position information register 26, and commands that the operations described above be performed again.

In this way, the character separation positions of the character string image stored in the character string image memory 2 are stored in the character separation position register 27. By using the height of each character block stored in the character block information register 21 together with the character separation positions stored in the character separation position register 27, the image can be separated into individual characters.

FIG. 5 is a logical block diagram showing a specific embodiment of the evaluation scale calculation unit 8 in FIG. 3.

As described above, when the separation candidate positions x(k+1, i_{k+1}) (where i_{k+1} = 1, ..., h_{k+1}) of separation candidate section k+1 have been stored in the separation candidate position information register 25, the control unit 10 shown in FIG. 3 transfers a separation candidate position x(k+1, i_{k+1}) to the distance calculation unit 81 and to a predetermined location of the separation candidate position group register 261, and the section number k+1 is stored in the stage register 80 and in a predetermined location of the separation candidate position group register 261.

When the separation candidate position x(k+1, i_{k+1}) has been stored in the distance calculation unit 81, the control unit 10 sequentially transfers to the distance calculation unit 81 the separation candidate positions x(k, i_k) (where i_k = 1, ..., h_k) of separation candidate section k stored in the separation candidate position group register 261. Here, the optimal separation position information register 26 shown in FIG. 3 is composed of the registers 261 through 264. The distance calculation unit 81 calculates the distance between the separation candidate position x(k+1, i_{k+1}) and each sequentially transferred separation candidate position, d(k, k+1; i_k, i_{k+1}) = x(k+1, i_{k+1}) - x(k, i_k).

The statistics calculation unit 82 calculates the mean μ_d(0, k+1; i_0, i_{k+1}) and the cumulative sum of squared distances D(k+1) on the basis of the recurrences given in formulas (3-1) and (3-2). That is, the mean μ_d(0, k+1; i_0, i_{k+1}) is calculated from the mean μ_d*(0, k; i_0, i_k) at the separation candidate position x(k, i_k) read from the optimal statistics group register 263, the output d(k, k+1; i_k, i_{k+1}) of the distance calculation unit 81, and the section numbers k+1 and k given by the contents of the stage register 80. The cumulative sum of squared distances D(k+1) is calculated from the cumulative sum of squared distances D*(k) at the separation candidate position x(k, i_k) read from the optimal statistics group register 263 and the output d(k, k+1; i_k, i_{k+1}) of the distance calculation unit 81. The mean μ_d(0, k+1; i_0, i_{k+1}) and the cumulative sum of squared distances D(k+1) calculated by the statistics calculation unit 82 are each stored in the statistics storage register 83.

The evaluation value calculation unit 84 calculates the value of the evaluation scale U(0, k+1) on the basis of formula (3-3). That is, the evaluation value U(0, k+1) is calculated from the character pitch P shown in FIG. 3, the parameter β stored in the parameter information register 30, the contents of the statistics storage register 83, and the contents of the stage register 80. Next, the comparison unit 85 compares the evaluation value output by the evaluation value calculation unit 84 with the contents of the minimum evaluation value register 86, and if the output of the evaluation value calculation unit 84 is smaller than the contents of the minimum evaluation value register 86, it sets the output signal 851S on its output signal line 851 to "ON". The minimum evaluation value register 86 is assumed to be initialized to a very large value.

When the output signal 851S becomes "ON", the gate circuit 53 opens and the output of the evaluation value calculation unit 84 is transferred to the minimum evaluation value register 86.

Also, when the output signal 851S becomes "ON", the gate circuit 52 opens and the mean μ_d(0, k+1; i_0, i_{k+1}) and the cumulative sum of squared distances D(k+1) stored in the statistics storage register 83 are stored in the minimum statistics register 88.

Further, when the output signal 851S becomes "ON", the gate circuit 51 opens and the position information k and i_k of the separation candidate position x(k, i_k) of separation candidate section k that was transferred to the distance calculation unit 81 is stored in the connection information register 87.

The above operations are performed for every separation candidate position x(k, i_k) (where i_k = 1, ..., h_k) of separation candidate section k stored in the separation candidate position group register 261.

Next, the control unit 10 shown in FIG. 3 transfers the contents of the minimum statistics register 88 to the optimal statistics group register 263 as the optimal mean μ_d*(0, k+1; i_0, i_{k+1}) and the optimal cumulative sum of squared distances D*(k+1) for the separation candidate position x(k+1, i_{k+1}) that was transferred from the separation candidate position information register 25 to the distance calculation unit 81. It transfers the contents of the minimum evaluation value register 86 to the optimal evaluation value group register 264 as the optimal evaluation value of the separation candidate position x(k+1, i_{k+1}). It further transfers the contents of the connection information register 87 to the connection information group register 262 as the optimal separation path information from separation candidate section k to the separation candidate position x(k+1, i_{k+1}).

Next, the minimum evaluation value register 86 is reset to its initial value (a very large value).

By repeating the above operations, the optimal evaluation value and the optimal separation path are obtained for every separation candidate position x(k+1, i_{k+1}) (where i_{k+1} = 1, 2, ..., h_{k+1}) of separation candidate section k+1.

It goes without saying that, as another specific way of implementing the present invention, it can also be realized using an ordinary microcomputer.

As described above, by applying the present invention, separation into individual characters can be performed easily and stably even when adjacent characters touch one another or when a single character is split into two or more parts.

[Brief explanation of the drawing]

FIG. 1 is a diagram for explaining, using an example of a partial character string image, an example of the method of setting separation candidate sections for character separation in the present invention. FIG. 2 is a diagram for explaining the principle of extracting the optimal character separation positions in the present invention. FIG. 3 is a logical block diagram showing a specific embodiment of the present invention. FIG. 4 shows an example of the terminal candidate section extracted by the terminal candidate section extraction unit 6 in FIG. 3. FIG. 5 is a logical block diagram showing a specific embodiment of the evaluation scale calculation unit 8 in FIG. 3.

In the figures, 1 is a scanning unit, 2 is a character string image memory, 3 is a character block extraction unit, 21 is a character block information register, 4 is a character pitch detection unit, 22 is a character pitch information register, 30 is a parameter information register, 5 is a permissible interval extraction unit, 23 is a permissible interval information register, 6 is a terminal candidate section extraction unit, 24 is a terminal candidate section register, 7 is a separation candidate section extraction unit, 25 is a separation candidate position information register, 8 is an evaluation scale calculation unit, 26 is an optimal separation position information register, 27 is a character separation position register, 10 is a control unit, 80 is a stage register, 81 is a distance calculation unit, 261 is a separation candidate position group register, 262 is a connection information group register, 263 is an optimal statistics group register, 264 is an optimal evaluation value group register, 82 is a statistics calculation unit, 83 is a statistics storage register, 84 is an evaluation value calculation unit, 85 is a comparison unit, and 86 is a minimum evaluation value register.

Claims

[Claims]

A character separation device that separates a character line image into individual characters using a projection distribution obtained from the series of character line images, comprising: means for sequentially extracting, from the projection distribution, a plurality of character blocks that can be separated by blanks; means for estimating a character pitch using the plurality of character blocks; means for setting separation candidate sections in the character line image using the projection distribution and the character pitch; means for calculating the distance between each separation candidate position in a separation candidate section and the separation candidate positions in the adjacent separation candidate section; and means for calculating, over the character line image, the sequence of optimal separation candidate positions that minimizes an evaluation scale configured on the basis of the average value of the distances and its difference from the character pitch.