JPS60142784A - Character separating system

Info

Publication number: JPS60142784A
Application number: JP58246709A
Authority: JP
Inventors: Toshio Matsuura (松浦　俊夫); Katsuhiko Nishikawa (克彦西川); Akira Inoue (彰井上); Tomomitsu Murano (朋光村野); Kiyoshi Iwata (清岩田)
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1983-12-29
Filing date: 1983-12-29
Publication date: 1985-07-27

Abstract

PURPOSE: To separate the pattern of a single character from a joined handwritten character pattern by providing a means for detecting the end point at which a singular line segment contacts the character pattern, and a character separation section that separates the character pattern at that end point. CONSTITUTION: A band of threshold width α is set on either side of the center line C of the horizontal extent of the character area. The band is scanned horizontally, and the center position of each part bounded by black points is taken as a singular line segment. Singular line segments are obtained in this way from the upper range and from the lower range; when several exist, the one closest to the center line C is adopted. When the singular line segment from the upper range and the one from the lower range each contact the character pattern, the pattern is separated at the contact point closer to the center line C, as shown in (D). Joined handwritten character patterns are thus separated accurately by comparatively simple means.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical field of the invention]

The present invention relates to a character separation method for separating handwritten alphanumeric characters that have been written joined to one another.

[Conventional technology and problems]

Conventional handwritten character recognition technology recognizes characters written inside a character frame provided for each character, so it could only recognize characters written in such fixed frames. Because each character was written inside its own frame, adjacent characters were never joined.

In logic circuit diagrams and similar drawings, however, the names of circuit elements, input signal labels, and the like are written by hand. When such a logic circuit diagram is input to a data processing device, the handwritten characters cannot be recognized, because they are not written inside special character frames and adjacent characters may run together. Conventionally, therefore, the circuit designer who created a drawing had to punch the handwritten symbols onto punch cards as data and input them to the data processing device through a card reader.

This burdens circuit designers with entering the handwritten characters as data in addition to creating the drawings, so there is demand for a device that can read the handwritten portions automatically.

[Purpose of the invention]

The object of the present invention is to provide a character separation method that, as required for recognizing such handwritten portions, takes data in which characters are joined to one another within a character string written without any character frame, separates it, and makes it possible to extract the individual characters.

[Structure of the invention]

To achieve this object, the character separation method of the present invention comprises: handwritten character holding means for holding joined handwritten characters; singular line segment extraction means for extracting singular line segments by scanning within a fixed range from the estimated character center position of the joined handwritten character area; singular line segment end point detection means for detecting the end point at which a singular line segment contacts the character pattern; and a character separation processing unit that separates the character pattern at that end point. The method is characterized in that the character pattern of one character is separated from the joined handwritten character pattern.

[Embodiments of the invention]

Before the present invention is described in detail on the basis of an embodiment, its principle of operation is explained.

(1) First, the drawing is scanned to detect handwritten character groups. In a handwritten character group the black areas are clustered to some degree compared with graphic patterns and line portions, so such groups can be identified and extracted easily.

(2) After a handwritten character group has been extracted in this way, it is scanned horizontally as shown in FIG. 1(a), and the height H (vertical length) and the length L of the character group are obtained as shown in FIG. 1(b).

Then, as shown in FIG. 1(b), this area is scanned vertically to separate the characters that can be separated. As a result, A, ESC, and D in FIG. 1 can each be separated individually, but "4N" is joined and is therefore extracted as a single unit, as shown in FIG. 2(a).
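The vertical scan of step (2) can be read as a column projection: any column that contains no black pixel is a gap between separable characters. The following is a minimal Python sketch under that reading, assuming the character group is a binary image stored as a list of rows with 1 for black and 0 for white. The function name `split_by_blank_columns` is hypothetical and does not appear in the patent.

```python
def split_by_blank_columns(img):
    """Return (start, end) column ranges, end exclusive, of runs that contain ink.

    img is a list of equal-length rows; 1 = black pixel, 0 = white pixel.
    """
    width = len(img[0]) if img else 0
    # A column "has ink" if any row holds a black pixel in that column.
    has_ink = [any(row[x] for row in img) for x in range(width)]

    runs = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a new run of inked columns begins
        elif not ink and start is not None:
            runs.append((start, x))        # a blank column closes the run
            start = None
    if start is not None:
        runs.append((start, width))        # run reaches the right edge
    return runs
```

A joined pair such as "4N" has no blank column between its two characters, so it comes back as one run; that run is what the later steps split.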

(3) As shown in FIG. 1(c), the width-to-height ratio of an ordinary character is about 2 to 3. In this case, therefore, the ratio of the height YW to the width XW in FIG. 2(a) identifies the block as two joined characters.
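The ratio test of step (3) amounts to simple arithmetic: a single character of height YW is expected to be about (2/3)·YW wide, so the number of joined characters is roughly XW divided by that width. A minimal sketch follows. The function name `estimate_joined_count` and the use of rounding to the nearest integer are assumptions made for illustration; the patent states only the 2:3 ratio.

```python
def estimate_joined_count(xw, yw, aspect=2 / 3):
    """Estimate how many characters are joined in a block XW wide and YW tall.

    aspect is the assumed width-to-height ratio of one character (about 2:3).
    """
    if xw <= 0 or yw <= 0:
        raise ValueError("xw and yw must be positive")
    single_width = yw * aspect             # expected width of one character
    return max(1, round(xw / single_width))
```

For a block 40 pixels wide and 30 pixels tall, the expected single-character width is 20, so the block is treated as two joined characters.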

(4) Next, as shown in FIG. 2(b), a band of threshold width α is set on either side of the center line C of the horizontal extent of this character area. The band is scanned horizontally, and the center position of each part lying between one black point and the next is taken as a singular line segment. When no black area exists on one side or on both sides, the center position between the edge of the α band and the black point, or between the two edges of the band, is taken as the singular line segment.

Singular line segments are obtained in this way from the upper part and from the lower part. When several singular line segments exist, the one closest to the center line C is kept.
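Step (4) can be sketched for a single scan row as follows. The sketch assumes a binary row (1 for black, 0 for white), an integer center column `c`, and an integer half-width `alpha`. Every run of white pixels inside the band yields one candidate midpoint, whether the run is bounded by black pixels or by the band edge, and the candidate nearest to `c` is kept. The names `row_candidates` and `nearest_to_center` are hypothetical, and treating the band-edge pixel itself as the boundary of a run is a simplifying assumption.

```python
def row_candidates(row, c, alpha):
    """Midpoints of the white gaps inside the band [c - alpha, c + alpha]."""
    left = max(0, c - alpha)
    right = min(len(row) - 1, c + alpha)
    mids = []
    start = None
    for x in range(left, right + 1):
        if row[x] == 0 and start is None:
            start = x                          # a white gap begins
        elif row[x] == 1 and start is not None:
            mids.append((start + x - 1) / 2)   # gap closed by a black pixel
            start = None
    if start is not None:
        mids.append((start + right) / 2)       # gap closed by the band edge
    return mids


def nearest_to_center(mids, c):
    """Keep the candidate closest to the center line, or None if there is none."""
    return min(mids, key=lambda m: abs(m - c)) if mids else None
```

When the band holds no black pixel at all, the only candidate is the midpoint of the band, which matches the case in the text where neither side has a black area.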

(5) When the singular line segment from above and the singular line segment from below each contact the character pattern, the character pattern is separated at the contact point closer to the center line C, as shown in FIG. 2(d).

An embodiment of the present invention is described below on the basis of FIGS. 3 to 5, with reference to FIG. 2.

In the figures, 10 is an image memory, 11 is a rectangular area buffer memory, 12 is a threshold holding register, 13 is an address control unit, 14 is a singular line segment extraction unit, 15 is a singular line segment end point detection unit, and 16 is a single-character separation processing unit.

The image memory 10 stores the video input signal and holds the entire original image.

The rectangular area buffer memory 11 is a buffer memory that receives the data judged to be joined, as shown in FIG. 2(a).

The threshold holding register 12 is a register in which the threshold α shown in FIG. 2(b) is written. The threshold α is determined by simulation and is set, for example, to about 10 to 20% of the width XW.
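As a worked illustration of the 10 to 20% figure, the band scanned around the center line can be computed from XW as follows. This is only a sketch: the function name `threshold_band`, the default ratio of 0.15, and the rounding to whole pixel columns are assumptions, since the patent says only that α is chosen by simulation.

```python
def threshold_band(xw, ratio=0.15):
    """Return (c, alpha, (left, right)) for a block XW pixels wide.

    ratio is the assumed fraction of XW used as the threshold alpha.
    """
    if not 0 < ratio < 0.5:
        raise ValueError("ratio must lie between 0 and 0.5")
    c = xw // 2                               # center column of the block
    alpha = max(1, round(xw * ratio))         # half-width of the band
    left = max(0, c - alpha)
    right = min(xw - 1, c + alpha)
    return c, alpha, (left, right)
```

For XW = 40, a ratio of 0.1 gives a band from column 16 to column 24, and a ratio of 0.2 gives a band from column 12 to column 28.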

The address control unit 13 generates the addresses needed to cut the area shown in FIG. 2(a) out of the image memory 10 and store it in the rectangular area buffer memory 11, to read out the character pattern stored in the rectangular area buffer memory 11, and to cut out the areas extending α to the left and to the right of the center line C.

The singular line segment extraction unit 14 extracts the singular line segments within the area set by the threshold α. Here a singular line segment indicates a midpoint with respect to the character patterns. As shown in FIG. 4, in the part β1 no character pattern exists within the area, so the midpoint of the area, ■, is the singular line segment. In the part β2 only the character pattern P2 exists, so the singular line segments are ■, the midpoint between the left edge of the area and the left side of P2, and ■′, the midpoint between the right side of P2 and the right edge of the area. In the part β3 the singular line segments are ■, the midpoint between the right side of the character pattern P1 and the left side of the character pattern P2; ■′, the midpoint between the right edge of the area and the left side of P1; and the ■′ noted above.

When several singular line segments exist, the one closest to the center line C of the area is adopted.

The singular line segment end point detection unit 15 finds the end point at which a singular line segment contacts the character pattern. As shown in FIG. 5, for example, it finds the end point E1 at which the singular line segment ■ contacts the character pattern P, and the end point E2 at which the singular line segment ■ contacts the character pattern P. When several singular line segment end points exist on the upper side or on the lower side, the end point detected is that of the singular line segment closest to the center line C.
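End point detection can be sketched in a simplified form. The sketch below assumes that the singular line segment runs straight along one column `x` of a binary image (1 for black), starting from the top or the bottom edge, and that its end point is the first black pixel it meets. The patent builds the segment from row-by-row midpoints, so the fixed column is a simplification, and the name `end_point` is hypothetical.

```python
def end_point(img, x, from_top=True):
    """Return (x, y) of the first black pixel met along column x, or None.

    from_top=True walks downward from the top edge (an E1-style end point);
    from_top=False walks upward from the bottom edge (an E2-style end point).
    """
    height = len(img)
    rows = range(height) if from_top else range(height - 1, -1, -1)
    for y in rows:
        if img[y][x] == 1:
            return (x, y)
    return None                                # the column never touches ink
```

A return value of None means the column crosses the block without touching the pattern, in which case the characters are not joined along that column.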

The single-character separation processing unit 16 separates the joined character pattern P. Of the singular line segment end points E1 and E2, it takes the one closer to the center line C and separates the character pattern along a straight line that passes through that end point parallel to the center line C. In the example of FIG. 5, the character pattern P is separated along the straight line dropped from the end point E1.
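The separation itself can be sketched as a vertical cut at the column of the chosen end point. The sketch assumes end points given as (x, y) tuples, a binary image stored as a list of rows, and a center column `c`. The function name `separate` is hypothetical, and assigning the cut column to the right-hand piece is an arbitrary choice that the patent does not specify.

```python
def separate(img, e1, e2, c):
    """Cut img at the column of whichever end point lies nearer to column c.

    e1 and e2 are (x, y) tuples or None. Returns (cut_x, left, right).
    """
    candidates = [e for e in (e1, e2) if e is not None]
    if not candidates:
        raise ValueError("no end point available")
    # Pick the end point whose column is closest to the center line.
    cut_x = min(candidates, key=lambda e: abs(e[0] - c))[0]
    left = [row[:cut_x] for row in img]       # columns 0 .. cut_x - 1
    right = [row[cut_x:] for row in img]      # columns cut_x .. end
    return cut_x, left, right
```

Each returned piece can then be passed to a recognizer as an individual character pattern.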

Next, the operation of the embodiment configured as in FIG. 3 is described.

(a) The image data held in the image memory 10 is processed by a graphics processing device (not shown) to extract a group of character patterns as shown in FIG. 1(b). This group is scanned in the vertical direction to detect the portion W, two characters wide, that cannot be separated.

To determine whether the character patterns in this portion are in fact joined, a scan in the YW direction shown in FIG. 2(a) detects the point S at which the character pattern is first touched. The outline of the character pattern is then traced from S by a well-known method. If the trace passes through nearly the full width XW and returns to the starting point S, it can be concluded that a joined character pattern exists in the area XW by YW.
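The patent relies on a well-known outline tracing method for this check and does not name it. As a stand-in, the same question (does the pattern touched at S extend across nearly the full width XW?) can be answered with a flood fill over 8-connected black pixels. The sketch below substitutes that technique for outline tracing; the function name `spans_full_width` and the 0.9 width ratio are assumptions.

```python
from collections import deque


def spans_full_width(img, start, min_ratio=0.9):
    """True if the black component containing start covers most of the width.

    img is a list of rows (1 = black); start is an (x, y) tuple on the pattern.
    """
    h, w = len(img), len(img[0])
    sx, sy = start
    if img[sy][sx] != 1:
        return False
    seen = {(sx, sy)}
    queue = deque([(sx, sy)])
    min_x = max_x = sx
    while queue:
        x, y = queue.popleft()
        min_x, max_x = min(min_x, x), max(max_x, x)
        # Visit the 8 neighbours of the current pixel.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (0 <= nx < w and 0 <= ny < h
                        and (nx, ny) not in seen and img[ny][nx] == 1):
                    seen.add((nx, ny))
                    queue.append((nx, ny))
    return (max_x - min_x + 1) >= min_ratio * w
```

If the component reached from S covers only part of the width, the block holds separate patterns rather than one joined pattern.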

(b) When it has been judged in this way that two character patterns are joined in the area of FIG. 2(a), the area is written into the rectangular area buffer memory 11. Then, as shown in FIG. 2(b), an area within the range α written in the threshold holding register 12 is set about the center line C-C of this area, and the singular line segment extraction unit 14 obtains singular line segments such as ■, ■, ■, ... shown in FIG. 4, in the upper part and in the lower part.

At this point, singular line segments such as ■′ and ■′ in FIG. 4 are of course excluded, because they lie farther from the center line C-C than the singular line segments ■ and ■.

(c) Next, the singular line segment end point detection unit 15 detects, as singular line segment end points, the points at which the singular line segments remaining after the processing of FIG. 4 contact the character pattern, for example E1 and E2 in FIG. 5.

(d) The single-character separation processing unit 16 then separates the character pattern P along the straight line through E1, the end point closer to the center line C-C of the two singular line segment end points E1 and E2.

Each character pattern separated in this way can then be recognized by comparing it with dictionary patterns by a well-known method.

The above description used alphanumeric characters as an example, but the present invention is of course not limited to them. Any character that is not itself split in the vertical direction can be separated; alphanumeric characters have no such split within a single character, so they can be separated effectively.

Nor is the number of joined characters limited to two. For example, when the aspect ratio of the character pattern indicates that three characters are joined, center lines may be drawn at the positions that divide the area into three equal parts, and the threshold ranges to the left and right of each line processed in the same way.
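The extension to more than two joined characters amounts to placing one center line at each equal division of the width. A minimal sketch follows, assuming the block is XW pixels wide and has been judged to hold `n` joined characters; the function name `center_lines` is hypothetical.

```python
def center_lines(xw, n):
    """Column positions that divide a block XW wide into n equal parts."""
    if n < 2:
        raise ValueError("n must be at least 2")
    # n joined characters need n - 1 cuts, one around each center line.
    return [xw * k / n for k in range(1, n)]
```

For a block 60 pixels wide judged to hold three characters, the center lines fall at columns 20 and 40, and the α band is processed around each of them in turn.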

[Effect of the invention]

According to the present invention, joined handwritten character patterns can be separated accurately by relatively simple means.

[Brief explanation of the drawing]

FIG. 1 is an explanatory diagram of character patterns, FIG. 2 is an explanatory diagram of the separation of a joined character pattern, FIG. 3 is a configuration diagram of an embodiment of the present invention, and FIGS. 4 and 5 are explanatory diagrams of its operation.

In the figures, 10 is an image memory, 11 is a rectangular area buffer memory, 12 is a threshold holding register, 13 is an address control unit, 14 is a singular line segment extraction unit, 15 is a singular line segment end point detection unit, and 16 is a single-character separation processing unit.

Patent applicant: Fujitsu Ltd.
Agent: Patent Attorney Akira Yamatani

Claims

[Scope of Claims]

1. A character separation method comprising: handwritten character holding means for holding joined handwritten characters; singular line segment extraction means for extracting singular line segments by scanning within a fixed range from the estimated character center position of the joined handwritten character area; singular line segment end point detection means for detecting the singular line segment end point at which a singular line segment contacts the character pattern; and a character separation processing unit that separates the character pattern at the singular line segment end point; wherein the character pattern of one character is separated from the joined handwritten character pattern.

2. The character separation method according to claim 1, wherein the character pattern is an alphanumeric character.