JPS60160481A - Reader of character - Google Patents

Reader of character

Info

Publication number
JPS60160481A
Authority
JP
Japan
Prior art keywords
character
recognition
recognition means
characters
read
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP59017551A
Other languages
Japanese (ja)
Inventor
Fumio Yoda
依田 文夫
Masataka Yamamoto
山本 勝敬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Computer Basic Technology Research Association Corp
Original Assignee
Computer Basic Technology Research Association Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Computer Basic Technology Research Association Corp
Priority to JP59017551A
Publication of JPS60160481A

Landscapes

  • Character Discrimination (AREA)

Abstract

PURPOSE: To allow each character to be recognized by the optimum recognizing means, by outputting the recognition result of the second recognizing means whenever the character recognized by the first recognizing means is included among the characters that the second recognizing means is designed to read. CONSTITUTION: The recognition results of the first recognizing means 5 and of the second recognizing means 6 for an input character string 8 are sequentially transmitted to an editing means 7. If the result of the first recognizing means 5 is not a recognition target character of the second recognizing means 6, that is, if it is a kanji (Chinese character), that result is output as the reading result through a control means 4. If it is a recognition target character of the second recognizing means 6, that is, a hiragana (cursive Japanese syllabary character), the result of the first recognizing means 5 is replaced with that of the second recognizing means 6, which is suited to hiragana, and the replaced result is output as the reading result through the control means 4.

Description

[Detailed Description of the Invention] [Technical Field of the Invention] The present invention relates to a character reading device that reads character data in which characters of several different character sets are mixed, for example a character reading device that recognizes and reads Japanese text composed of kanji, hiragana, numerals, and the like.

[Prior Art] A conventional device of this type was configured as shown in FIG. 1. FIG. 1 is a block diagram of a conventional character reading device. In the figure, (1) is the paper from which characters are read, (2) is a scanning means that reads the characters by photoelectric conversion, (3) is a recognition means that recognizes the photoelectrically converted character pattern, and (4) is a control means that outputs the recognition result. This type of character reading device operates as follows. Characters written on the paper (1) are read by the scanning means (2) and recognized by the single recognition means (3), and the recognition result is output via the control means (4). A conventional device of this type achieves a sufficiently high reading rate when it reads only one character set, such as numerals. However, when a single recognition means (3) must read character data that mixes character sets of different structure, such as kanji, which have a complex structure and consist mainly of combinations of straight lines, and hiragana, which have a simple structure and consist mainly of curves, the processing becomes extremely complicated, and a device of this type built to handle such data becomes expensive.

Another conventional device, proposed to overcome this drawback, recognizes characters of several character sets by providing several recognition means, one per character set, each configured to suit recognition of its own set. For each character, it selects the one optimal recognition means from among them on the basis of features of the character pattern, and performs recognition with the selected means.

A device that selects a recognition means according to the character set in this way has the advantage of a high reading rate, provided that the recognition means best suited to each character is the one selected. With handwritten characters, however, which vary widely with the habits of the writer, it is difficult to select the best recognition means for an input character with high accuracy. The device therefore had the drawback that the overall reading rate does not rise very far, despite the complexity of its configuration and processing.

[Summary of the Invention] The present invention has been made to eliminate the drawbacks of the conventional devices described above. It provides a first recognition means and a second recognition means suited to recognizing a character set that the first recognition means has difficulty reading, and is characterized in that the recognition result of the first recognition means is corrected using the recognition result of the second recognition means. Its object is to provide a character reading device that achieves a high reading rate with recognition means of simple configuration.

[Embodiments of the Invention] The present invention will now be described in detail with reference to the embodiment shown in FIG. 2.

FIG. 2 is a block diagram of an embodiment of the character reading device of the present invention. In the figure, reference numerals and symbols identical to those of the conventional example denote the same or corresponding parts. In addition, (5) is a first recognition means that targets all characters: it recognizes the characters of one particular character set especially accurately, but can also recognize characters of the other character sets. (6) is a second recognition means that accurately recognizes the characters of one character set only, and (7) is an editing means that edits the recognition results of the first recognition means (5) and the second recognition means (6).

Next, the operation of FIG. 2 will be described. First, the characters written on the paper (1) are read by the scanning means (2) by photoelectric conversion or the like. The photoelectrically converted character pattern is processed in parallel by the first recognition means (5), which recognizes one particular character set especially accurately but can also recognize characters of other character sets, and by the second recognition means (6), which accurately recognizes the characters of one character set only. The recognition results of the recognition means (5) and (6) are each sent to the editing means (7). The editing means (7) normally sends the recognition result of the first recognition means (5) to the control means (4). However, when the recognition result of the first recognition means (5) is a character belonging to the character set that the second recognition means (6) is designed to read, the editing means (7) sends the recognition result obtained by the second recognition means (6) to the control means (4) instead. The control means (4) outputs the editing result of the editing means (7) as the reading result.
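The data flow just described can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the device's actual implementation: the names `edit`, `read_character`, and `HIRAGANA` are hypothetical, the recognizers are plain callables that return a character or `None` for a rejection, and the fallback to the first result when the second recognizer rejects is an assumption, since the description does not cover that case.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional

# Hypothetical target set of the second recognition means: hiragana letters.
HIRAGANA = frozenset(chr(c) for c in range(0x3041, 0x3097))

# A recognizer maps one character pattern to a character, or None on rejection.
Recognizer = Callable[[object], Optional[str]]


def edit(first_result: Optional[str],
         second_result: Optional[str],
         second_targets: frozenset) -> Optional[str]:
    """Editing means (7): normally pass on the first result, but switch to the
    second result when the first result lies in the second means' target set."""
    if first_result in second_targets:
        # Assumption (not in the source): keep the first result on rejection.
        return second_result if second_result is not None else first_result
    return first_result


def read_character(pattern: object,
                   first: Recognizer,
                   second: Recognizer,
                   second_targets: frozenset = HIRAGANA) -> Optional[str]:
    """Run both recognition means on the same pattern in parallel, then edit."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_future = pool.submit(first, pattern)
        second_future = pool.submit(second, pattern)
        return edit(first_future.result(), second_future.result(), second_targets)
```

Note that no recognizer is chosen before recognition: both always run, and the choice between their outputs is made afterwards from the first recognizer's result.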

The operation of the embodiment of the present invention is explained in more detail below, taking the reading of text that mixes kanji and kana as an example.

FIG. 3 is an example of input characters written on the paper (1); it shows the case in which the character string 「雨の降る日」 ("a rainy day") (8) is written. In general, kanji consist mainly of combinations of straight lines, whereas hiragana consist mainly of curves. The first recognition means (5) therefore uses a conventional character recognition technique suited mainly to kanji, such as a method that examines the straight-line portions of the character pattern. The second recognition means (6) uses a character recognition technique suited to hiragana and similar characters with many curved portions, such as a method that examines the contour of the character pattern, and is made to recognize only the hiragana character set.

FIG. 4 shows an example of the results obtained when the first recognition means (5) recognizes the input characters 「雨」 (9), 「の」 (10), 「降」 (11), 「る」 (12), and 「日」 (13); here the recognition results are 「雨」 (14), 「つ」 (15), 「降」 (16), 「ろ」 (17), and 「日」 (18). FIG. 5 shows the results obtained when the second recognition means (6) recognizes the same character string (8): the input characters 「雨」 (9) and 「降」 (11) are rejected, and 「の」 (10), 「る」 (12), and 「日」 (13) are recognized as 「の」 (19), 「る」 (20), and 「は」 (21), respectively. The symbol 「◆」 (22) denotes a rejection.

The recognition results of the first recognition means (5) and of the second recognition means (6) for the input character string (8) are sent in sequence to the editing means (7). When the recognition result of the first recognition means (5) is not a recognition target character of the second recognition means (6), that is, when it is a kanji, it has been recognized with high accuracy, so the editing means (7) outputs it as the reading result via the control means (4). When the recognition result of the first recognition means (5) is a recognition target character of the second recognition means (6), that is, when it is a hiragana, the editing means (7) replaces it with the recognition result of the second recognition means (6), which is suited to hiragana, and outputs that as the reading result via the control means (4).

For example, for the input characters 「雨」 (9), 「降」 (11), and 「日」 (13) shown in FIG. 3, the recognition results of the first recognition means (5) are the kanji 「雨」 (14), 「降」 (16), and 「日」 (18), each read with high accuracy, so the editing means (7) sends them unchanged to the control means (4). For the input characters 「の」 (10) and 「る」 (12), however, the first recognition means (5) has difficulty reading hiragana accurately and returns 「つ」 (15) and 「ろ」 (17). Because these results are hiragana, they are replaced with the results 「の」 (19) and 「る」 (20) of the second recognition means (6), which is suited to hiragana, and sent to the control means (4). The output is thus corrected as shown in FIG. 6. FIG. 6 shows the character string (23) sent to the control means (4): part of the recognition result of the first recognition means (5) has been corrected, and the correct reading result 「雨の降る日」 is sent to the control means (4).
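The worked example of FIGS. 3 to 6 can be traced with a few lines of Python. The recognition results below are copied from the description; the names `edit_string`, `HIRAGANA`, and `REJECT` are hypothetical, and keeping the first result when the second recognizer rejects is an assumption that does not arise in this example.

```python
# Hypothetical target set of the second recognition means: hiragana letters.
HIRAGANA = frozenset(chr(c) for c in range(0x3041, 0x3097))
REJECT = None  # stands in for the rejection symbol (22)

input_string = "雨の降る日"                          # FIG. 3, string (8)
first_result = ["雨", "つ", "降", "ろ", "日"]          # FIG. 4, results (14)-(18)
second_result = [REJECT, "の", REJECT, "る", "は"]     # FIG. 5, results (19)-(21)


def edit_string(first, second, targets=HIRAGANA):
    """Apply the editing rule character by character and join the output."""
    out = []
    for a, b in zip(first, second):
        # Use the second result only when the first result is a target
        # character of the second recognizer (assumption: and b is not rejected).
        use_second = a in targets and b is not REJECT
        out.append(b if use_second else a)
    return "".join(out)


edited = edit_string(first_result, second_result)  # FIG. 6, string (23)
print(edited)  # 雨の降る日
```

The last character shows why the decision is keyed on the first recognizer's result: the second recognizer misreads 「日」 as 「は」, but because the first result 「日」 is a kanji, the second result is never consulted.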

Note that the above embodiment describes the case in which only the recognition result of the first recognition means is corrected and edited using the recognition result of the second recognition means. The present invention is not limited to this: the recognition result of the first recognition means together with its candidate characters may be edited using the recognition result of the second recognition means together with its candidate characters. Likewise, although the case of a single second recognition means has been described, the invention is not limited to this; for example, a plurality of second recognition means may be provided, one for each different character set. Furthermore, although the case in which the character data is text mixing kanji and kana has been shown, the invention can also be applied to reading other character data in which alphanumeric characters, symbols, and the like are mixed.
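The variation with several second recognition means might be sketched as follows. This is only one possible reading of the sentence above: the function name `edit_multi`, the `DIGITS` set, the assumption that the target sets are disjoint, and the fallback to the first result on rejection are all hypothetical and not specified in the source.

```python
from typing import FrozenSet, Optional, Sequence, Tuple

HIRAGANA = frozenset(chr(c) for c in range(0x3041, 0x3097))
DIGITS = frozenset("0123456789")

# One entry per second recognition means: (its target character set, its result).
Specialist = Tuple[FrozenSet[str], Optional[str]]


def edit_multi(first_result: str, specialists: Sequence[Specialist]) -> str:
    """Pick the specialist whose target set contains the first result, if any."""
    for targets, result in specialists:
        if first_result in targets:
            # Assumption: keep the first result if the specialist rejected.
            return result if result is not None else first_result
    # No specialist covers this character (e.g. a kanji): keep the first result.
    return first_result
```

For instance, `edit_multi("つ", [(HIRAGANA, "の"), (DIGITS, None)])` returns `"の"`, while `edit_multi("雨", [(HIRAGANA, None), (DIGITS, None)])` returns `"雨"` unchanged.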

[Effects of the Invention] As described above, according to the present invention, the editing means outputs the recognition result of the second recognition means when the character recognized by the first recognition means is included among the characters to be read by the second recognition means, and outputs the recognition result of the first recognition means when it is any other character. Characters are thus read in such a way that the recognition result obtained by the first recognition means is corrected by the recognition result obtained by the second recognition means. Consequently, the recognition means need not be made complicated, the classification errors that arise from selecting a recognition means for each character in advance do not occur, and each character is recognized by the recognition means best suited to it, so that a high overall reading rate is obtained.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing the configuration of a conventional character reading device. FIG. 2 is a block diagram showing the configuration of an embodiment of the present invention. FIG. 3 is an explanatory diagram showing an example of text mixing kanji and kana written on the paper. FIG. 4 is an explanatory diagram showing an example of recognition results. FIG. 5 is an explanatory diagram showing another example of recognition results. FIG. 6 is an explanatory diagram showing the character string sent to the control means. In the figures, (1) is the paper, (2) is the scanning means, (3) is the recognition means, (4) is the control means, (5) is the first recognition means, (6) is the second recognition means, and (7) is the editing means. Identical reference numerals and symbols in the figures denote the same or corresponding parts. Agent: 大君 地組, and two others.

Claims (4)

[Claims] (1) A character reading device that recognizes and reads characters, comprising: a scanning means that optically scans characters and photoelectrically converts them; a first recognition means that recognizes the photoelectrically converted character pattern, taking the entire character set of the characters to be read as its target; one or more second recognition means that operate in parallel with the first recognition means and are configured to recognize the character pattern, taking some of the character sets of the characters to be read as their target; and an editing means that edits the recognition results of the respective recognition means; the device being characterized in that the editing means outputs the recognition result of the second recognition means when the character recognized by the first recognition means is included among the characters to be read by the second recognition means, and outputs the recognition result of the first recognition means when it is any other character.
(2) The character reading device according to claim 1, wherein the first recognition means is capable of accurately recognizing kanji.
(3) The character reading device according to claim 1, wherein the second recognition means is capable of accurately recognizing hiragana.
(4) The character reading device according to claim 1, wherein a plurality of the second recognition means are provided, one for each different character set.
JP59017551A 1984-02-01 1984-02-01 Reader of character Pending JPS60160481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP59017551A JPS60160481A (en) 1984-02-01 1984-02-01 Reader of character

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP59017551A JPS60160481A (en) 1984-02-01 1984-02-01 Reader of character

Publications (1)

Publication Number Publication Date
JPS60160481A true JPS60160481A (en) 1985-08-22

Family

ID=11947048

Family Applications (1)

Application Number Title Priority Date Filing Date
JP59017551A Pending JPS60160481A (en) 1984-02-01 1984-02-01 Reader of character

Country Status (1)

Country Link
JP (1) JPS60160481A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63223890A (en) * 1987-03-12 1988-09-19 Toshiba Corp Drawing reader
JPH0289194A (en) * 1988-09-26 1990-03-29 Fujitsu Ltd Hand written character recognizing system

Similar Documents

Publication Publication Date Title
EP0343786A3 (en) Method and apparatus for reading and recording text in digital form
JPS60160481A (en) Reader of character
JP2740335B2 (en) Table reader with automatic cell attribute determination function
JPH0452510B2 (en)
Amin OCR of Arabic texts
JPS6336389A (en) Character reader
JPS57168382A (en) Optical character reader
JPS59158482A (en) Character recognizing device
JPH03225579A (en) Device for segmenting character pattern
JPS60110089A (en) Character recognizer
JPS6095689A (en) Optical character reader
JPS59148983A (en) Method for selecting "kanji" recognizing dictionary
JPS6139175A (en) Optical character reading device
JPS6227887A (en) Character type separating system
JPH01287789A (en) Mark sheet
JPS6160189A (en) Optical character reader
JPS63263588A (en) Character reader
JPS60254388A (en) Optical character reader
JPS5644968A (en) Kana (japanese syllabary)-kanji (chinese character) conversion system
JPH05282484A (en) Optical character reader
JPS6115288A (en) Optical character reader
JPS6160185A (en) Character recognizer
JPS6095688A (en) Character recognizing device
JPS61187086A (en) Optical character reader
JPH09237317A (en) General document reader