JPS581821B2

JPS581821B2 - Japanese data input device

Info

Publication number: JPS581821B2
Application number: JP53115853A
Authority: JP
Inventors: 原辰次; 高野陸男; 石川浩一郎; 白鳥嘉勇; 木村久正
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1978-09-22
Filing date: 1978-09-22
Publication date: 1983-01-13
Also published as: JPS5543621A

Description

[Detailed Description of the Invention] This invention relates to a Japanese data input device whose purpose is to allow Japanese data written in kanji, hiragana, katakana, and the like to be entered easily by using recognition of symbols, alphanumeric characters, and kana characters.

Conventional symbol and character recognition has taken two forms: offline recognition of handwritten or printed characters (recognition of characters that have already been written or printed) using an OCR (Optical Character Reader) or the like, and online recognition of handwritten characters using a digitizer or the like (recognition performed during writing that also takes stroke order into account).

For both recognition methods, the technology for recognizing symbols, alphanumeric characters, and katakana is almost complete, and the technology for recognizing hiragana has also matured to some extent.

However, recognition of kanji, the most important characters in Japanese writing, is harder in both hardware and software than recognition of symbols, alphanumeric characters, katakana, and hiragana, because (1) kanji are more complex than other characters, so high reading resolution is required, and (2) kanji comprise many character types, so high discrimination ability is required. As a result, not only can a high recognition rate not be obtained, but the reading and recognition equipment also becomes very expensive.

This invention was made to eliminate these drawbacks. It allows Japanese data including kanji to be entered easily using recognition of symbols, alphanumeric characters, and katakana characters only.

The invention is described in detail below.

In this specification, "kana" (仮名) refers to both hiragana and katakana. Where hiragana is meant, the term is written in hiragana script (かな or ひらがな); where katakana is meant, it is written in katakana script (カナ or カタカナ).

FIG. 1 is an information flow diagram showing, as a block diagram, the configuration of one embodiment of the invention. In the figure, 1 is Japanese data including kanji; 2 is a scanner section that reads Japanese written in katakana accompanied by identification symbols representing the features of kanji; 3 is a recognition section that recognizes the pattern information from scanner section 2; and 4 is a conversion section that converts the katakana-written Japanese recognized by recognition section 3 into mixed kanji-kana text.

FIG. 2a shows an example of a handwritten or printed Japanese sentence used in the invention, and FIG. 2b shows an example in which the Japanese sentence of FIG. 2a is expressed as symbol-annotated katakana text by means of the input keyboard section described later.

FIG. 3 shows the details of one example of the input keyboard section.

In FIG. 3, input keyboard section 1A consists of kana keyboard 1B and kanji-information-adding keyboard 1C. Kana keyboard 1B has katakana keys 1B1. Kanji-information-adding keyboard 1C has pattern keys 1C1 for adding supplementary information representing the features of kanji, a name (personal name and place name) key 1C2, a delimiter key 1C3, and a feed key 1C4.

Returning to FIG. 2b, 5 denotes a pattern representing the features of a kanji (hereinafter called a feature identification symbol) added by input keyboard section 1A of FIG. 3 described above, and 6 denotes katakana characters. The Japanese data is written with its words separated by feature identification symbols 5.

Japanese handwritten or printed in the form shown in FIG. 2b (hereinafter called symbol-annotated katakana text) is converted into pattern information by scanner section 2. Recognition section 3 then converts this into the katakana codes and feature identification symbols 5 corresponding to the symbol-annotated katakana text, and conversion section 4 further converts these, word by word, into the character (kanji) codes corresponding to mixed kanji-kana text.
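The word-by-word grouping performed between recognition section 3 and conversion section 4 can be sketched as follows. This is only an illustration under stated assumptions: the token representation, the function name `split_word_units`, and the symbol identifiers `P1` and `P2` are hypothetical and do not appear in the specification, and the actual patterns of FIG. 2b are not reproduced here. The sketch treats each odd-numbered feature identification symbol as opening a word and the following even-numbered symbol as closing it.

```python
from typing import List, Tuple

# A recognized token is either a feature identification symbol or one kana character.
# Hypothetical encoding: ("SYM", symbol_id) or ("KANA", character).
Token = Tuple[str, str]


def split_word_units(tokens: List[Token]) -> List[tuple]:
    """Group a recognized token stream into word units.

    Returns a list of ("word", open_symbol, kana, close_symbol) for kana
    enclosed by a symbol pair, and ("plain", kana) for kana outside any pair.
    """
    units: List[tuple] = []
    open_symbol = None  # set while we are inside an odd/even symbol pair
    buffer: List[str] = []

    for kind, value in tokens:
        if kind == "SYM":
            if open_symbol is None:
                # Odd-numbered symbol: flush plain kana, then open a word.
                if buffer:
                    units.append(("plain", "".join(buffer)))
                    buffer = []
                open_symbol = value
            else:
                # Even-numbered symbol: close the current word.
                units.append(("word", open_symbol, "".join(buffer), value))
                buffer = []
                open_symbol = None
        else:
            buffer.append(value)

    if open_symbol is not None:
        # An odd symbol without its even partner cannot form a word unit.
        raise ValueError("unpaired feature identification symbol")
    if buffer:
        units.append(("plain", "".join(buffer)))
    return units
```

For example, a stream consisting of symbol `P1`, the kana セ, ン, キ, ヨ, symbol `P2`, and the kana ノ would yield one word unit for センキヨ enclosed by `P1` and `P2`, followed by the plain kana ノ.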

For this purpose, entries are registered in advance in the conversion file in conversion section 4 so that each conversion is unambiguous.

Taking the word 選挙 (senkyo, "election") mentioned above as an example, the conversion file is scanned using the signal code for "■センキヨ■", where ■ stands for a feature identification symbol. If this matches the code registered in advance for 選挙, the kanji code corresponding to 選挙 is read out.
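The lookup described above can be illustrated with a minimal sketch. The names `CONVERSION_FILE` and `convert_word`, the symbol identifiers `P1` and `P2`, and the use of a Python dictionary and Unicode strings in place of the conversion file and kanji codes are all assumptions made for illustration. The specification does not state what happens when no entry matches; returning the kana unchanged is likewise an assumption.

```python
from typing import Dict, Tuple

# Hypothetical conversion file: the key is (opening symbol, kana reading,
# closing symbol) and the value stands in for the registered kanji code.
ConversionFile = Dict[Tuple[str, str, str], str]

CONVERSION_FILE: ConversionFile = {
    ("P1", "センキヨ", "P2"): "選挙",
}


def convert_word(open_symbol: str, kana: str, close_symbol: str,
                 table: ConversionFile = CONVERSION_FILE) -> str:
    """Return the registered kanji for a symbol-annotated kana word.

    If no entry matches, the kana is returned unchanged (an assumption;
    the specification does not describe this case).
    """
    return table.get((open_symbol, kana, close_symbol), kana)
```

Because the key combines the kana reading with both feature identification symbols, two words with the same reading but different symbols map to different entries, which is one way the conversion could be kept unambiguous.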

Next, various application examples of system configurations according to the invention are shown.

In FIG. 4, terminal device 10 contains each of the functions, namely scanner section 2, recognition section 3, conversion section 4, and transmission control section 7, and thereby sends character codes over communication network 11 to center 12, which has front-end processor 8 and central processing unit 9. Recognition and conversion into mixed kanji-kana Japanese text are carried out in this configuration.

FIGS. 5 to 7 show configurations in which pattern information is sent from terminal device 10 using facsimile codes and character codes are sent to center 12 via communication network 11, with recognition and conversion into mixed kanji-kana Japanese text performed within communication network 11 or at center 12.

FIGS. 8 and 9 show configurations in which feature identification symbols and kana character codes are sent to center 12 via communication network 11, with recognition and conversion into mixed kanji-kana Japanese text performed within communication network 11 or at center 12.

Thus the invention can be realized not only with local processing but also as systems of various configurations that use communication lines.

The invention applies to both online and offline recognition. The feature identification symbols are not limited to those described above; any symbols may be used as long as they can be distinguished from other symbols and characters.

Needless to say, hiragana may be used as well as katakana.

As explained above, the invention has a number of advantages, as follows.

■ Because the text is written with word separation using alphanumeric characters, kana, and a finite set of about ten simple symbols, the burden on the person entering the data is very small, whether the input is handwritten or typed.

■ Because only alphanumeric characters, kana, and about ten simple symbols need to be recognized, conventional recognition technology can be used as is and sophisticated kanji recognition is unnecessary. The scanner section and recognition section can therefore be made inexpensive, and the recognition rate improves.

■ Conversion into mixed kanji-kana text is performed word by word, that is, on each unit consisting of an odd-numbered feature identification symbol, the following even-numbered feature identification symbol, and the kana character string enclosed between them. Therefore, even if the recognition section does not achieve 100% recognition at the character level, correct conversion may still be possible at the word level, and there is no need to raise the recognition rate beyond what is necessary.

Thus, according to the invention, recognition and conversion are performed on symbol-annotated kana text, in which a small number of feature identification symbols are added to alphanumeric and kana characters. Japanese data can therefore be entered inexpensively and with a high recognition and conversion rate using conventional alphanumeric and katakana character recognition technology. In addition, because functions can easily be distributed, an efficient system configuration is possible.

[Brief Description of the Drawings]

FIG. 1 is an information flow diagram showing, as a block diagram, the configuration of one embodiment of the invention. FIG. 2a shows an example of a Japanese sentence used in the invention, and FIG. 2b shows an example in which the Japanese sentence of FIG. 2a is expressed as symbol-annotated katakana text. FIG. 3 shows the details of the input keyboard section. FIGS. 4 to 9 are block diagrams each showing a configuration example of a system according to the invention. In the figures, 1 is Japanese data, 2 is the scanner section, 3 is the recognition section, 4 is the conversion section, 5 is a feature identification symbol, and 6 is katakana characters.

Claims

[Claims]

1. A Japanese data input device comprising: a scanner section that optically reads a handwritten or printed Japanese kana data character string in which feature identification symbols representing the graphical features of a kanji are added before and after the kana giving the reading of a word written in kanji; a recognition section that identifies the odd-numbered and even-numbered feature identification symbols read by the scanner section and the kana character string enclosed between those feature identification symbols; and a conversion section having a conversion file listing kana character strings, feature identification symbols, and kanji codes; wherein the conversion file is searched using the information from the recognition section and, when a match is found, the kanji code is output from the conversion section, whereby the handwritten or printed Japanese data is converted into the corresponding kanji codes.