JP2634926B2 - Kana-Kanji conversion device

Info

Publication number: JP2634926B2
Application number: JP2103692A
Authority: JP
Inventors: 佳三斎藤
Original assignee: Consejo Superior de Investigaciones Cientificas CSIC
Current assignee: Consejo Superior de Investigaciones Cientificas CSIC
Priority date: 1990-04-19
Filing date: 1990-04-19
Publication date: 1997-07-30
Anticipated expiration: 2012-07-30
Also published as: JPH041848A

Description

Detailed Description of the Invention

(A) Field of Industrial Application

This invention relates to a kana-kanji conversion device that is used mainly in information processing apparatus equipped with a kana-kanji conversion dictionary, such as Japanese word processors, and that converts an input kana character string into mixed kanji-kana text.

(B) Prior Art

In Japanese, some words admit several accepted written forms depending on how okurigana (the kana that follow a kanji stem) are attached. For example, the word 「おぼえがき」 (oboegaki, "memorandum") has the okurigana variants 「覚え書」, 「覚え書き」, 「覚書」, and 「覚書き」. Its notation can also vary in other ways, as in 「おぼえがき」, 「オボエガキ」, 「おぼえ書き」, and 「覚えがき」.

In conventional kana-kanji conversion devices, one of the following methods is used to enter a word that has many okurigana variants in the desired form:

(1) Every okurigana variant of the word is registered in the kana-kanji conversion dictionary, and the user selects the desired variant from among them.

(2) Only one okurigana form per word is registered in the kana-kanji conversion dictionary, and after conversion the user corrects the result to the desired okurigana.

(3) Okurigana are deleted algorithmically. That is, only the form with all okurigana written out is registered in the dictionary and used as the first conversion candidate, and okurigana are then deleted from it step by step. For example, when there are four okurigana candidates, 「覚え書」, 「覚え書き」, 「覚書」, and 「覚書き」, as for 「おぼえがき」 above, the form with the most okurigana characters becomes the first candidate, and okurigana are deleted automatically in the order first candidate (覚え書き) → second candidate (覚え書 or 覚書き) → third candidate (覚書).

However, registering every okurigana variant of a word in the kana-kanji conversion dictionary, as in the first method above, reduces the number of words that can be registered within a limited dictionary capacity.

Likewise, when only one okurigana form is registered per word, as in the second method above, the okurigana must be corrected (added or deleted) every time the word is retrieved. Because the correction is made after the conversion has been confirmed, the learning function does not take effect.

Furthermore, when okurigana are deleted algorithmically, as in the third method above, it is difficult to handle notations in the direction of adding okurigana. In the case of 「おぼえがき」, for example, if 「覚書」 is made the first candidate, it is difficult to make 「覚え書き」 the next candidate, since that requires adding okurigana.

To solve these problems, the present inventors previously filed an application for a kana-kanji conversion device that stores okurigana patterns in a table separate from the kana-kanji conversion dictionary and varies the okurigana on the basis of that table, eliminating the need for correction operations such as adding or deleting okurigana (Japanese Patent Application No. Hei 1-133737).

(C) Problems to Be Solved by the Invention

With that kana-kanji conversion device, however, a word such as 「おぼえがき」 could be converted to 「覚え書」, 「覚え書き」, 「覚書」, 「覚書き」, 「おぼえがき」, 「オボエガキ」, and so on, but not to 「おぼえ書き」 or 「覚えがき」. In other words, it could not produce forms in which only some of the kanji in a multi-kanji word are written in hiragana, such as 「しゃくし定規」 for 「杓子定規」, or 「しぼり込む」 or 「絞りこむ」 for 「絞り込む」.

This invention was made in view of these circumstances. For words whose okurigana can be written in several ways, the okurigana patterns are stored in a table separate from the kana-kanji conversion dictionary. For words that contain several kanji and have forms in which some of those kanji are replaced with hiragana, the kanji/hiragana notation patterns are likewise stored in a table separate from the dictionary. When such a word is retrieved, its okurigana and notation are varied on the basis of these tables. The invention thus provides a kana-kanji conversion device that eliminates the conventional operations for correcting okurigana and notation.

(D) Means for Solving the Problems

FIG. 1 is a block diagram showing the configuration of this invention.

As shown in the figure, this invention is a kana-kanji conversion device comprising: dictionary means 101, which stores a large number of words containing kanji together with their reading information, and stores words whose notation varies with an identification code 102 attached; okurigana storage means 103, which stores, as a table, the identification code 102 and a plurality of bit patterns, each indicating the presence or absence of okurigana at each character position; kanji/hiragana notation storage means 104, which stores, as a table, a plurality of bit patterns, each indicating kanji or hiragana notation at each character position; reading means 106, which reads from the dictionary means 101 the word corresponding to reading information entered from input means 105; conversion means 107, which, when the word read by the reading means 106 has the identification code 102 attached, successively converts the notation of the word into the several okurigana forms on the basis of the table stored in the okurigana storage means 103 and, in accordance with the identification code, converts some or all of the kanji in the word into hiragana on the basis of the table stored in the kanji/hiragana notation storage means 104; and display means 108, which displays the word whose notation has been converted by the conversion means 107.

The dictionary means 101 of this invention may be anything capable of storing a large number of words containing kanji together with their reading information and of storing words whose notation varies with an identification code attached. A ROM, or an external storage medium such as a floppy disk device or a magnetic disk device, is used.

Various codes, such as identifiable numbers or symbols, may be used as the identification code 102.

The okurigana storage means 103 and the kanji/hiragana notation storage means 104 may be anything capable of storing, as a table, a plurality of bit patterns representing the notation at each character position. As with the dictionary means 101, a ROM or an external storage medium such as a floppy disk device or a magnetic disk device is used.

The input means 105 may be anything capable of entering kana, or hiragana character strings that serve as the reading information for kanji. For example, a keyboard, a tablet device, an OCR device, or a magnetic tape device is used.

The reading means 106 may be anything capable of reading from the dictionary means 101 the word corresponding to the reading information entered from the input means 105. The conversion means 107 may be anything that, when the word read by the reading means 106 has the identification code 102 attached, can successively convert the notation of the word into the several okurigana forms on the basis of the table stored in the okurigana storage means 103 and, in accordance with the identification code, can convert some or all of the kanji in the word into hiragana on the basis of the table stored in the kanji/hiragana notation storage means 104. In general, a microprocessor is convenient for both the reading means 106 and the conversion means 107.

The display means 108 may be anything capable of displaying the word converted by the conversion means 107. A printer, a display device, a magnetic tape device, a magnetic disk device, or the like may be used, but a display device is preferable because it allows the result of processing to be checked quickly.

(E) Operation

As shown in FIG. 1, according to this invention, when reading information is entered from the input means 105, the reading means 106 reads the word corresponding to that reading information from the dictionary means 101.

If the word being read has the identification code 102 attached, the conversion means 107 successively converts the notation of the word into the several okurigana forms on the basis of the table stored in the okurigana storage means 103 and, in accordance with the identification code, converts some or all of the kanji in the word into hiragana on the basis of the table stored in the kanji/hiragana notation storage means 104. The result is displayed by the display means 108.

Thus, after the okurigana of a word have been varied on the basis of the preset table, some or all of the kanji in the word are converted to hiragana. The conventional operations for correcting okurigana and notation therefore become unnecessary, and because words that differ only in okurigana or notation pattern need only a single dictionary entry, dictionary capacity can be saved.

(F) Embodiment

The invention is described in detail below on the basis of the embodiment shown in the drawings. The invention is not limited to this embodiment.

FIG. 2 is a block diagram showing the configuration of one embodiment of this invention.

In this figure, 1 is a keyboard provided with kana keys, function keys, a kana-kanji conversion key for instructing kana-kanji conversion, an execute key for confirming input, and so on. It inputs the reading information of a word, in kana, to the control unit 2.

The control unit 2 consists of a microprocessor and performs the various data processing described later in accordance with a control program written in the program memory 3, which is a ROM.

4 is an input buffer that stores the kana reading information of words entered from the keyboard 1.

5 is a kana-kanji conversion dictionary held in ROM. It comprises an independent-word dictionary 6, which stores a large number of words containing kanji together with their reading information; a part-of-speech table 7, which stores part-of-speech information; an okurigana table 8, which stores okurigana information; and a kanji/hiragana notation table 9, which stores information for writing the kanji in a word in hiragana.

Among the words stored in the independent-word dictionary 6, words that have several okurigana forms (hereinafter, words with okurigana variation) and words that have several kanji/hiragana notations (hereinafter, words with notation variation) are given a notation pattern number, as described later.

The okurigana table 8 stores, as a table, the okurigana patterns for the words in the independent-word dictionary 6 that have a notation pattern number. The kanji/hiragana notation table 9 stores, as a table, the kanji/hiragana notation patterns used to convert some or all of the kanji in those words into hiragana.

10 is a display device such as a CRT display, an LC (liquid crystal) display, or an EL display.

The control unit 2 stores the reading information of a word entered from the keyboard 1, that is, a kana character string, in the input buffer 4. When kana-kanji conversion is instructed for that kana character string, the control unit reads the word corresponding to the string from the independent-word dictionary 6.

If the word read has a notation pattern number, the control unit successively converts the notation of the word into the several okurigana forms on the basis of the okurigana pattern located at the pointer indicated by that notation pattern number in the okurigana table 8. It also converts some or all of the kanji in the word into hiragana on the basis of the notation pattern located at the pointer indicated by the kanji/hiragana notation number in the kanji/hiragana notation table 9, and displays the result on the display device 10.

FIG. 3 is an explanatory diagram showing the storage format of the independent-word dictionary 6.

As shown in the figure, each word in the independent-word dictionary 6 has the fields word length 10, duplication 11, reading 12, part-of-speech number 13, and notation 14. Only words with okurigana or notation variation additionally have a notation pattern number 15 field.

Word length 10 stores the record length of the word. Duplication 11 stores information indicating that only the part that differs from an identical notation is stored, so as to avoid storing the overlapping part of words. Reading 12 stores the hiragana form that serves as the headword.

Part-of-speech number 13 is where part-of-speech information such as noun or verb is stored. The part-of-speech information for each word is held separately in the part-of-speech table 7, and part-of-speech number 13 stores a pointer (a numeric value) indicating which record of the part-of-speech table 7 holds the part-of-speech information for the word. Notation 14 stores the written form of the word, including kanji.

Word length 10, duplication 11, reading 12, part-of-speech number 13, and notation 14 each follow the structure of conventionally known independent-word dictionaries.

Like the part-of-speech number described above, notation pattern number 15 stores a pointer (a numeric value) indicating which record of the okurigana table 8 holds the okurigana patterns and kanji/hiragana notation patterns for the word.

Whether a word has a notation pattern number 15 is determined by whether its part-of-speech number 13 exceeds a certain value. That is, the part-of-speech information for words with variation in okurigana notation is registered together in the latter half of the part-of-speech table 7, so that checking part-of-speech number 13 is enough to determine whether notation pattern number 15 is attached to the word.
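The threshold check described above can be sketched as follows. This is only an illustration under stated assumptions: the `DictionaryEntry` class, its field names, the sample entries, and the threshold value 100 are all hypothetical. The description says only that words with variation are grouped in the latter half of the part-of-speech table.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical boundary: part-of-speech records above this index are assumed
# to be the ones reserved for words with okurigana or notation variation.
VARIATION_THRESHOLD = 100


@dataclass
class DictionaryEntry:
    reading: str                          # field 12: hiragana headword
    pos_number: int                       # field 13: pointer into part-of-speech table 7
    notation: str                         # field 14: written form including kanji
    pattern_number: Optional[int] = None  # field 15: present only for words with variation


def has_pattern_number(entry: DictionaryEntry) -> bool:
    """Return True when the part-of-speech number signals that field 15 is present."""
    return entry.pos_number > VARIATION_THRESHOLD


# Sample entries with made-up numbers, for illustration only.
plain = DictionaryEntry("やま", 12, "山")
varied = DictionaryEntry("おぼえがき", 140, "覚え書き", pattern_number=7)
```

Encoding the flag in the part-of-speech number avoids spending a separate bit per record on "has a pattern number", which matters when the dictionary lives in a small ROM.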

FIG. 4 is an explanatory diagram showing the storage format of the okurigana table 8.

As shown in the figure, the okurigana table 8 stores, in notation pattern number order, the okurigana patterns and the kanji/hiragana notation patterns as bit patterns, in sequence as first candidate, second candidate, third candidate, and so on. For notation pattern number n, for example, the first candidate is (00010000), the second candidate is (01010000), and the third candidate is (00000000).

In the okurigana table 8, the okurigana pattern portion is made to correspond to the kanji notation in units of one byte. For example, for the word 「おぼえがき」, which has the okurigana patterns of notation pattern number n in FIG. 4, the reading character data is (覚 = 2) (え = 1) (書 = 1) (き = 1) = 2111. One bit is therefore assigned, from the head of the byte, to each of these four notation characters, as shown in FIG. 5. Then, as shown in FIG. 6, the notation characters whose bit is "1 (on)" are deleted.
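The deletion rule can be illustrated with a short sketch. The function name `apply_okurigana_pattern` is hypothetical, and the sketch assumes that the first notation character maps to the most significant bit of the byte, which is how the three example patterns for notation pattern number n read.

```python
def apply_okurigana_pattern(notation: str, pattern: int) -> str:
    """Delete each notation character whose bit is 1, reading bits from the MSB.

    Assumes the notation has at most 8 characters, so one byte covers it.
    """
    if len(notation) > 8:
        raise ValueError("a one-byte pattern covers at most 8 notation characters")
    kept = []
    for index, char in enumerate(notation):
        delete = (pattern >> (7 - index)) & 1
        if not delete:
            kept.append(char)
    return "".join(kept)


for bits in (0b00010000, 0b01010000, 0b00000000):
    print(f"{bits:08b} -> {apply_okurigana_pattern('覚え書き', bits)}")
# 00010000 -> 覚え書
# 01010000 -> 覚書
# 00000000 -> 覚え書き
```

Because each candidate is an independent mask over the fully written form, the candidate order is free: a form with more okurigana can follow one with fewer, which a delete-only algorithm cannot do.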

With this method of processing, if the okurigana pattern selected once for a word is stored in a learning buffer, then from the second time onward the desired okurigana form of the word can be obtained with a single conversion.

FIG. 4 is the result of tabulating the okurigana patterns and kanji/hiragana notation patterns in this way as one-byte bit-pattern information.

FIG. 7 is an explanatory diagram showing the contents of the okurigana table 8 shown in FIG. 4.

As shown in the figure, for the 「おぼえがき」 example above, which has the okurigana patterns indicated by notation pattern number n in FIG. 4, conversion proceeds in the order first candidate (覚え書) → second candidate (覚書) → third candidate (覚え書き) → fourth candidate (覚書き) → fifth candidate (おぼえがき) → sixth candidate (オボエガキ).

Thus, for 「おぼえがき」, for example, the okurigana bit pattern is registered by encoding the first four bits as "1 (on)" or "0 (off)". When 「おぼえがき」 is entered and converted, each of the first four notation characters whose corresponding bit in the okurigana bit pattern is "1 (on)" is deleted.

For the fifth and sixth candidates, use is made of the fact that the first character of a word is normally never deleted: if the leading bit is "1 (on)", that is, a delete bit, the entire reading of the word is written in hiragana or in katakana.

For the seventh and subsequent candidates, when the leading bit is "1 (on)", six bits remain that can express something other than hiragana or katakana, as marked with × in the figure. These are used as follows: when the leading bit is "1 (on)" and any one of the six bits from the second bit to the seventh bit is "1 (on)", the value is regarded as a kanji/hiragana notation number, and the kanji/hiragana notation table 9 shown in FIG. 8 is consulted.

That is, the kanji/hiragana notation number is a pointer into the kanji/hiragana notation table 9, and the table is consulted through this pointer.
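One way to read this byte layout is sketched below. The function name and the returned labels are hypothetical. The description states that the notation number is the byte with its first and last bits removed. How the all-hiragana and all-katakana candidates are told apart is not stated in the text, so using the last bit for that purpose is purely an assumption here.

```python
from typing import Tuple


def classify_candidate(byte: int) -> Tuple[str, int]:
    """Classify a one-byte candidate from the okurigana table.

    Returns a (kind, value) pair:
      ("okurigana", mask)        leading bit 0: per-character delete mask
      ("kanji_hiragana", number) leading bit 1 and bits 2..7 non-zero: table pointer
      ("kana_only", flag)        leading bit 1 and bits 2..7 zero: whole word in kana
    """
    if not byte & 0b10000000:
        return ("okurigana", byte)
    number = (byte >> 1) & 0b00111111  # drop the first and last bits
    if number:
        return ("kanji_hiragana", number)
    # Assumption: the last bit selects hiragana (0) or katakana (1).
    return ("kana_only", byte & 1)


print(classify_candidate(0b00010000))  # ('okurigana', 16)
print(classify_candidate(0b10000010))  # ('kanji_hiragana', 1)
print(classify_candidate(0b10000000))  # ('kana_only', 0)
```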

The kanji/hiragana notation table 9 is stored in an 8-bit pattern format, in order of kanji/hiragana notation number. For the word 「しぼりこむ」 (shiborikomu, "to narrow down"), for example, the notation registered in the independent-word dictionary 6 is 「絞り込む」, and its reading character data is (絞 = 2) (り = 1) (込 = 1) (む = 1) = 2111.

When only the stem is registered, the information for the final 「む」 may be absent.

Accordingly, when the character string 「しぼりこむ」 is converted, the okurigana table 8 shown in FIG. 9 is looked up from the notation pattern number 15 attached to the word in the independent-word dictionary 6, and conversion proceeds through the first candidate, second candidate, and so on. At the fourth candidate, the bit pattern of the fourth candidate, with its first bit and last bit removed, is reinterpreted as a binary number, and the kanji/hiragana notation table 9 shown in FIG. 10 is consulted. Here, the kanji/hiragana notation number of the fourth candidate in the okurigana table 8 is 1 in binary code, and the first notation pattern in the kanji/hiragana notation table 9 of FIG. 10 is "1000000", so the first kanji in the notation is changed to hiragana. Since the reading character data shows that the reading of the first kanji consists of the first and second reading characters, 「絞」 is changed to 「しぼ」, and the word is converted to 「しぼり込む」, as shown in FIG. 10.

Similarly, at the fifth candidate, the kanji/hiragana notation number is 2 in binary code, and the second notation pattern in the kanji/hiragana notation table 9 is "00100000", so 「込」 is changed to 「こ」 and the word is converted to 「絞りこむ」.
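The substitution step in these two examples can be sketched as follows. The function name is hypothetical. The sketch assumes that bits are read from the most significant bit, one per notation character, and that the first table pattern has only its leading bit set, as the text's explanation of the 「しぼり込む」 case implies. The pattern `0b11000000` used for 「杓子定規」 is an inferred illustration, not a value given in the description.

```python
from typing import Sequence


def apply_kanji_hiragana_pattern(
    notation: str, reading: str, reading_lengths: Sequence[int], pattern: int
) -> str:
    """Replace each notation character whose bit (MSB first) is 1 with its reading."""
    parts = []
    position = 0  # index into the reading string
    for index, (char, length) in enumerate(zip(notation, reading_lengths)):
        if (pattern >> (7 - index)) & 1:
            parts.append(reading[position:position + length])
        else:
            parts.append(char)
        position += length
    return "".join(parts)


print(apply_kanji_hiragana_pattern("絞り込む", "しぼりこむ", [2, 1, 1, 1], 0b10000000))
# しぼり込む
print(apply_kanji_hiragana_pattern("絞り込む", "しぼりこむ", [2, 1, 1, 1], 0b00100000))
# 絞りこむ
print(apply_kanji_hiragana_pattern("杓子定規", "しゃくしじょうぎ", [3, 1, 3, 1], 0b11000000))
# しゃくし定規
```

The reading character data (2111 or 3131) is what lets the routine cut the correct slice of the reading for each kanji, so the table itself only needs one bit per notation character.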

In the same way, for the word 「しゃくしじょうぎ」 (shakushijōgi), for example, the reading character data is (杓 = 3) (子 = 1) (定 = 3) (規 = 1) = 3131, so if the kanji/hiragana notation number is set to point to the n-th notation pattern of the kanji/hiragana notation table 9 shown in FIG. 10, the word can be converted to 「しゃくし定規」.

If there are no more than 255 kinds of okurigana and kanji/hiragana notation patterns, the notation pattern number is one byte of information and the bit-pattern information has a fixed length. If there are 256 or more kinds, the notation pattern number becomes two bytes of address information and the bit-pattern information becomes variable-length data whose length is an integer multiple of one byte.

The one-byte bit-pattern information shown in FIG. 7 is only an example. In the case of 「覚え書き」, for instance, the meanings of "1 (on)" and "0 (off)" may be reversed for the first four bits. FIG. 11 shows an example of that case.

Next, the processing performed by the control unit 2 is described with reference to the flowcharts shown in FIG. 12 and FIG. 13.

FIG. 12 is a flowchart showing the processing performed when kana-kanji conversion is carried out.

When kana-kanji conversion is carried out, a character string is first entered from the keyboard 1 (step 201) and stored in the input buffer 4. When kana-kanji conversion is then instructed (step 202), the control unit 2 determines whether there are homophones (step 203). If there are, the homophone candidates are extracted and displayed on the display device 10 (step 204), and a selection is made among them (step 205).

Next, part-of-speech number 13 in the independent-word dictionary 6 is looked up to determine whether the word has okurigana or notation variation (step 206). If it does, the okurigana and notation variants are extracted and displayed on the display device 10 (step 207), and a selection is made among them (step 208).

If another character string is entered, processing returns to step 201. If no character string is entered, the kana-kanji conversion process ends (step 209).

FIG. 13 is a detailed flowchart of the extraction of okurigana and notation variants shown in FIG. 12.

When okurigana and notation variants are extracted, it is first determined whether a notation pattern number is attached to the word in the independent-word dictionary 6 (step 301). If one is attached, the okurigana table 8 is consulted on the basis of that notation pattern number (step 302), and it is determined whether the one-byte okurigana and notation bit pattern is a kanji/hiragana notation number (step 303). If it is not, the information of the corresponding okurigana pattern is retrieved from the okurigana table 8 (step 304).

The okurigana form is then generated from the one-byte okurigana pattern information, which corresponds to the kanji notation, and output (step 305).

If it is a kanji/hiragana notation number, the information of the corresponding notation pattern is retrieved from the kanji/hiragana notation table 9 (step 306), and the kanji/hiragana notation is generated from the notation pattern information and the reading character data and output (step 307).
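Steps 302 to 307 can be put together in one compact sketch. All names are hypothetical, and several values are assumptions made for illustration rather than figures from the description: the bytes used for the fourth to sixth 「おぼえがき」 candidates, the use of the last bit to select katakana, reading the first table pattern as `0b10000000`, and the shortened candidate list for 「しぼりこむ」.

```python
from typing import Dict, List, Sequence


def to_katakana(hiragana: str) -> str:
    """Shift hiragana code points (U+3041..U+3096) into the katakana block."""
    return "".join(
        chr(ord(c) + 0x60) if "\u3041" <= c <= "\u3096" else c for c in hiragana
    )


def expand_candidates(
    notation: str,
    reading: str,
    reading_lengths: Sequence[int],
    candidate_bytes: Sequence[int],
    kanji_hiragana_table: Dict[int, int],
) -> List[str]:
    """Generate one written form per one-byte candidate (steps 302 to 307)."""
    results = []
    for byte in candidate_bytes:                    # step 302: walk the table record
        number = (byte >> 1) & 0x3F                 # byte without its first and last bits
        if not byte & 0x80:                         # step 303: plain okurigana pattern
            form = "".join(                         # steps 304-305: delete flagged characters
                ch for i, ch in enumerate(notation) if not (byte >> (7 - i)) & 1
            )
        elif number:                                # step 303: kanji/hiragana notation number
            pattern = kanji_hiragana_table[number]  # step 306: follow the pointer
            parts, pos = [], 0
            for i, (ch, n) in enumerate(zip(notation, reading_lengths)):
                parts.append(reading[pos:pos + n] if (pattern >> (7 - i)) & 1 else ch)
                pos += n
            form = "".join(parts)                   # step 307: mixed kanji/hiragana form
        else:                                       # whole word in kana (assumed bit use)
            form = to_katakana(reading) if byte & 1 else reading
        results.append(form)
    return results


oboegaki = expand_candidates(
    "覚え書き", "おぼえがき", [2, 1, 1, 1],
    [0b00010000, 0b01010000, 0b00000000, 0b01000000, 0b10000000, 0b10000001],
    {},
)
shiborikomu = expand_candidates(
    "絞り込む", "しぼりこむ", [2, 1, 1, 1],
    [0b00000000, 0b10000010, 0b10000100],
    {1: 0b10000000, 2: 0b00100000},
)
print(oboegaki)
print(shiborikomu)
```

Under these assumptions the first call yields the six forms listed for 「おぼえがき」 in FIG. 7's description, and the second yields 「絞り込む」, 「しぼり込む」, and 「絞りこむ」, all from a single dictionary entry per word.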

In this way, okurigana patterns and kanji/hiragana notation patterns are registered in advance in the okurigana table 8 and the kanji/hiragana notation table 9, respectively, only for the words in the independent-word dictionary 6 to which a notation pattern number has been attached, and the okurigana and notation of a word are varied on the basis of these tables. A word can thus be written in the desired form with a simple conversion operation, without increasing the number of entries registered in the independent-word dictionary 6.

（ト）発明の効果この発明によれば、送りがなや表記に複数種類のパタ
ーンを有する単語については、そのパターンを辞書手段
とは別にテーブルとして記憶しておき、その単語が読出
されるときには、そのテーブルに基づいて、その単語の
表記を複数種類のパターンに変換するようにしたので、
従来のような送りがなの追加や削除、又は漢字を平がな
に変換する等の表記の修正操作が不要となると共に、複
数種類の表記パターンを有する単語の登録が一単語です
むため、辞書容量の節約が可能となる。(G) Effects of the Invention According to the present invention, for a word whose okurigana or notation has a plurality of patterns, those patterns are stored as a table separate from the dictionary means, and when the word is read out, its notation is converted into the plurality of patterns on the basis of that table. This eliminates the conventional notation-correcting operations, such as adding or deleting okurigana or converting kanji to hiragana. In addition, because a word with a plurality of notation patterns needs only a single dictionary entry, dictionary capacity can be saved.

特に、表記に複数種類のパターンを有する単語につい
ては、単語の漢字／平がな表記パターンを辞書手段とは
別にテーブルとして記憶しているので、単語の複数漢字
の内の一部又は全部を平がなに変換するという表記の変
更を容易に行うことができる。例えば、「しぼりこむ」
の場合であれば、第１候補（絞り込む）→第２候補（し
ぼり込む）→第３候補（絞りこむ）→第４候補（しぼり
こむ）のように、複数漢字の内の一部又は全部を容易に
平がなに変換することができる。In particular, for a word whose notation has a plurality of patterns, the kanji/hiragana notation patterns of the word are stored as a table separate from the dictionary means, so the notation can easily be changed by converting some or all of the word's kanji to hiragana. For example, in the case of 「しぼりこむ」 (shiborikomu), some or all of the kanji can easily be converted to hiragana in the order first candidate (絞り込む) → second candidate (しぼり込む) → third candidate (絞りこむ) → fourth candidate (しぼりこむ).
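The four candidates for 「しぼりこむ」 can be reproduced with a small sketch. It assumes, purely for illustration, that the word is held as (surface, reading) segments and that bit k of the pattern (counting from the least significant bit) says whether the k-th kanji is written in kanji (1) or in hiragana (0). The names `SEGMENTS`, `NOTATION_PATTERNS`, and `apply_notation_pattern` are hypothetical; the actual table format is the one shown in the patent's figures.

```python
from typing import List, Tuple

# Each segment pairs the written form with its reading; kana segments
# have identical surface and reading (assumed representation).
SEGMENTS: List[Tuple[str, str]] = [
    ("絞", "しぼ"),
    ("り", "り"),
    ("込", "こ"),
    ("む", "む"),
]

# Assumed candidate order, one bit per kanji: bit 0 = 絞, bit 1 = 込.
NOTATION_PATTERNS = [0b11, 0b10, 0b01, 0b00]


def apply_notation_pattern(segments: List[Tuple[str, str]], pattern: int) -> str:
    """Write each kanji segment as kanji or hiragana according to its bit."""
    out = []
    kanji_index = 0
    for surface, reading in segments:
        if surface == reading:
            # Plain kana segment: always emitted unchanged.
            out.append(surface)
            continue
        use_kanji = bool(pattern & (1 << kanji_index))
        out.append(surface if use_kanji else reading)
        kanji_index += 1
    return "".join(out)


candidates = [apply_notation_pattern(SEGMENTS, p) for p in NOTATION_PATTERNS]
print(candidates)  # ['絞り込む', 'しぼり込む', '絞りこむ', 'しぼりこむ']
```

Because the candidate order is simply the order of the patterns in the list, reordering the candidates needs no change to the dictionary entry itself.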

また、送りがなについては、その表記の候補順位を、
例えば、「おぼえがき」の場合であれば、第１候補（覚
え書）→第２候補（覚書）→第３候補（覚え書き）→第
４候補（覚書き）のように、任意に設定することができ
るので、従来のアルゴリズムで送りがなを削除してゆく
ような、送りがなの削除処理のみではなく、単語ごとの
送りがなの追加、及び削除処理が可能となる。As for okurigana, the order of the candidate notations can be set arbitrarily; in the case of 「おぼえがき」 (oboegaki), for example, first candidate (覚え書) → second candidate (覚書) → third candidate (覚え書き) → fourth candidate (覚書き). Consequently, processing is not limited to deleting okurigana, as in conventional algorithms that successively delete okurigana; okurigana can be both added and deleted on a per-word basis.
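The 「おぼえがき」 ordering can be sketched in the same spirit. The sketch assumes that each bit of a pattern marks whether the okurigana at the corresponding character position of the fully written form 覚え書き is present (1) or omitted (0), with bit 0 for the first character; kanji positions ignore their bit. The names `apply_okurigana_pattern` and `OKURIGANA_PATTERNS` and the bit order are assumptions for the example, not the format defined in the patent's figures.

```python
def is_hiragana(ch: str) -> bool:
    """True for characters in the Unicode hiragana block."""
    return 0x3041 <= ord(ch) <= 0x309F


def apply_okurigana_pattern(full_form: str, pattern: int) -> str:
    """Keep every kanji; keep a kana only if its position bit is set."""
    out = []
    for pos, ch in enumerate(full_form):
        present = bool(pattern & (1 << pos))
        if not is_hiragana(ch) or present:
            out.append(ch)
    return "".join(out)


FULL_FORM = "覚え書き"  # positions: 0=覚, 1=え, 2=書, 3=き

# Assumed per-word candidate order; the list order is the display order.
OKURIGANA_PATTERNS = [
    0b0010,  # え only      -> 覚え書
    0b0000,  # no okurigana -> 覚書
    0b1010,  # え and き    -> 覚え書き
    0b1000,  # き only      -> 覚書き
]

candidates = [apply_okurigana_pattern(FULL_FORM, p) for p in OKURIGANA_PATTERNS]
print(candidates)  # ['覚え書', '覚書', '覚え書き', '覚書き']
```

Note that the second candidate has fewer okurigana than the first while the third has more, which is the per-word addition and deletion that a delete-only algorithm cannot express.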

[Brief description of the drawings]

第１図はこの発明の構成を示すブロック図、第２図はこ
の発明の一実施例を示す構成ブロック図、第３図は自立
語辞書の記憶フォーマットを示す説明図、第４図は送り
がなテーブルの記憶フォーマットを示す説明図、第５図
はビットの割り当て例を示す説明図、第６図は送りがな
のビットの削除例を示す説明図、第７図は送りがなテー
ブルの内容を示す説明図、第８図は漢字／平がな表記テ
ーブルの記憶フォーマットを示す説明図、第９図は送り
がなテーブルの他の内容例を示す説明図、第10図は漢字
／平がな表記テーブルの具体例を示す説明図、第11図は
ビットパターンが逆の場合を示す第７図相当図、第12図
及び第13図は実施例の動作を示すフローチャートであ
る。１……キーボード、２……制御部、３……プログラムメモリ、４……入力バッファ、５……かな漢字変換辞書、６……自立語辞書、７……品詞テーブル、８……送りがなテーブル、９……漢字／平がな表記テーブル、 10……表示装置。FIG. 1 is a block diagram showing the configuration of the present invention; FIG. 2 is a block diagram showing the configuration of an embodiment of the present invention; FIG. 3 is an explanatory diagram showing the storage format of the independent word dictionary; FIG. 4 is an explanatory diagram showing the storage format of the okurigana table; FIG. 5 is an explanatory diagram showing an example of bit allocation; FIG. 6 is an explanatory diagram showing an example of deleting okurigana bits; FIG. 7 is an explanatory diagram showing the contents of the okurigana table; FIG. 8 is an explanatory diagram showing the storage format of the kanji/hiragana notation table; FIG. 9 is an explanatory diagram showing another example of the contents of the okurigana table; FIG. 10 is an explanatory diagram showing a specific example of the kanji/hiragana notation table; FIG. 11 is a diagram corresponding to FIG. 7 for the case where the bit pattern is reversed; and FIGS. 12 and 13 are flowcharts showing the operation of the embodiment. 1: keyboard, 2: control unit, 3: program memory, 4: input buffer, 5: kana-kanji conversion dictionary, 6: independent word dictionary, 7: part-of-speech table, 8: okurigana table, 9: kanji/hiragana notation table, 10: display device.

Claims

(57) [Claims]

1. A kana-kanji conversion device comprising: dictionary means for storing a large number of words including kanji together with their reading information, a word having variation in its notation being stored with an identification code attached; okurigana storage means storing, as a table, a plurality of types of bit patterns each representing the presence or absence of okurigana at each character position; kanji/hiragana notation storage means storing, as a table, a plurality of types of bit patterns each representing the kanji/hiragana notation at each character position; reading means for reading, from the dictionary means, a word corresponding to reading information input from input means; conversion means for, when the identification code is attached to the word read by the reading means, in accordance with the identification code, either sequentially converting the notation of the word into the plurality of types of okurigana notation on the basis of the table stored in the okurigana storage means, or converting some or all of the plural kanji of the word into hiragana notation on the basis of the table stored in the kanji/hiragana notation storage means; and display means for displaying the converted word.