JPH0736878A - Homonym selecting device - Google Patents

Homonym selecting device

Info

Publication number
JPH0736878A
Authority
JP
Japan
Prior art keywords
kanji
conversion
kana
string
hiragana
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP5182770A
Other languages
Japanese (ja)
Inventor
Toru Ueda
徹 上田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sharp Corp
Original Assignee
Sharp Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sharp Corp
Priority to JP5182770A
Publication of JPH0736878A

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

PURPOSE: To provide a homonym selecting device capable of improving the conversion efficiency of kana-kanji conversion. CONSTITUTION: The device comprises kana-kanji conversion means (1 to 4) for converting an input hiragana string into a kana-kanji string that contains hiragana and kanji conversion candidates homophonous with the hiragana to be converted; storage means (5) for storing, out of the kana-kanji string produced by the conversion means (1 to 4), the kanji whose conversion has been confirmed; and selecting means (6, 7) for selecting the order of the kanji conversion candidates for the hiragana to be converted so that their meaning fits the context formed by the kanji stored in the storage means (5).

Description

Detailed Description of the Invention

[0001]

[Industrial Field of Application] The present invention relates to a homonym selecting device, and more particularly to a homonym selecting device capable of improving the conversion efficiency of kana-kanji conversion.

[0002]

[Prior Art] Conventionally, homonyms have been selected during kanji conversion in one of two ways: by using the candidate with the highest frequency recorded in advance in a dictionary, or by using a one-to-one dictionary, namely a manually created co-occurrence dictionary in which, if an independent word (content word) A has appeared, the candidate B that most frequently accompanies it is selected.

[0003]

[Problem to Be Solved by the Invention] The conventional method of selecting the most frequent candidate does not consider the surrounding context. It therefore cannot handle cases where the same reading "こうじ" (kōji) must be converted differently depending on context, as in "road construction" (道路工事) and "public notice of land prices" (地価の公示). When one-to-one correspondences are stored, it suffices to store the pairs "road" (道路) - "construction" (工事) and "land price" (地価) - "public notice" (公示). However, this method captures the relation to only a single independent word, and it cannot select a conversion candidate by taking several independent words into account.

[0004] An object of the present invention is to provide a homonym selecting device capable of improving the conversion efficiency of kana-kanji conversion.

[0005]

[Means for Solving the Problem] The device is characterized by comprising: kana-kanji conversion means for converting an input hiragana string into a kana-kanji string that contains hiragana and kanji conversion candidates homophonous with the hiragana to be converted; storage means for storing, out of the kana-kanji string converted by the kana-kanji conversion means, the kanji whose conversion has been confirmed; and selecting means for selecting the order of the kanji conversion candidates corresponding to the hiragana to be converted so that their meaning fits the context with respect to the kanji stored in the storage means.

[0006]

[Operation] The kana-kanji conversion means converts the input hiragana string into a kana-kanji string that contains hiragana and kanji conversion candidates homophonous with the hiragana to be converted. The storage means stores, out of the kana-kanji string converted by the kana-kanji conversion means, the kanji whose conversion has been confirmed. The selecting means selects the order of the kanji conversion candidates corresponding to the hiragana to be converted so that their meaning fits the context with respect to the kanji stored in the storage means. Consequently, kanji conversion candidates that are semantically similar to the words (sentences) already input are selected preferentially, and kanji conversion efficiency can be improved.

[0007]

[Embodiment] FIG. 1 is a block diagram of an embodiment of the homonym selecting device of the present invention. In FIG. 1, reference numeral 1 denotes an input buffer that stores a hiragana string input in advance; 2 denotes a kana-kanji conversion unit that converts the hiragana string stored in the input buffer 1 into mixed kana-kanji text, that is, a kana-kanji string containing hiragana and kanji conversion candidates homophonous with the hiragana to be converted, and outputs the converted string; 3 denotes a kana-kanji conversion dictionary used by the kana-kanji conversion unit 2 for conversion; 4 denotes a conversion candidate buffer that stores the kanji conversion candidates produced by the kana-kanji conversion unit 2; 5 denotes a confirmed text buffer that stores the kana-kanji string whose conversion result has been confirmed; 6 denotes a candidate reordering unit that obtains the semantic vector of the text input so far from the independent words of the kana-kanji string stored in the confirmed text buffer 5 and uses the result to change the ranking of the conversion candidates stored in the conversion candidate buffer 4; and 7 denotes a co-occurrence table that holds the semantic vectors described later.

[0008] The input buffer 1, the kana-kanji conversion unit 2, the kana-kanji conversion dictionary 3, and the conversion candidate buffer 4 constitute the kana-kanji conversion means; the confirmed text buffer 5 constitutes the storage means; and the candidate reordering unit 6 and the co-occurrence table 7 constitute the selecting means.
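As a rough illustration of how the seven components might fit together, the following Python sketch wires the buffers and the dictionary into a single object. The class name `HomonymSelector`, its method names, and the function `toy_rank` are hypothetical and do not appear in the patent; `toy_rank` is only a stand-in for the candidate reordering unit 6 and the co-occurrence table 7.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class HomonymSelector:
    # 3: kana-kanji conversion dictionary (reading -> candidates in default order)
    dictionary: Dict[str, List[str]]
    # 6 + 7: reorders candidates given the confirmed words (assumed interface)
    rank: Callable[[List[str], List[str]], List[str]]
    input_buffer: str = ""                                       # 1
    candidate_buffer: List[str] = field(default_factory=list)    # 4
    confirmed_buffer: List[str] = field(default_factory=list)    # 5

    def convert(self, reading: str) -> List[str]:
        """Unit 2: look up candidates, then let the selecting means reorder them."""
        self.input_buffer = reading
        candidates = self.dictionary.get(reading, [reading])
        self.candidate_buffer = self.rank(candidates, self.confirmed_buffer)
        return self.candidate_buffer

    def confirm(self, word: str) -> None:
        """Record a word whose conversion the user has confirmed."""
        self.confirmed_buffer.append(word)


def toy_rank(candidates: List[str], confirmed: List[str]) -> List[str]:
    # Hypothetical rule: move "抗議" (protest) first once "決定" (decision) is confirmed.
    if "決定" in confirmed and "抗議" in candidates:
        return ["抗議"] + [c for c in candidates if c != "抗議"]
    return list(candidates)


selector = HomonymSelector(dictionary={"こうぎ": ["講義", "抗議"]}, rank=toy_rank)
print(selector.convert("こうぎ"))  # ['講義', '抗議']
selector.confirm("決定")
print(selector.convert("こうぎ"))  # ['抗議', '講義']
```

Passing the ranking step in as a function keeps the buffers independent of any particular scoring rule, so an element comparison or a distance measure could be substituted without changing the rest of the sketch.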

[0009] Here, the semantic information carried by the words or sentences input so far is expressed as a vector (semantic vector), and the conversion result is determined (or the order of the conversion results is rearranged) according to that semantic information. As a result, conversion results that are semantically similar to the words (sentences) already input, that is, whose meaning fits the context, are given priority.

[0010] FIG. 2 shows the co-occurrence table, which indicates the order in which the meanings of kanji conversion candidates fit, that is, how readily each pair of words connects. FIGS. 3 and 4 illustrate examples of kanji conversion based on the connections between words.

[0011] In FIG. 2, the entry for "teacher" (先生) in the second row and "lecture" (講義) in the fourth column is 7, which indicates a stronger connection than the 5 for "teacher" in the second row and "protest" (抗議) in the fifth column. As an example, suppose the sentence "彼はその決定についてこうぎをおこなった" (roughly, "He made a kōgi about that decision") is input. The string "こうぎ" (kōgi) has two conversion results, "protest" (抗議) and "lecture" (講義). In this case, the independent words "he" (彼) and "decision" (決定), whose kanji conversion has been confirmed, have already been input, so the order of the kanji conversion candidates is determined from these words. First, the semantic vector of "he" in FIG. 2 (the elements along the "he" row of the co-occurrence table) is (8, 7, 2, 4, 3, 2), and likewise the semantic vector of "decision" (the elements along the "decision" row) is (2, 5, 7, 3, 7, 1). Averaging these two vectors gives the current semantic vector (5, 6, 4, 3, 5, 1): the mean of each pair of corresponding values is taken, (8 + 2) ÷ 2 = 5, (7 + 5) ÷ 2 = 6, and so on, with fractions discarded, so that, for example, (2 + 7) ÷ 2 = 4.5 becomes 4 (see FIG. 3). Among the elements of this vector, those corresponding to "lecture" and "protest" are 3 and 5, respectively (see FIG. 3). In other words, the semantic vector obtained from the input so far indicates that "protest" is more strongly connected to the current input "こうぎ" than "lecture" is, so "protest" is selected and output as the first-ranked conversion result.
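The arithmetic of this first example can be reproduced with a short Python sketch. The function names `context_vector` and `rank_by_element` are illustrative and not from the patent, and the column positions assume only what the text states: "講義" (lecture) is the fourth column and "抗議" (protest) the fifth. The remaining columns are left unnamed.

```python
# Rows of the co-occurrence table quoted in the text (six columns each).
COOCCURRENCE = {
    "彼": [8, 7, 2, 4, 3, 2],     # he
    "決定": [2, 5, 7, 3, 7, 1],   # decision
}
# Zero-based column indices of the two candidates, as stated in the text.
CANDIDATE_COLUMN = {"講義": 3, "抗議": 4}


def context_vector(confirmed_words, table):
    """Average the rows of the confirmed words element-wise, discarding fractions."""
    rows = [table[w] for w in confirmed_words if w in table]
    if not rows:
        return []
    return [sum(column) // len(rows) for column in zip(*rows)]


def rank_by_element(candidates, context, column_of):
    """Order candidates by the context-vector element in their column, largest first."""
    return sorted(candidates, key=lambda c: context[column_of[c]], reverse=True)


context = context_vector(["彼", "決定"], COOCCURRENCE)
print(context)                                                       # [5, 6, 4, 3, 5, 1]
print(rank_by_element(["講義", "抗議"], context, CANDIDATE_COLUMN))  # ['抗議', '講義']
```

Integer floor division matches the text's rule of discarding fractions because every table entry is non-negative.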

[0012] As another example, in "彼はその先生のこうぎをうけた" (roughly, "He received that teacher's kōgi"), the string "こうぎ" again has two conversion results, "protest" (抗議) and "lecture" (講義). As in the previous example, the semantic vector of "he" (彼) in FIG. 2 (the elements along the "he" row of the co-occurrence table) is (8, 7, 2, 4, 3, 2), and likewise the semantic vector of "teacher" (先生) (the elements along the "teacher" row) is (7, 9, 4, 7, 5, 1). Averaging these two vectors gives the current semantic vector (7, 8, 3, 5, 4, 1): the mean of each pair of corresponding values is taken, (8 + 7) ÷ 2 = 7, (7 + 9) ÷ 2 = 8, and so on, with fractions discarded (see FIG. 4). Among the elements of this vector, those corresponding to "lecture" and "protest" are 5 and 4, respectively (see FIG. 4). In other words, the semantic vector obtained from the input so far indicates that "lecture" is more strongly connected to the current input "こうぎ" than "protest" is, so "lecture" is output as the conversion result.

[0013] The co-occurrence table can be created by learning from a large amount of data. When it is created automatically from a large amount of data, no variation due to the individual creator is introduced and a high-quality co-occurrence table can be obtained; as a result, improved kana-kanji conversion efficiency can be expected.
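The patent does not say how the table is learned. One simple possibility, assumed here purely for illustration, is to count how often two words appear in the same sentence of a corpus that has already been segmented into words. The function name `build_cooccurrence_table` and the three-sentence toy corpus are hypothetical, and the raw counts it produces are not on the same scale as the values shown in FIG. 2.

```python
from collections import Counter
from itertools import product


def build_cooccurrence_table(sentences, vocabulary):
    """Count, for each ordered word pair, the sentences in which both words occur.

    Assumption: sentence-level co-occurrence is the learning signal; the patent
    only states that the table can be learned from a large amount of data.
    """
    counts = Counter()
    for words in sentences:
        seen = set(words)
        present = [w for w in vocabulary if w in seen]
        for a, b in product(present, repeat=2):
            counts[(a, b)] += 1
    # One row per word, with columns in vocabulary order.
    return {a: [counts[(a, b)] for b in vocabulary] for a in vocabulary}


# Toy corpus, invented for this sketch only.
corpus = [
    ["先生", "講義"],
    ["彼", "抗議", "決定"],
    ["先生", "講義", "彼"],
]
vocabulary = ["彼", "先生", "決定", "講義", "抗議"]
table = build_cooccurrence_table(corpus, vocabulary)
print(table["先生"])  # [1, 2, 0, 2, 0]
```

In this toy table, "先生" (teacher) co-occurs with "講義" (lecture) twice and with "抗議" (protest) never, which is the kind of asymmetry the ranking step relies on.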

[0014] In the first example, the kanji conversion candidates were reordered from the current semantic vector (5, 6, 4, 3, 5, 1) by comparing the magnitudes of its elements ("protest": 5 > "lecture": 3). Alternatively, the distance between the current semantic vector and the semantic vector of each candidate word may be used. For "protest" (抗議), the semantic vector of the word (the elements along the "protest" row of the co-occurrence table in FIG. 2) is (2, 4, 6, 2, 8, 1), and its Euclidean distance D from the current semantic vector, computed here as the sum of squared differences, is

$$D(\text{protest}) = (5-2)^2 + (6-4)^2 + (4-6)^2 + (3-2)^2 + (5-8)^2 + (1-1)^2 = 27$$

Similarly, for "lecture" (講義),

$$D(\text{lecture}) = (5-3)^2 + (6-5)^2 + (4-3)^2 + (3-9)^2 + (5-5)^2 + (1-3)^2 = 46$$

In this case it is appropriate to select kanji conversion candidates in order of increasing distance, so "protest", which has the smallest distance, is selected as the first-ranked candidate.
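The distance-based alternative can be checked with a minimal Python sketch that uses only the vectors quoted in this paragraph. The function names `squared_distance` and `rank_by_distance` are illustrative, not from the patent. As in the text, the distance is taken as the sum of squared differences, which yields the same ordering as its square root.

```python
CONTEXT = [5, 6, 4, 3, 5, 1]        # current semantic vector from the first example
CANDIDATE_VECTORS = {
    "抗議": [2, 4, 6, 2, 8, 1],      # protest
    "講義": [3, 5, 3, 9, 5, 3],      # lecture
}


def squared_distance(a, b):
    """Sum of squared element-wise differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank_by_distance(context, candidate_vectors):
    """Order candidate words by increasing distance from the context vector."""
    return sorted(
        candidate_vectors,
        key=lambda word: squared_distance(context, candidate_vectors[word]),
    )


for word in rank_by_distance(CONTEXT, CANDIDATE_VECTORS):
    print(word, squared_distance(CONTEXT, CANDIDATE_VECTORS[word]))
# 抗議 27
# 講義 46
```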

[0015]

[Effect of the Invention] The kana-kanji conversion means converts the input hiragana string into a kana-kanji string that contains hiragana and kanji conversion candidates homophonous with the hiragana to be converted. The storage means stores, out of the kana-kanji string converted by the kana-kanji conversion means, the kanji whose conversion has been confirmed. The selecting means selects the order of the kanji conversion candidates corresponding to the hiragana to be converted so that their meaning fits the context with respect to the kanji stored in the storage means. Consequently, kanji conversion candidates that are semantically similar to the words (sentences) already input are selected preferentially, and kanji conversion efficiency can be improved.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of an embodiment of the homonym selecting device of the present invention.

FIG. 2 is a diagram showing the co-occurrence table.

FIG. 3 is a diagram illustrating an example of kanji conversion based on the connections between words.

FIG. 4 is a diagram illustrating another example of kanji conversion based on the connections between words.

[Explanation of Reference Numerals]

1 Input buffer
2 Kana-kanji conversion unit
3 Kana-kanji conversion dictionary
4 Conversion candidate buffer
5 Confirmed text buffer
6 Candidate reordering unit
7 Co-occurrence table

Claims (2)

[Claims]

1. A homonym selecting device comprising: kana-kanji conversion means for converting an input hiragana string into a kana-kanji string that contains hiragana and kanji conversion candidates homophonous with the hiragana to be converted; storage means for storing, out of the kana-kanji string converted by the kana-kanji conversion means, the kanji whose conversion has been confirmed; and selecting means for selecting the order of the kanji conversion candidates corresponding to the hiragana to be converted so that their meaning fits the context with respect to the kanji stored in the storage means.

2. The homonym selecting device according to claim 1, wherein the selecting means includes a co-occurrence table that defines the order in which the meanings of kanji conversion candidates fit.
JP5182770A 1993-07-23 1993-07-23 Homonym selecting device Pending JPH0736878A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP5182770A JPH0736878A (en) 1993-07-23 1993-07-23 Homonym selecting device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP5182770A JPH0736878A (en) 1993-07-23 1993-07-23 Homonym selecting device

Publications (1)

Publication Number Publication Date
JPH0736878A (en) 1995-02-07

Family

ID=16124119

Family Applications (1)

Application Number Title Priority Date Filing Date
JP5182770A Pending JPH0736878A (en) 1993-07-23 1993-07-23 Homonym selecting device

Country Status (1)

Country Link
JP (1) JPH0736878A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09146961A (en) * 1995-11-24 1997-06-06 Nec Corp Standard document generation system
JP2003514304A (en) 1999-11-05 2003-04-15 マイクロソフト コーポレイション A linguistic input architecture that converts from one text format to another and is resistant to spelling, typing, and conversion errors
JP2019159826A (en) * 2018-03-13 2019-09-19 富士通株式会社 Display control program, display control device, and display control method

Similar Documents

Publication Publication Date Title
JPH0736878A (en) Homonym selecting device
KR960038586A (en) Complex language transfer data processing system and character generation data processing method
JPH0578058B2 (en)
JPS60247770A (en) Character processor
JPS63316162A (en) Document preparing device
JPS63106070A (en) Chinese sentence input system
JPS62209667A (en) Sentence producing device
JPH1063651A (en) Chinese language input device
JP3340124B2 (en) Kana-Kanji conversion device
JPS61235978A (en) Character string correction system
JPS63206860A (en) Information processor
JPS59116835A (en) Japanese input device with input abbreviating function
JPH06119379A (en) System and method for machine translation provided with reading kana @(3754/24)japanese syllabary) attaching function
JPH0728489A (en) Recognition candidate selecting device
JPS6120176A (en) Roman character/chinese character converter
JPS60112175A (en) Abbreviation conversion system of kana (japanese syllabary)/kanji (chinese character) convertor
JPS6278673A (en) Kana/kanji (chinese character) converter
JPH01237877A (en) Kanji conversion system
JPH0816572A (en) Automatic recognition system of alphanumerics/kana character
JPH05158921A (en) Kana/kanji converter
JPH05101037A (en) Kana/kanji converting system
JPH08249325A (en) Device and method for japanese syllabary chinese character conversion
JPH03208163A (en) Character processor
JPS6258371A (en) Sentence producing device
JPH03202951A (en) Japanese sentence processor