JPH0855117A

JPH0855117A - Character processor

Info

Publication number: JPH0855117A
Application number: JP6188260A
Authority: JP
Inventors: Akira Hamada; 明濱田
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1994-08-10
Filing date: 1994-08-10
Publication date: 1996-02-27

Abstract

PURPOSE: To retrieve dictionary words that have the same notation but different readings in a unified way, without increasing the capacity or processing required for a reading-replacement dictionary or the like. CONSTITUTION: Whether character replacement is necessary for a character string input from an input part 3 is judged from a character replacement table 7. When replacement is necessary, the input character string is saved in a buffer 4, the replacement is carried out according to the table 7, and an independent word dictionary 6 is searched under the condition stored in a search-time restriction buffer 9. Dictionary words whose readings differ can thus be retrieved in a unified way without preparing a separate dictionary that maps alternative readings to a master reading.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character processing device having a kana-kanji conversion function, such as a Japanese word processor.

[0002]

[Prior Art] Conventionally, kana-kanji conversion searches the kana headings of a dictionary with the input character string and uses the words that match. As shown in the block diagram of the kana-kanji conversion device in FIG. 7, search means 102 searches the kana headings of independent word dictionary 101 for entries matching input character string 100, and the result 103 is used to generate notation candidates. However, as shown in the structure of the independent word dictionary in FIG. 10, some words vary in reading or notation, such as 「喧嘩（けんか）」 (kenka) and 「喧嘩（げんか）」 (genka), or 「ウィーク」 and 「ウイーク」, so the same word had to be stored under separate heading entries. This not only increases the size of the dictionary; the entries must also be treated consistently, for example in the parts of speech and semantic classifications assigned to them, which makes dictionary maintenance very laborious.

[0003] In contrast, in the kana-kanji conversion dictionary search mechanism of Japanese Patent Laid-Open No. 4-165557 (published June 11, 1992), for a group of words that share a notation but have different readings, one reading is designated the master reading and associated with the notation, and the other readings are converted to the master reading by a dictionary that maps them to it. FIG. 11 shows this configuration. As shown in FIG. 11, input character string 100 is converted by reading-correspondence dictionary 106 into the master reading, which is the heading in independent word dictionary 101, before the search is made, so that alternative readings also match.

[0004] However, the embodiment of the above prior art requires dictionary 106, which corresponds to the headings of dictionary 101, for conversion to the master reading, and searching that dictionary requires processing similar to the main dictionary search. In addition, because Japanese Patent Laid-Open No. 4-165557 handles only the variation of "alternative readings of the same notation", it does not consider variations in which reading and notation vary together, as in the katakana words 「ウイーク」 and 「ウィーク」.

[0005] Furthermore, if 「げんか」 (genka) is reread as 「けんか」 (kenka) for the sake of 「兄弟喧嘩」 (kyodai-genka, a quarrel between siblings) and the dictionary is searched, even 「県下」 (kenka, "within the prefecture") becomes a candidate, so useless candidates may increase. Conversely, if every reading 「げんか」 were replaced by 「けんか」, candidates such as 「原価」 (genka, "cost") would no longer appear.

[0006] Also, when several clauses are converted at once, leaving the rewritten reading in place may affect subsequent searches. For example, if 「かぶしきがいしゃ」 (kabushikigaisha) is input and 「がいしゃ」 is left rewritten as 「かいしゃ」 in order to find 「会社」, then the particle 「が」 needed for the conversion candidate 「しきが」→「指揮が」 and the suffix 「外」 needed for the candidate 「しきがい」→「指揮外」 can no longer be found.

[0007]

[Problems to Be Solved by the Invention] Among the notation variations in katakana words, pairs such as 「ベ」 and 「ヴェ」 or 「ボ」 and 「ヴォ」 are sometimes used to distinguish words that are originally different, such as "best" and "vest" or "bolt" and "volt". In a kana-kanji conversion dictionary, therefore, the heading 「べすと」 carries the part-of-speech and semantic information of both "best" and "vest", while the heading 「ゔぇすと」 carries that of "vest" only. In this case the two dictionary entries cannot be unified, and there is no advantage in rewriting between 「べ」 and 「ゔぇ」.

[0008] The present invention addresses cases such as 「うぇい」 and 「うえい」, where a word with a single meaning has multiple headwords because of reading variation and thereby inflates the dictionary. Without using a reading-replacement dictionary and replacement means to convert the input to a single representative master reading before searching, the invention replaces characters in the input character string according to fixed rules and then searches the independent word dictionary for conversion. The multiple headwords caused by reading variation are thereby unified, and the input is matched to the dictionary heading.

[0009]

[Means for Solving the Problems] The device comprises an independent word dictionary for conversion to which information about reading variation is attached, and a character replacement table describing replacement conditions and methods. Whether a character string input from character input means contains a reading variation is determined by referring to the character replacement table; the input character string is rewritten according to fixed rules and then matched against the headings of the independent word dictionary for conversion. This makes it possible to search effectively for dictionary words whose reading or the like varies.

[0010]

[Operation] For a hiragana character string input from a keyboard or the like, the character replacement table is used to determine whether replacement is needed. If it is, the replacement is performed according to the table. When a replacement is performed, the input character string is temporarily saved in a buffer. After matching against the independent word dictionary for search, the input character string is restored from the saved copy, so subsequent conversion is unaffected. In addition, because the character replacement table restricts the conditions on conversion candidates, selecting kanji candidates for a katakana word, for example, is prevented. Although replacement is done per kana character, changing the search start position as well, as shown in FIG. 8, makes it possible to handle replacements such as 「くぁ」→「か」.
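The save-and-restore behaviour can be illustrated with a minimal Python sketch. The helper name `temporary_replacement` and the use of a context manager are assumptions made for this example only; the patent describes the behaviour in terms of buffers, not any particular programming construct.

```python
from contextlib import contextmanager

@contextmanager
def temporary_replacement(buffer, index, new_char):
    """Replace buffer[index] while the block runs, then restore it."""
    saved = buffer[index]      # plays the role of the save buffer
    buffer[index] = new_char
    try:
        yield buffer
    finally:
        buffer[index] = saved  # later searches see the original input

# Hypothetical use with the kabushikigaisha example: search for "kaisha"
# without leaving the rewritten reading in the input buffer.
buf = list("かぶしきがいしゃ")
with temporary_replacement(buf, 4, "か"):
    key_during_search = "".join(buf[4:])
restored = "".join(buf)
```

Because the original character is put back after the lookup, later searches that start at 「が」 still see the input as typed.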

[0011]

[Embodiment] An embodiment of the present invention is described in detail below. The invention is of course not limited to this embodiment.

[0012] FIG. 1 is a block diagram showing the configuration of an embodiment of the present invention. In the figure, 1 is a control unit, which contains the grammar tables, programs, and so on for kana-kanji conversion. 2 is a display unit such as a CRT or LCD, used for example to check the conversion candidates generated from the dictionary search results. 3 is an input unit such as a keyboard, used to enter characters and to issue kana-kanji conversion instructions; in this embodiment a hiragana character string is input. 4 is a buffer that stores the character string to be converted, as input from input unit 3. 5 is a pointer indicating the character in buffer 4 from which the dictionary search starts. 6 is the independent word dictionary. 7 is the character replacement table, which stores the conditions for deciding whether replacement is needed, the replacement methods, and so on. 8 is a buffer that saves the previous contents of buffer 4 and pointer 5 when a replacement is performed, and 9 is a buffer that holds the type of restriction applied during the dictionary search.

FIGS. 3 and 4 both show example structures of the character replacement table. In FIG. 3, the character codes to be replaced are grouped by replacement condition and replacement method; each row stores a condition, a method, and a search condition. The first row indicates that replacement applies when the first character of the input character string is a voiced (dakuon) kana. FIG. 4 stores the "character at the pointer position" part of FIG. 3 in a different form: among the elements corresponding to character codes, those corresponding to characters to be replaced are set to "1". This example, too, takes voiced kana as the characters to be replaced.
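As a rough illustration of the two table layouts, the sketch below models the FIG. 3 rows as records and the FIG. 4 variant as a flag array indexed by character code. The `Rule` type, its field names, the method labels, and the use of Unicode code points are all assumptions made for this example; the patent does not specify an encoding or a concrete data layout.

```python
from dataclasses import dataclass

# Hypothetical model of one row of the FIG. 3 table (field names are assumptions).
@dataclass(frozen=True)
class Rule:
    first: str        # characters accepted at the pointer position
    second: str       # characters accepted at the next position ("" = not checked)
    method: str       # label for the replacement method
    restriction: str  # search-time restriction to record

DAKUON = "がぎぐげござじずぜぞだぢづでどばびぶべぼ"

FIG3_ROWS = [
    Rule(DAKUON, "", "code - 1", "rendaku"),
    Rule("う", "ぃぇぉ", "next code + 1", "katakana"),
    Rule("く", "ぉ", "next code + 1", "katakana"),
    Rule("く", "ぁ", "next -> か, pointer + 1", "katakana"),
]

# FIG. 4 style: one flag per character code, set to 1 for replacement targets.
BASE = 0x3041              # first hiragana code point (assumed Unicode encoding)
FLAGS = [0] * 0x56         # covers U+3041 to U+3096
for ch in DAKUON:
    FLAGS[ord(ch) - BASE] = 1

def is_target(ch: str) -> bool:
    """Return True if the flag for this character code is 1."""
    idx = ord(ch) - BASE
    return 0 <= idx < len(FLAGS) and FLAGS[idx] == 1
```

The flag array trades memory for a constant-time membership test, which is one plausible reason to store the first-row condition this way.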

[0013] The character string to be replaced undergoes the processing given in the processing-method column. Here the character code is decreased by one, which replaces the voiced kana with its unvoiced (seion) counterpart. Because rendaku is specified as the search condition, only entries marked in the rendaku field 63 among the search conditions of the dictionary in FIG. 5 are searched. Rendaku is the phenomenon in which, when two words are compounded, the initial unvoiced sound of the second word becomes voiced. For example, when 「水」 (mizu) and 「不足」 (fusoku) are compounded, the result is 「水不足」 (mizubusoku): the 「ふ」 that began the second word becomes the voiced 「ぶ」.
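The "character code minus one" rule can be tried out with a short Python sketch. It assumes Unicode code points, where each voiced hiragana sits immediately after its unvoiced counterpart; the patent does not name the character code it uses, and the function name `devoice_first` is hypothetical.

```python
DAKUON = "がぎぐげござじずぜぞだぢづでどばびぶべぼ"

def devoice_first(reading: str) -> str:
    """Replace a leading voiced kana with the kana whose code is one smaller."""
    if reading and reading[0] in DAKUON:
        return chr(ord(reading[0]) - 1) + reading[1:]
    return reading

unvoiced_genka = devoice_first("げんか")
unvoiced_busoku = devoice_first("ぶそく")
unchanged = devoice_first("けんか")
```

The explicit `DAKUON` list matters: subtracting one from an arbitrary kana would not give a meaningful result, so the rule is applied only to characters the table names as targets.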

[0014] The second row of the character replacement table in FIG. 3 is an example in which replacement applies when the first character of the input character string is 「う」 and the second character is 「ぃ」, 「ぇ」, or 「ぉ」. The processing method is to increase the character code by one, which replaces the small kana 「ぃ」, 「ぇ」, and 「ぉ」 with the full-size kana 「い」, 「え」, and 「お」.
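A minimal Python sketch of this second-row rule follows. It again assumes Unicode code points, where each small kana is followed directly by its full-size form; the function name `enlarge_second` is hypothetical.

```python
SMALL_KANA = "ぃぇぉ"

def enlarge_second(reading: str) -> str:
    """If the reading starts with う plus a small kana, raise that kana's code by one."""
    if len(reading) >= 2 and reading[0] == "う" and reading[1] in SMALL_KANA:
        return reading[0] + chr(ord(reading[1]) + 1) + reading[2:]
    return reading

weight_key = enlarge_second("うぇいと")
week_key = enlarge_second("うぃーく")
untouched = enlarge_second("うえ")
```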

[0015] The fourth row is an example in which replacement applies when the first character of the input character string is 「く」 and the second character is 「ぁ」. The processing method is to replace 「ぁ」 with 「か」 and then advance the search position, that is, the pointer, by one. FIG. 8 shows this processing. As a result, the string 「くぁるてっと」 is matched against the dictionary as 「かるてっと」.
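The fourth-row rule combines an in-place replacement with a pointer move, which the following Python sketch illustrates. The function name `replace_kua` and the list-based buffer are assumptions for this example.

```python
def replace_kua(buffer: list, pointer: int) -> int:
    """Rewrite く + ぁ at the pointer to く + か and return the advanced pointer."""
    if buffer[pointer:pointer + 2] == ["く", "ぁ"]:
        buffer[pointer + 1] = "か"
        return pointer + 1   # the search now starts at the new か
    return pointer

buf = list("くぁるてっと")
ptr = replace_kua(buf, 0)
search_key = "".join(buf[ptr:])
```

Moving the pointer instead of deleting 「く」 keeps the buffer length unchanged, so restoring the original contents afterwards only requires putting one character and the pointer back.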

[0016] FIG. 5 is a schematic diagram of the contents of the independent word dictionary. Each word consists of a reading heading 61, a notation 62, and classification information 63 to 65. Words that can undergo rendaku, such as 「喧嘩」 and 「不足」, are marked in 63, and words written in katakana, such as 「ウイーク」 and 「ウエイト」, are marked in 65. This classification information is used to restrict the search when a replacement has been made. For example, in the case of the first row of FIG. 3, that is, when the first character of the input string is a voiced kana, the string with that character replaced by its unvoiced counterpart (character code minus one) is searched only among the rendaku-capable words marked in 63 of FIG. 5. For voiced kana, matching is performed against both the replaced string and the string before replacement.

[0017] For example, when the input string is 「げんか」, matching is performed with both 「げんか」 and 「けんか」. For 「けんか」, however, only rendaku-capable words marked in 63 of FIG. 5 are matched.
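The restricted lookup for 「げんか」 can be sketched in Python as follows. The dictionary entries, the `rendaku` flag name, and the exact-match lookup are illustrative assumptions; they stand in for heading 61, notation 62, and field 63 of FIG. 5 and are not taken from an actual dictionary.

```python
DAKUON = "がぎぐげござじずぜぞだぢづでどばびぶべぼ"

# Illustrative entries only: reading, notation, and a rendaku mark.
DICTIONARY = [
    {"reading": "けんか", "notation": "喧嘩", "rendaku": True},
    {"reading": "けんか", "notation": "県下", "rendaku": False},
    {"reading": "げんか", "notation": "原価", "rendaku": False},
]

def lookup(reading, rendaku_only=False):
    """Return notations whose reading matches, optionally only rendaku-marked ones."""
    return [e["notation"] for e in DICTIONARY
            if e["reading"] == reading and (e["rendaku"] or not rendaku_only)]

def candidates(reading):
    """Search with the reading as typed, then with a devoiced first character."""
    found = lookup(reading)
    if reading and reading[0] in DAKUON:
        devoiced = chr(ord(reading[0]) - 1) + reading[1:]
        found += lookup(devoiced, rendaku_only=True)
    return found

genka_candidates = candidates("げんか")
```

With these sample entries, 「原価」 is found from the input as typed and 「喧嘩」 from the devoiced reading, while 「県下」 is filtered out because it carries no rendaku mark.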

[0018] Katakana words have no notation 62; the notation of the conversion result is generated by converting the matched input string to katakana. Thus 「くぁるてっと」 from the earlier example is rewritten to 「かるてっと」, the search is restricted to katakana words marked in 65 of FIG. 5, and the string is matched against the headword 「かるてっと」. On a match, 「クァルテット」, which is the input string 「くぁるてっと」 converted to katakana, is output.
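Generating the katakana notation from the input as typed can be sketched in Python. The sketch assumes Unicode, where each katakana character lies 0x60 code points above the corresponding hiragana; the function name `to_katakana` is hypothetical.

```python
def to_katakana(text: str) -> str:
    """Shift hiragana (U+3041 to U+3096) to katakana; leave other characters alone."""
    return "".join(chr(ord(c) + 0x60) if "ぁ" <= c <= "ゖ" else c for c in text)

typed = "くぁるてっと"          # what the user entered
matched_headword = "かるてっと"  # what matched in the dictionary after replacement
output_notation = to_katakana(typed)
```

Because the notation comes from the typed string rather than from the headword, the user's own spelling variant is preserved in the output.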

[0019] The replacement of characters using the table of FIG. 3 is described below following the flow of FIG. 6.
(Step S0) Examine the first character of the string to be looked up in the dictionary, that is, the character in buffer 4 pointed to by pointer 5. If it is 「う」, go to step S1; if it is 「く」, go to step S2; if it is some other character that needs replacement, go to the corresponding processing, which is omitted from the figure; if no replacement is needed, go to step S5.

[0020] (Step S1) Examine the character following the 「う」 pointed to by pointer 5. If it is 「ぃ」, 「ぇ」, or 「ぉ」 (second row of FIG. 3), go to step S3. Otherwise go to step S5.

[0021] (Step S2) Examine the character following the 「く」 indicated by pointer 5. If it is 「ぉ」 (third row of FIG. 3), go to step S3; if it is 「ぁ」 (fourth row of FIG. 3), go to step S4. Otherwise go to step S5.

[0022] (Step S3) Replace the 「ぃ」, 「ぇ」, or 「ぉ」 following pointer 5 with 「い」, 「え」, or 「お」, whose character code is one larger. Save the state before replacement in buffer 8. Record in buffer 9 that candidates are restricted to katakana words. Go to step S5.

[0023] (Step S4) Replace the 「ぁ」 following pointer 5 with 「か」. Increase pointer 5 by one so that it points to this 「か」. Save the state before replacement in buffer 8. Record in buffer 9 that candidates are restricted to katakana words. Go to step S5.

[0024] (Step S5) Search dictionary 6 with the string in buffer 4, starting from the character pointed to by pointer 5. Go to step S6.

[0025] When the input is 「うぇいと」, it has been rewritten to 「うえいと」 via steps S0 → S1 → S3, so it matches 「ウエイト」, which is stored as a candidate. However, by referring to the contents of buffer 9, words such as 「飢え」 and 「上井」 are not made candidates even though their readings match.

[0026] (Step S6) Referring to buffer 8, if buffer 4 or pointer 5 has been changed, restore the state before replacement. Go to step S7.

[0027] At this point the string that was rewritten to 「うえいと」 is returned to 「うぇいと」. Because the notation for the katakana word 「ウエイト」 is generated from the string in buffer 4, the candidate finally shown on display unit 2 is written 「ウェイト」.

[0028] (Step S7) Examine the character pointed to by pointer 5. If it is a voiced kana (first row of FIG. 3), go to step S8. Otherwise end the search processing.

[0029] (Step S8) Replace the voiced kana pointed to by pointer 5 with the unvoiced kana whose character code is one smaller. Save the state before replacement in buffer 8. Record in buffer 9 that candidates are restricted to rendaku-capable words. Go to step S9.

[0030] (Step S9) Search dictionary 6 with the string in buffer 4, starting from the character pointed to by pointer 5. Go to step S10.

[0031] (Step S10) Referring to buffer 8, if buffer 4 has been changed, restore the state before replacement, and end the search processing.
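Steps S0 to S10 can be put together in a compact Python sketch. Everything concrete here is an assumption made for illustration: the sample dictionary entries, the `kind` labels standing in for classification fields 63 to 65, prefix matching as the search method, Unicode code points, and the function names. The other replacement rules that the figure omits are omitted here as well.

```python
DAKUON = "がぎぐげござじずぜぞだぢづでどばびぶべぼ"

# Illustrative entries: (reading, notation, kind). Katakana words store no notation.
DICTIONARY = [
    ("うえいと", None, "katakana"),
    ("うえ", "飢え", None),
    ("けんか", "喧嘩", "rendaku"),
    ("けんか", "県下", None),
    ("げんか", "原価", None),
    ("かるてっと", None, "katakana"),
]

def to_katakana(s):
    return "".join(chr(ord(c) + 0x60) if "ぁ" <= c <= "ゖ" else c for c in s)

def search(buf, ptr, restriction, original, orig_ptr):
    """Steps S5/S9: find entries whose reading is a prefix of buf[ptr:]."""
    key = "".join(buf[ptr:])
    out = []
    for reading, notation, kind in DICTIONARY:
        if not key.startswith(reading):
            continue
        if restriction and kind != restriction:
            continue  # restriction recorded in buffer 9
        if kind == "katakana":
            # Build the notation from the input as typed, not from the headword.
            notation = to_katakana(original[orig_ptr:ptr + len(reading)])
        out.append(notation)
    return out

def convert(text, start=0):
    if not text:
        return []
    buf, ptr = list(text), start         # buffer 4 and pointer 5
    saved, restriction = None, None      # buffer 8 and buffer 9
    head, nxt = buf[ptr], "".join(buf[ptr + 1:ptr + 2])
    # S0 to S2: decide which two-character rule, if any, applies.
    if (head == "う" and nxt and nxt in "ぃぇぉ") or (head == "く" and nxt == "ぉ"):
        saved, restriction = (buf[:], ptr), "katakana"     # S3
        buf[ptr + 1] = chr(ord(nxt) + 1)
    elif head == "く" and nxt == "ぁ":
        saved, restriction = (buf[:], ptr), "katakana"     # S4
        buf[ptr + 1] = "か"
        ptr += 1
    results = search(buf, ptr, restriction, text, start)   # S5
    if saved:                                              # S6
        buf, ptr = saved
    if buf[ptr] in DAKUON:                                 # S7
        saved = (buf[:], ptr)                              # S8
        buf[ptr] = chr(ord(buf[ptr]) - 1)
        results += search(buf, ptr, "rendaku", text, start)  # S9
        buf, ptr = saved                                   # S10
    return results
```

With the sample entries, `convert("うぇいと")` yields only the katakana candidate, `convert("げんか")` yields 「原価」 followed by 「喧嘩」, and `convert("くぁるてっと")` yields the katakana form of the input as typed.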

[0032] The sentence entered from input unit 3 is stored in conversion-target character string buffer 4, pointer 5 is set to point to the first character of that buffer, and the search processing is performed according to the steps of FIG. 3 described above. The words found by the search processing are displayed on display unit 2.

[0033] In the search processing above, instead of referring to buffer 8 to restore the state before replacement, the unvoiced kana pointed to by pointer 5 may be replaced by the voiced kana whose character code is one larger. Also, instead of replacing characters in the input character string and restoring them after the dictionary search, a search string may be extracted from the input character string and the replacement applied to that copy, as shown in FIG. 7.
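Both alternatives can be illustrated with a short Python sketch. The function names are hypothetical, and Unicode code points are assumed so that adding one to an unvoiced kana gives its voiced form.

```python
DAKUON = "がぎぐげござじずぜぞだぢづでどばびぶべぼ"

def revoice(buffer: list, pointer: int) -> None:
    """Alternative 1: undo a devoicing by raising the character code by one."""
    buffer[pointer] = chr(ord(buffer[pointer]) + 1)

def extract_search_key(text: str, pointer: int) -> str:
    """Alternative 2: devoice a copy of the search string; the input is untouched."""
    key = list(text[pointer:])
    if key and key[0] in DAKUON:
        key[0] = chr(ord(key[0]) - 1)
    return "".join(key)

buf = list("けんか")          # buffer after devoicing げんか
revoice(buf, 0)
revoiced = "".join(buf)

original = "かぶしきがいしゃ"
key = extract_search_key(original, 4)
```

Working on a copy removes the need for a save buffer altogether, at the cost of building a separate search string for each lookup.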

[0034]

[Effects of the Invention] By providing means that replaces characters by rule, independently of the vocabulary registered in the independent word dictionary, the present invention makes it possible to unify dictionary words whose readings vary. Rule-based replacement avoids the extra capacity and processing effort of the reading-replacement-dictionary approach, and removes the need to prepare a replacement dictionary keyed to the entries of the independent word dictionary. Moreover, because the approach is not limited to words that share a notation but have alternative readings, it can also handle variations in both reading and notation, such as those of katakana words.

[0035] Furthermore, when the dictionary is searched with a rewritten reading, the candidates are restricted to words with a particular feature recorded in the dictionary, which prevents useless candidates from multiplying. For example, when 「うぇい」 is rewritten as 「うえい」, restricting the search to katakana words means that 「上井」 and the like do not become candidates. In addition, the input character string is returned to its original characters after the search with replacement, so later searches for dictionary words and attached words are unaffected. Finally, although replacement is done per kana character, the search start position can also be changed, so replacements such as 「くぁ」→「か」 can be handled.

[Brief description of drawings]

FIG. 1 is a diagram showing a kana-kanji conversion device according to an embodiment of the present invention.

FIG. 2 is a block diagram of an embodiment of the present invention.

FIG. 3 is an explanatory diagram showing the contents of the character replacement table in an embodiment of the present invention.

FIG. 4 is an explanatory diagram showing the contents of the character replacement table in an embodiment of the present invention.

FIG. 5 is an explanatory diagram showing the contents of the independent word dictionary, with alternative readings omitted, in an embodiment of the present invention.

FIG. 6 is a flowchart showing the flow of character replacement processing in an embodiment of the present invention.

FIG. 7 is a block diagram showing another way of recovering the character string as it was before replacement in an embodiment of the present invention.

FIG. 8 is an explanatory diagram showing an example of changing the search start position in an embodiment of the present invention.

FIG. 9 is a block diagram of the prior art.

FIG. 10 is a diagram showing the contents of an independent word dictionary in the prior art, in which alternative readings of the same word are separate entries.

FIG. 11 is a block diagram of the prior art that uses a dictionary for replacing alternative readings.

FIG. 12 is an explanatory diagram of the dictionary for replacing alternative readings in the prior art.

[Explanation of symbols]

1. Control unit
2. Display unit
3. Input unit
4. Conversion-target character string buffer
5. Pointer
6. Independent word dictionary
7. Character replacement table
8. Save table used during replacement
9. Search-time restriction buffer

Claims

[Claims]

1. A character processing device comprising an independent word dictionary for conversion to which information about reading variation is attached, and a character replacement table describing replacement conditions and methods, wherein whether a character string input by character input means contains a reading variation is determined by referring to the character replacement table, the input character string is rewritten according to fixed rules, and the result is then matched against the dictionary headings of the independent word dictionary for conversion.

2. The character processing device according to claim 1, wherein, when the first character of the input character string is a voiced kana, the character string in which that character is replaced with its unvoiced counterpart is matched against the rendaku-capable words in the conversion dictionary.

3. A character processing device wherein, when the notation of the pronunciation of a katakana word varies, the input character string is rewritten according to a fixed rule and matched against the katakana words in the conversion dictionary.