JPH10124498A

JPH10124498A - Japanese syllabary/chinese character converting device

Info

Publication number: JPH10124498A
Application number: JP8281970A
Authority: JP
Inventors: Masako Yoshimura; 雅子吉村
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1996-10-24
Filing date: 1996-10-24
Publication date: 1998-05-15

Abstract

PROBLEM TO BE SOLVED: To provide a device whose conversion accuracy is improved by increasing the amount of co-occurrence data used during conversion, by referring to co-occurrence data through plural methods for a word for which no co-occurrence data could be obtained at the initial reference. SOLUTION: A kana-kanji conversion control unit 8 designates, to a co-occurrence data reference unit 4, a word for which co-occurrence data should be referred to. The co-occurrence data reference unit 4 looks up data related to the designated word in a co-occurrence data dictionary 7; when the target data cannot be obtained, it changes the data collation method and refers to the co-occurrence data again. Thus, co-occurrence data that could not be obtained at the initial reference can be referred to. For example, the kana input "さんいんふくしまほせん" (sanin fukushima hosen) is converted into "山陰福島保線" by a conventional kana-kanji converter that uses the first candidates, whereas this device converts it into "参院福島補選" by using the co-occurrence data for "sanin" and "hosen".

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a kana-kanji conversion device that converts input kana data into kanji data.

[0002]

[Prior art] Conventionally, when a kana-kanji conversion device converts input kana data into kanji data, it selects and displays one of a plurality of conversion candidates; if that candidate is incorrect, the user corrects the conversion result by selecting the correct kanji data from among the conversion candidates.

[0003] Improving the rate at which the first candidate is correct (conversion accuracy) is one of the important problems for a kana-kanji conversion device. To this end, conversion that uses co-occurrence data, which represents the co-occurrence relations among a plurality of words, is also performed.

[0004]

[Problem to be solved by the invention] In conversion using co-occurrence data, the conversion accuracy depends on how many applicable co-occurrence data entries exist; the larger the amount of co-occurrence data prepared, the higher the conversion accuracy.

[0005] However, the capacity of storage means such as memory is finite, and it is impossible to register as co-occurrence data all of the expressions formed by consecutive nouns, which exist in countless numbers in Japanese, together with the co-occurrence relations of the various other linguistic phenomena.

[0006] Accordingly, an object of the present invention is to provide a kana-kanji conversion device that, by devising the way co-occurrence data is referred to, increases the amount of co-occurrence data used at conversion time and thereby improves conversion accuracy.

[0007]

[Means for solving the problem] The kana-kanji conversion device of the present invention comprises: a conversion dictionary storing the readings, kanji notations, part-of-speech information, and semantic classification information of Japanese words; a dictionary search unit that searches the conversion dictionary for a designated word; a co-occurrence data dictionary storing co-occurrence data that indicates co-occurrence relations between words; a co-occurrence data reference unit that collates co-occurrence data for a designated word and, when no co-occurrence data is obtained, changes the data collation method and performs collation again under the control of a kana-kanji conversion control unit; and the kana-kanji conversion control unit, which controls the dictionary search unit, the co-occurrence data reference unit, and the flow of data.

[0008] As a result, for a word for which no co-occurrence data was obtained at the initial reference, co-occurrence data is referred to by a plurality of methods, so that more co-occurrence data can be referred to and conversion accuracy can be improved.

[0009]

[Embodiments of the invention] The invention according to claim 1 comprises: input means for inputting a kana character string and for instructing the selection of a conversion candidate; a conversion data storage unit that stores the input kana character string, conversion candidates, and the converted kana-kanji character string; a conversion dictionary storing the readings, kanji notations, part-of-speech information, and semantic classification information of Japanese words; a dictionary search unit that searches the conversion dictionary for a designated word; a co-occurrence data dictionary storing co-occurrence data that indicates co-occurrence relations between words; a co-occurrence data reference unit that collates co-occurrence data for a designated word; display means for displaying the input character string, conversion candidates, and conversion results; and a kana-kanji conversion control unit that controls the dictionary search unit, the co-occurrence data reference unit, and the flow of data stored in the conversion data storage unit. When the co-occurrence data reference unit cannot obtain co-occurrence data for the designated word, it changes the data collation method and refers to the co-occurrence data again.

[0010] Accordingly, the kana-kanji conversion control unit designates, to the co-occurrence data reference unit, the word for which co-occurrence data is to be referred to. The co-occurrence data reference unit refers to data on the designated word in the co-occurrence data dictionary and, if the target data is not obtained, changes the data collation method and refers to the co-occurrence data again. This makes it possible to refer to co-occurrence data that could not be obtained at the initial reference.

[0011] In the invention according to claim 2, when the co-occurrence data reference unit cannot obtain co-occurrence data for a noun sequence consisting of a plurality of nouns, it refers to the co-occurrence data again for several combinations of the nouns that make up the noun sequence.

[0012] Accordingly, when the co-occurrence data reference unit cannot obtain co-occurrence data for a noun sequence consisting of a plurality of nouns, referring to the co-occurrence data for several combinations of the nouns that make up the noun sequence makes it possible to refer to co-occurrence data on noun sequences that could not be referred to at the initial reference.

[0013] In the invention according to claim 3, when the co-occurrence data reference unit cannot obtain co-occurrence data for the designated word, it refers to the semantic classification information and then refers again to the co-occurrence data of words that have the same semantic classification. Accordingly, when the co-occurrence data reference unit cannot obtain co-occurrence data, referring to the semantic classification information and then to the co-occurrence data again makes it possible to refer to co-occurrence data on related words that share the same semantic classification.

[0014] In the invention according to claim 4, when the co-occurrence data reference unit cannot obtain co-occurrence data for an input of the "A を B する" type (a noun A taken as the object of the verb formed from a noun B), it refers to the co-occurrence data again for the word "AB" formed by concatenating "A" and "B". Accordingly, when co-occurrence data for an input of the "A を B する" type cannot be obtained, referring to the co-occurrence data again for the word "AB" makes it possible to refer to co-occurrence data on words expressing actions that could not be referred to at the initial reference.

[0015] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a functional block diagram of a kana-kanji conversion device according to one embodiment of the present invention.

[0016] In FIG. 1, 1 is input means for inputting a kana character string and for instructing the selection of a conversion candidate; 2 is a conversion data storage unit that stores the input kana character string, conversion candidates, and the converted kana-kanji character string; 3 is a dictionary search unit that searches the conversion dictionary 6 for a designated word; 4 is a co-occurrence data reference unit that collates co-occurrence data for a designated word; 5 is display means for displaying the input character string, conversion candidates, conversion results, and the like; 6 is a conversion dictionary storing the readings, kanji notations, part-of-speech information, semantic classification information, and the like of Japanese words; 7 is a co-occurrence data dictionary storing co-occurrence data that indicates co-occurrence relations between words; and 8 is a kana-kanji conversion control unit that controls the operation of the functions of the dictionary search unit 3 and the co-occurrence data reference unit 4 and the flow of data stored in the conversion data storage unit 2.

[0017] In the circuit block diagram shown in FIG. 2, 9 is a keyboard, 10 is a CPU (central processing unit), 11 is a CRT (cathode-ray tube display), 12 is a ROM (read-only memory), and 13 is a RAM (random-access memory).

[0018] The input means 1 shown in FIG. 1 is realized by the keyboard 9; the conversion data storage unit 2 by the RAM 13; the conversion dictionary 6 and the co-occurrence data dictionary 7 by the ROM 12; the dictionary search unit 3, the co-occurrence data reference unit 4, and the kana-kanji conversion control unit 8 by the CPU 10 executing an operation program stored in the ROM 12 while exchanging data with the ROM 12 and the RAM 13; and the display means 5 by the CRT 11.

[0019] The operation of the kana-kanji conversion processing in the kana-kanji conversion device of this embodiment, configured as described above, is explained below.

[0020] A kana character string input through the input means 1 is stored in the conversion data storage unit 2 by the kana-kanji conversion control unit 8 in accordance with the user's conversion instruction.

[0021] Using the dictionary search unit 3, the kana-kanji conversion control unit 8 searches the conversion dictionary 6 for substrings of the kana character string stored in the conversion data storage unit 2, and stores the search results in the conversion data storage unit 2.

[0022] Next, the kana-kanji conversion control unit 8 determines how the input kana character string is segmented into words, based on the search results from the dictionary search unit 3. For each segmented word, when the search results contain a plurality of homophones as conversion candidates, the kana-kanji conversion control unit 8 controls the co-occurrence data reference unit 4 to refer to the co-occurrence data dictionary 7 and determines the first conversion candidate based on the applicable data.

[0023] The kana-kanji conversion control unit 8 displays the determined kana-kanji character string on the display means 5 and, in accordance with the user's instructions, displays, changes, and confirms conversion candidates.

[0024] The operation of referring to co-occurrence data, performed by the co-occurrence data reference unit 4 under the control of the kana-kanji conversion control unit 8, is described below with reference to the flowchart of FIG. 3.

[0025] First, in step S1, the kana-kanji conversion control unit 8 passes to the co-occurrence data reference unit 4 the kana character string of the word, in the conversion target sentence stored in the conversion data storage unit 2, for which data is to be referred to. In step S2, the co-occurrence data reference unit 4 selects, from the co-occurrence data stored in the co-occurrence data dictionary 7, the data that begins with the designated reading.

[0026] In step S3, the co-occurrence data reference unit 4 checks whether the word designated in step S1 is a fixed word, that is, a word whose conversion has already been confirmed; if so, processing proceeds to step S4. In step S4, entries are deleted from the co-occurrence data selected in step S2 if the kanji notation of their first word, designated by the reading, differs from the fixed word already confirmed as that word's conversion, and processing proceeds to step S5. If the designated word is not a fixed word in step S3, processing proceeds directly to step S5. In step S5, the kana-kanji conversion control unit 8 stores the co-occurrence data whose first word matches in the conversion data storage unit 2 as candidate co-occurrence data.

[0027] In step S6, for each co-occurrence data entry stored in the conversion data storage unit 2, the character string following the designated word in the reading of the co-occurrence data is compared with the character string following the designated word in the conversion target character string stored in the conversion data storage unit 2. In step S7, it is checked whether the comparison in step S6 found any data for which the two kana character strings match. If there is no matching data, in step S9 the data collation method is changed and the data is referred to again. If there is matching data, in step S8 that co-occurrence data is stored in the conversion data storage unit 2 as the applicable data, and processing ends.
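The flow of FIG. 3 (steps S1 to S9) can be sketched roughly as follows. This is only an illustrative sketch: the entry layout (a list of reading/kanji pairs), the function names, the prefix comparison used for step S6, and the second sample entry are assumptions, not details given in the specification.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical data model: one co-occurrence entry is a list of
# (reading, kanji) pairs, first word first. The patent fixes no layout.
Word = Tuple[str, str]
Entry = List[Word]


def select_candidates(dictionary: List[Entry], reading: str,
                      fixed_kanji: Optional[str] = None) -> List[Entry]:
    """Steps S2 to S5: collect entries that start with the designated word."""
    selected = [e for e in dictionary if e[0][0] == reading]        # S2
    if fixed_kanji is not None:                                      # S3
        # S4: drop entries whose first word contradicts the fixed kanji.
        selected = [e for e in selected if e[0][1] == fixed_kanji]
    return selected                                                  # S5


def match_following(candidates: List[Entry], following: str) -> List[Entry]:
    """Steps S6 to S8: compare the readings after the designated word.

    Assumption: the input text after the designated word only has to
    start with the entry's remaining reading (prefix match).
    """
    hits = []
    for entry in candidates:
        rest = "".join(r for r, _ in entry[1:])
        if rest and following.startswith(rest):
            hits.append(entry)
    return hits


def refer(dictionary: List[Entry], reading: str, following: str,
          fixed_kanji: Optional[str] = None,
          fallback: Optional[Callable[[List[Entry], str], list]] = None) -> list:
    """Initial reference, with an optional step S9 fallback on a miss."""
    candidates = select_candidates(dictionary, reading, fixed_kanji)
    hits = match_following(candidates, following)
    if hits or fallback is None:
        return hits
    return fallback(candidates, following)                           # S9


# Sample data; the second entry is made up for the fixed-word filter.
SAMPLE = [
    [("きょうか", "強化"), ("れんしゅう", "練習")],
    [("きょうか", "教科"), ("たんにん", "担任")],
]
```

Passing a different `fallback` callable corresponds to changing the collation method in step S9; the three embodiments that follow are candidates for that role.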

[0028] Three embodiments of the data reference performed in step S9 are described below with reference to the flowcharts of FIGS. 4, 5, and 6, respectively.

[0029] (Embodiment 1) First, in step 9A-1, it is determined whether the designated word can be a noun. If it cannot, this data reference method does not apply, and processing ends. If it can, in step 9A-2 the number of consecutive nouns that can form a noun sequence starting with the designated word (hereinafter, N) is obtained.

[0030] For example, if the input kana character string is "さんいん／ふくしま／ほせん" (sanin / fukushima / hosen) and "さんいん" is the designated word, the designated word can be a noun in step 9A-1, so processing proceeds to step 9A-2. Since "さんいん", "ふくしま", and "ほせん" can each be a noun, step 9A-2 yields N = 3.

[0031] In step 9A-3, if N < 3, no noun sequence whose constituents differ from the current one can be generated, so processing ends. If N ≥ 3, processing proceeds to step 9A-4.

[0032] Steps 9A-4 through 9A-12 generate the noun sequences to be referred to. In step 9A-4, the variable s, which specifies the number of nouns taken from the beginning, is set to s = 1. In step 9A-5, if s ≤ N - 2, processing proceeds to step 9A-6, where the variable t, which indicates the position from the beginning of the noun to be appended to the first s nouns, is initialized to t = s + 1.

[0033] In step 9A-7, if t > N, there is no corresponding noun, so processing proceeds to step 9A-13. In step 9A-13, the number of leading nouns s is incremented by one, and processing returns to step 9A-5. If t ≤ N in step 9A-7, processing proceeds to step 9A-8, where a noun sequence is generated by joining the first s nouns with the t-th noun from the beginning.

[0034] For example, in the "さんいん／ふくしま／ほせん" example above, s = 1 and t = s + 1 = 2 generate "さんいんふくしま".

[0035] In step 9A-9, the word generated in step 9A-8 is collated against the candidate co-occurrence data stored in the conversion data storage unit 2 in step S5. In step 9A-10, it is checked whether the collation in step 9A-9 found applicable data; if not, processing proceeds to step 9A-12. If there is matching data, in step 9A-11 that co-occurrence data is stored in the conversion data storage unit 2 as the applicable data.

[0036] In step 9A-12, t, which indicates the position from the beginning of the noun to be appended, is incremented by one, and processing returns to step 9A-7.

[0037] For example, in the "さんいん／ふくしま／ほせん" example above, t (= 3) ≤ N (= 3) holds in step 9A-7, so processing proceeds to step 9A-8, and s = 1, t = 3 generate "さんいんほせん".

[0038] If s > N - 2 in step 9A-5, all noun sequences to be referred to have been generated, and processing ends.

[0039] For example, in the "さんいん／ふくしま／ほせん" example, N = 3; s = 1, t = 2 generate "さんいんふくしま", and s = 1, t = 3 generate "さんいんほせん". With s = 2, the result would be "さんいんふくしまほせん", the same word as the original one. Therefore, since s (= 2) > N - 2 (= 1) in step 9A-5, processing ends.
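The generation loop of steps 9A-3 through 9A-13 can be sketched as follows. The function name and the list-of-readings input are illustrative assumptions; collation against the candidate co-occurrence data (steps 9A-9 to 9A-11) is left out so that only the enumeration is shown.

```python
from typing import List


def noun_sequences(nouns: List[str]) -> List[str]:
    """Enumerate the noun sequences tried by Embodiment 1.

    `nouns` holds the readings of the N consecutive nouns, designated
    word first. Each result joins the first s nouns with the t-th noun.
    """
    n = len(nouns)
    generated: List[str] = []
    if n < 3:                      # step 9A-3: nothing new can be formed
        return generated
    s = 1                          # step 9A-4
    while s <= n - 2:              # step 9A-5
        t = s + 1                  # step 9A-6
        while t <= n:              # step 9A-7
            # Step 9A-8: first s nouns followed by the t-th noun (1-based).
            generated.append("".join(nouns[:s]) + nouns[t - 1])
            t += 1                 # step 9A-12
        s += 1                     # step 9A-13
    return generated


sequences = noun_sequences(["さんいん", "ふくしま", "ほせん"])
# sequences == ["さんいんふくしま", "さんいんほせん"]
```

With N = 3 the outer loop runs only for s = 1, which reproduces the two strings named in the example above.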

[0040] When this co-occurrence data collation procedure is used and the collation of "さんいんほせん" succeeds in step 9A-9, the co-occurrence data "さんいんほせん (参院補選)" (Upper House by-election) is obtained. As a result, whereas a conventional kana-kanji conversion device that uses the first candidates converts "さんいんふくしまほせん" into "山陰福島保線" or the like, the kana-kanji conversion device of this embodiment uses the co-occurrence data for "さんいん" and "ほせん" and can convert it into "参院福島補選".
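As a rough illustration of how such a hit changes the result, the sketch below overrides the first candidates with the kanji from a matched co-occurrence entry. The dictionaries and the function name are hypothetical; the specification does not describe candidate selection at this level of detail.

```python
from typing import Dict, List, Optional


def convert(segments: List[str], first_candidates: Dict[str, str],
            cooccurrence_hit: Optional[Dict[str, str]] = None) -> str:
    """Pick a kanji for each segment, preferring a co-occurrence hit."""
    chosen = dict(first_candidates)
    if cooccurrence_hit:
        # Words covered by the matched entry take its kanji notation.
        chosen.update(cooccurrence_hit)
    return "".join(chosen[seg] for seg in segments)


SEGMENTS = ["さんいん", "ふくしま", "ほせん"]
# First candidates as in the conventional result cited in the text.
FIRST = {"さんいん": "山陰", "ふくしま": "福島", "ほせん": "保線"}
# Kanji supplied by the matched entry "さんいんほせん (参院補選)".
HIT = {"さんいん": "参院", "ほせん": "補選"}

conventional = convert(SEGMENTS, FIRST)        # "山陰福島保線"
with_hit = convert(SEGMENTS, FIRST, HIT)       # "参院福島補選"
```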

[0041] (Embodiment 2) Next, a second method of referring to co-occurrence data is described with reference to FIG. 5.

[0042] First, in step 9B-1, the candidate co-occurrence data stored in the conversion data storage unit 2 in step S5 is retrieved. In step 9B-2, one entry is taken from the candidate co-occurrence data retrieved in step 9B-1. In step 9B-3, the semantic classification information of the word following the designated word in the entry obtained in step 9B-2 is obtained. In step 9B-4, the semantic classification information of the conversion candidates, stored in the conversion data storage unit 2, for the word following the designated word is obtained.

[0043] In step 9B-5, the semantic classification information of the co-occurrence data obtained in step 9B-3 is compared with that of the conversion candidates obtained in step 9B-4; if they are equal, processing proceeds to step 9B-6. In step 9B-6, the co-occurrence data compared in step 9B-5 is modified by replacing the word whose semantic classification information was referred to with the conversion candidate whose semantic classification information matches, and the resulting data is stored in the conversion data storage unit 2 as the applicable co-occurrence data.

[0044] If the semantic classification information is not equal in step 9B-5, processing returns to step 9B-2 to obtain the next candidate co-occurrence data entry. If no further entry is obtained in step 9B-2, collation against all data has finished, and processing ends.

[0045] For example, if the input kana character string is "きょうか／きかん" (kyouka / kikan) and "きょうか" is the designated word, step 9B-1 retrieves the candidate co-occurrence data stored in the conversion data storage unit 2: "きょうか／れんしゅう（強化／練習）：5/13", "きょうか／かくだい（強化／拡大）5/22", "きょうか／げっかん（強化／月間）5/18", and so on.

[0046] In step 9B-2, the first candidate entry, "きょうか／れんしゅう（強化／練習）：5/13", is obtained. If the semantic classification information is appended after the kanji notation of the co-occurrence data, step 9B-3 obtains the semantic classification information 13 of "れんしゅう", the word following the designated word "きょうか".

[0047] In step 9B-4, the semantic classification information of the conversion candidates for "きかん", the word following the designated word, stored in the conversion data storage unit 2 is obtained: 機関 (10), 期間 (18), 季刊 (34), and so on. In step 9B-5, the semantic classification information of the co-occurrence data (13) is compared with that of the conversion candidates (10, 18, 34).

[0048] When the same processing is applied to all candidate co-occurrence data, then for the candidate entry "きょうか／げっかん（強化／月間）5/18", the semantic classification information of the co-occurrence data (18) matches that of the conversion candidate "期間" (18) in step 9B-5.

[0049] When co-occurrence data for "きょうか" and "げっかん" exists, "月間" and "期間" have the same semantic classification information, so the applicable co-occurrence data "きょうか／きかん（強化／期間）5/18" is obtained. As a result, whereas conventional kana-kanji conversion using the first candidates converts the input kana character string "きょうかきかん" into "強化機関" or the like, the kana-kanji conversion device of this embodiment uses the co-occurrence data for "きょうか" and "きかん" and can convert it into "強化期間".
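The semantic-classification fallback of steps 9B-1 through 9B-6 can be sketched as follows, using the sample data from the example above. The record layouts, names, and the choice to return the first match are assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class CoEntry(NamedTuple):
    """Hypothetical layout of one two-word co-occurrence entry."""
    head_reading: str
    head_kanji: str
    head_class: int
    next_reading: str
    next_kanji: str
    next_class: int


class Candidate(NamedTuple):
    """A conversion candidate with its semantic classification code."""
    kanji: str
    sem_class: int


def semantic_fallback(entries: List[CoEntry], next_reading: str,
                      candidates: List[Candidate]) -> Optional[CoEntry]:
    """Return a derived entry whose second word shares a semantic class."""
    for entry in entries:                               # step 9B-2
        for cand in candidates:                         # steps 9B-4, 9B-5
            if cand.sem_class == entry.next_class:
                # Step 9B-6: swap in the candidate that shares the class.
                return entry._replace(next_reading=next_reading,
                                      next_kanji=cand.kanji)
    return None                                         # no entry matched


ENTRIES = [
    CoEntry("きょうか", "強化", 5, "れんしゅう", "練習", 13),
    CoEntry("きょうか", "強化", 5, "かくだい", "拡大", 22),
    CoEntry("きょうか", "強化", 5, "げっかん", "月間", 18),
]
KIKAN = [Candidate("機関", 10), Candidate("期間", 18), Candidate("季刊", 34)]

result = semantic_fallback(ENTRIES, "きかん", KIKAN)
# result pairs 強化 with 期間, both entries carrying class 18.
```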

[0050] (Embodiment 3) Next, a third method of referring to co-occurrence data is described with reference to FIG. 6.

[0051] First, in step 9C-1, it is determined whether the character string starting from the designated word in the conversion target sentence stored in the conversion data storage unit 2 has the form "A (noun) を B (noun) する"; if not, this data reference method does not apply, and processing ends. If the method is determined to apply, processing proceeds to step 9C-2. In step 9C-2, the character string "AB" is generated from the input character string of the "A を B する" type stored in the conversion data storage unit 2.

[0052] In step 9C-3, the word generated in step 9C-2 is collated against the candidate co-occurrence data stored in the conversion data storage unit 2 in step S5. In step 9C-4, it is checked whether the collation in step 9C-3 found applicable data; if so, in step 9C-5 that co-occurrence data is stored in the conversion data storage unit 2 as the applicable data, and processing ends.

[0053] For example, the input kana character string "せいかくをちょうさする" (seikaku o chousa suru) is of the "A (せいかく) を B (ちょうさ) する" type, so processing proceeds from step 9C-1 to step 9C-2. In step 9C-2, "せいかくちょうさ" is generated from "せいかくをちょうさする".

[0054] In step 9C-3, the word "せいかくちょうさ" generated in step 9C-2 is collated against the candidate co-occurrence data stored in the conversion data storage unit 2 in step S5. If the co-occurrence data "性格調査" (personality survey) is matched in step 9C-3, in step 9C-5 that co-occurrence data is stored in the conversion data storage unit 2 as the applicable data, and processing ends.

[0055] As a result, whereas a conventional kana-kanji conversion device that uses the first candidates converts the input kana character string "せいかくをちょうさする" into "正確を調査する" or the like, the kana-kanji conversion device of this embodiment uses the co-occurrence data for "せいかく" and "ちょうさ" and can convert it into "性格を調査する" (to investigate a personality).
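Steps 9C-1 through 9C-5 can be sketched as follows. This is a simplified illustration: the regular expression only checks the surface pattern "A を B する" and omits the noun check of step 9C-1, and the compound table and function names are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# Surface pattern for "A を B する". Assumption: A is the shortest
# prefix before the first を; no part-of-speech check is performed.
_PATTERN = re.compile(r"(.+?)を(.+)する")


def split_a_wo_b_suru(kana: str) -> Optional[Tuple[str, str]]:
    """Step 9C-1 (simplified): return (A, B) or None if no match."""
    m = _PATTERN.fullmatch(kana)
    return (m.group(1), m.group(2)) if m else None


def convert_verb_phrase(kana: str,
                        compounds: Dict[str, Tuple[str, str]]) -> Optional[str]:
    """Steps 9C-2 to 9C-5: look up "AB" and reuse its kanji for A and B."""
    parts = split_a_wo_b_suru(kana)
    if parts is None:
        return None
    a, b = parts
    hit = compounds.get(a + b)          # step 9C-3: collate "AB"
    if hit is None:                     # step 9C-4: no applicable data
        return None
    kanji_a, kanji_b = hit              # step 9C-5: use the matched entry
    return f"{kanji_a}を{kanji_b}する"


# Hypothetical co-occurrence table keyed by the reading of "AB".
COMPOUNDS = {"せいかくちょうさ": ("性格", "調査")}

converted = convert_verb_phrase("せいかくをちょうさする", COMPOUNDS)
```

A fuller version would confirm through the conversion dictionary that both A and B can be nouns before generating "AB".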

[0056] In this embodiment the data reference methods have been described separately as distinct embodiments, but a plurality of data reference methods may also be used together in a single embodiment.
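One possible way to combine the methods is to try them one after another until one yields data. The sketch below assumes that ordering; the specification only states that several methods may be used together, and the function name and callable-based interface are illustrative.

```python
from typing import Callable, Iterable, List


def refer_with_fallbacks(initial: Callable[[], List],
                         fallbacks: Iterable[Callable[[], List]]) -> List:
    """Run the initial reference, then each fallback until one hits."""
    hits = initial()
    if hits:
        return hits
    for method in fallbacks:
        # Each callable stands for one changed collation method (step S9).
        hits = method()
        if hits:
            return hits
    return []
```

Each callable would wrap one of the three embodiments with its arguments already bound, for example via `functools.partial`.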

[0057]

[Effects of the invention] According to the present invention, data that was not referred to in the first co-occurrence data reference can be obtained as co-occurrence data and used to determine conversion candidates, so conversion accuracy can be improved.

[Brief description of the drawings]

[FIG. 1] Functional block diagram of a kana-kanji conversion device according to an embodiment of the present invention

[FIG. 2] Circuit block diagram of a kana-kanji conversion device according to an embodiment of the present invention

[FIG. 3] Flowchart showing the co-occurrence data reference operation according to an embodiment of the present invention

[FIG. 4] Flowchart showing the data reference operation in Embodiment 1 of the present invention

[FIG. 5] Flowchart showing the data reference operation in Embodiment 2 of the present invention

[FIG. 6] Flowchart showing the data reference operation in Embodiment 3 of the present invention

[Explanation of symbols]

1 Input means
2 Conversion data storage unit
3 Dictionary search unit
4 Co-occurrence data reference unit
5 Display means
6 Conversion dictionary
7 Co-occurrence data dictionary
8 Kana-kanji conversion control unit

Claims

[Claims]

1. A kana-kanji conversion device comprising: input means for inputting a kana character string and for instructing the selection of a conversion candidate; a conversion data storage unit that stores the input kana character string, conversion candidates, and the converted kana-kanji character string; a conversion dictionary storing the readings, kanji notations, part-of-speech information, and semantic classification information of Japanese words; a dictionary search unit that searches the conversion dictionary for a designated word; a co-occurrence data dictionary storing co-occurrence data that indicates co-occurrence relations between words; a co-occurrence data reference unit that collates co-occurrence data for a designated word; display means for displaying the input character string, conversion candidates, and conversion results; and a kana-kanji conversion control unit that controls the dictionary search unit, the co-occurrence data reference unit, and the flow of data stored in the conversion data storage unit, wherein, when the co-occurrence data reference unit cannot obtain co-occurrence data for the designated word, it changes the data collation method and refers to the co-occurrence data again.

2. The kana-kanji conversion device according to claim 1, wherein, when the co-occurrence data reference unit cannot obtain co-occurrence data for a noun sequence consisting of a plurality of nouns, it refers to the co-occurrence data again for several combinations of the nouns that make up the noun sequence.

3. The kana-kanji conversion device according to claim 1 or 2, wherein, when the co-occurrence data reference unit cannot obtain co-occurrence data for the designated word, it refers to the semantic classification information and refers again to the co-occurrence data of words having the same semantic classification.

4. The kana-kanji conversion device according to claim 2 or 3, wherein, when the co-occurrence data reference unit cannot obtain co-occurrence data for an input of the "A を B する" type, it refers to the co-occurrence data again for the word "AB" formed by concatenating "A" and "B".