JP2856775B2

JP2856775B2 - Document creation device

Info

Publication number: JP2856775B2
Application number: JP1199286A
Authority: JP
Inventors: 秀紀長崎; 幸弘唐崎; 恒雄宮本; 俊雄宮間; 雅汎岩木
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1989-08-02
Filing date: 1989-08-02
Publication date: 1999-02-10
Anticipated expiration: 2014-02-10
Also published as: JPH0363753A

Description

DETAILED DESCRIPTION OF THE INVENTION
[Object of the Invention]
(Field of Industrial Application) The present invention relates to a document creation device that converts the reading information of a character string into a desired character string and thereby creates a document. In particular, it relates to a function that, among the homophones produced by the conversion, preferentially outputs the one that is appropriate in the sentence.

(Prior Art) A document creation device has a function that converts reading information (a kana string or romaji string entered from a keyboard, voice input, and so on) into a desired character string (a mixture of kanji, katakana, hiragana, and so on). In this conversion, a given reading may correspond to several words with the same reading (homophones), and there may also be several possible positions at which to divide the input into words or phrase segments (bunsetsu).

Example 1: [きのう: 機能 (function) / 昨日 (yesterday) / 帰納 (induction)] [からす: 枯らす / 枯す (to let wither) / 烏 / カラス (crow)]
Example 2: [ひ（火）とは versus ひと（人）は] (火 "fire" followed by とは, versus 人 "person" followed by は)
Example 3: [きょう（今日）は いしゃ（医者）に versus きょう（今日） はいしゃ（歯医者）に] ("today, to the doctor" versus "today, to the dentist")

Because the desired result is therefore not necessarily output on the first attempt, the conversion result is output provisionally, as a candidate that can still be changed. For a word that has homophones (Example 1) or a segment with more than one possible word boundary (Example 2), the document creation device provides a function for selecting the desired one from the other candidates. For a reading string entered without breaks that has more than one possible segment boundary (Example 3), it provides a function for changing the segment boundary and starting reconversion (the segment re-division function).

However, presenting conversion candidates without limit places an excessive burden on the operator, who must then make these changes, and is undesirable. Measures are therefore taken so that the most appropriate conversion candidate possible is output first.

For example, the following methods are used:
(1) Checking whether adjacent words or segments can be connected grammatically, and eliminating inappropriate conversion candidates.
(2) Monitoring how often each homophone is selected and preferentially outputting those used most frequently (the field to which the document being created belongs, such as medicine, law, or patents, may also be taken into account).
(3) Storing the most recently selected homophone, on the grounds that it is likely to be selected again, and outputting it in preference to the other homophones the next time the same reading is converted.
(4) Storing pairs of words that are likely to appear together and outputting such a pair in preference to other homophones (for example, registering the pair 汽車 (train) and 乗（る） (to ride) so that the input 「きしゃにすぐにのれ」 does not yield a result such as 「記者に直ぐに乗れ」, "get on the reporter at once").
(5) Analyzing the meaning of the conversion result and eliminating semantically invalid candidates (for example, semantic analysis that treats 鳴く as correct when the subject is an animal and 泣く as correct when the subject is a person).

The grammatical connection check alone cannot produce an appropriate result unless it is combined with other methods, because more than one valid division point may exist and a single division candidate may still contain several homophones. The frequency-based method and the last-selection method can pull the result too strongly toward a frequent or previously selected word, hiding the other homophones and actually making the result worse. For example, when 「きしゃがきしゃにのる」 is converted, if 記者 (reporter) was selected for the first きしゃ, or if 記者 has a high selection frequency, the invalid result 「記者が記者に乗る」 ("the reporter rides the reporter") is obtained. It is therefore desirable to use the co-occurrence and semantic analysis methods as well. Semantic analysis, however, although widely used in machine translation, increases the processing load and slows conversion when a processor of relatively low capability is used. As a result, current document creation devices often rely on the co-occurrence method to obtain the desired conversion candidate.

Words that are likely to appear as a pair are called "words having a co-occurrence relation". Such a word pair (information on the word that comes first and the word that comes after it) is stored either in the word dictionary memory (the memory consulted during conversion, which holds each word's reading, written form, and additional information such as frequency and grammatical information) or in a table provided separately from the word dictionary. During conversion, the co-occurrence relations of a given word are looked up, and if the paired word is found among the conversion candidates of a later word (not necessarily adjacent), that word is output preferentially. For example, if the co-occurrence relation 「人−泣（く）」 (person, to cry; the parentheses mark a part that may change with conjugation) is registered, then for the reading 「ひとがはげしくなく」 the co-occurrence relation acts on 人 and 泣く to give 「人が激しく泣く」 ("the person cries bitterly"), and an inappropriate result such as 「人が激しく鳴く」 is prevented. Another approach here is to group words (humans, animals, and so on), attach to each word information on the group it belongs to, define co-occurrence as a relation between word groups, and give priority when the groups match.
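The basic co-occurrence lookup described above can be sketched as follows. This is a minimal illustration, not the device's implementation: the table contents, the candidate order for the reading なく, and the function name `prefer_by_cooccurrence` are all assumptions made for the example.

```python
# Hypothetical co-occurrence table: (preceding word, following word).
CO_OCCURRENCE = {("人", "泣く")}


def prefer_by_cooccurrence(first_word, candidates):
    """Move candidates that co-occur with first_word to the front.

    The relative order of the remaining candidates is left unchanged.
    """
    preferred = [c for c in candidates if (first_word, c) in CO_OCCURRENCE]
    others = [c for c in candidates if (first_word, c) not in CO_OCCURRENCE]
    return preferred + others


# Candidates for the reading なく, in an assumed default order.
ranked = prefer_by_cooccurrence("人", ["鳴く", "泣く", "無く"])
print(ranked[0])  # 泣く
```

With 人 already chosen for ひと, the candidate 泣く moves ahead of 鳴く, which is the behaviour the 「人が激しく泣く」 example relies on.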

This alone, however, can give priority to the wrong word. For example, if the co-occurrence relation 「人−泣（く）」 acts on the conversion of 「ひとでなく」, which should yield 「人で無く」 ("not a person"), the incorrect result 「人で泣く」 is output. One countermeasure is to store, when the co-occurrence relation is registered, the connection relation between the two co-occurring words as well, for example the particle (「人−泣（く）−が」 in the example above), and to apply the co-occurrence relation only when the particle also matches.

(Problems to Be Solved by the Invention) However, if the co-occurrence relation is applied only when both the two words registered in the co-occurrence table and the connection between them (for example, a particle) match, each entry can serve only one usage pattern. For example, if 「人−泣（く）−が」 is registered, the co-occurrence relation does not act on 「そのひとまではげしくないている」 ("even that person is crying bitterly") because of the particle まで. To make co-occurrence act reliably, entries such as 「人−が−泣（く）」, 「人−は−泣（く）」, 「人−と−泣（く）」, 「人−まで−泣（く）」, 「人−も−泣（く）」, and so on would all have to be registered, and the number of co-occurrence combinations to register becomes enormous.
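The limitation of exact particle matching can be illustrated with a short sketch. The entry format `(first word, particle, second word)` and the names `registered`, `applies`, and `needed` are assumptions for the example; the particle list is the one enumerated in the text.

```python
PARTICLES = ["が", "は", "と", "まで", "も"]

# Exact matching: one registered entry covers exactly one particle.
registered = {("人", "が", "泣く")}


def applies(first, particle, second):
    """True only when the words and the particle all match an entry."""
    return (first, particle, second) in registered


blocked = applies("人", "で", "泣く")   # False: 人で泣く is not preferred
missed = applies("人", "まで", "泣く")  # False: the まで usage is not covered

# Entries needed to cover every listed particle for this single word pair.
needed = {("人", p, "泣く") for p in PARTICLES}
print(len(needed))  # 5
```

The particle check correctly blocks 「人で泣く」, but it also misses the まで usage, and covering each particle costs one more entry per word pair.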

A further problem arises when two or more co-occurrence relations are registered for words with the same reading, for example 「記者−を−やめ（る）」 ("to quit being a reporter") and 「汽車−を−やめる」 ("to give up on the train"): which one should take priority?

To solve the above problems, an object of the present invention is to provide a document creation device that, when converting reading information into a desired character string by referring to co-occurrence relations between words (or between word concepts such as human, animal, and machine), applies the co-occurrence relations more effectively and can preferentially output the desired candidate from among the homophones.

[Configuration of the Invention]
(Means for Solving the Problems) To achieve the above object, the document creation device of the present invention comprises input means for entering reading information, dictionary storage means in which word character strings are registered so as to be retrievable by reading, and conversion processing means that searches the dictionary storage means on the basis of the reading information entered from the input means and obtains a desired word string. To this configuration it adds storage means that stores information indicating a pair of one word and another word, the default connection pattern of the two words when they appear as a pair in a character string, and connection patterns that can be substituted for the default connection pattern. When the search of the dictionary storage means for the reading information yields a plurality of homonyms, the conversion processing means checks whether the search output contains a word pair that is registered in the storage means and has either the default connection pattern or a connection pattern substitutable for it, and, if so, outputs that word pair in preference to the other homonyms of the two words.

The invention also provides a document creation device in which the storage means stores information indicating a pair of one word and another word, the default connection pattern of the two words when they appear as a pair in a character string, connection patterns that can be substituted for the default connection pattern, and priority information for the plurality of patterns in which the word pair can appear. When the search of the dictionary storage means for the reading information yields a plurality of homonyms, the conversion processing means checks whether the search output contains a word pair that is registered in the storage means and has either the default connection pattern or a connection pattern substitutable for it. If the check finds a plurality of such word pairs, it selects the word pair with the highest priority according to the priority information and outputs it in preference to the other homonyms of those words.

(Operation) With the above configuration, when a plurality of homonyms are output as the result of converting reading information, the document creation device of the present invention can select the most plausible word from among them and output it in preference to the other homonyms.

(Embodiment) First, the basic concept of this embodiment is described. When co-occurrence is used, attention is paid to the connection relation between the two words (the words may also be classified by concept, with co-occurrence defined between concepts). In general, the connection relations between co-occurring words are: particles such as 「を、の、が、に、で、は、と、も、から、へ、な、まで、より、なく」; "concatenation" (連接), a co-occurrence between nominals with no particle between them; and "modification" (修飾), a co-occurrence with no particle in which the preceding word is a declinable word. If these connection relations were used as co-occurrence information as they are, the number of co-occurrence relations to register would grow. Particles that behave similarly are therefore grouped, registered as co-occurrence information by group, and applied in that form at conversion time.

For example, if 「鳥」＋「は」＋「鳴（く）」 is registered in the co-occurrence table, the input 「とりはなく」 refers to this co-occurrence relation and 「鳥は鳴く」 ("the bird sings") is given priority as the conversion candidate. The same co-occurrence relation, however, also applies with particles other than は, as in 「鳥が鳴く」, 「鳥の鳴き（声）」, 「鳥も鳴く」, and 「鳥まで鳴く」. The connection information for the co-occurrence 「鳥」−「鳴（く）」 therefore registers group (classification) information such as (は、が、の、も、まで).

For the particle を, two co-occurrence patterns can be considered: 「名詞＋を＋自動詞」 (noun + を + intransitive verb) and 「名詞＋を＋他動詞」 (noun + を + transitive verb). In the former, an equivalent co-occurrence relation applies even when を is replaced by any of the particles 「の、が、は、も、まで」; in the latter, it applies even when を is replaced by any of 「の、が、は、も、で、まで、から、より、に」. Table 1 shows an example of the particle classification used in this embodiment. The classification draws on the classification of particles by case (nominative, objective, adnominal, and so on), but because the assignment of particles to cases is not always grammatically unique, it has been tuned experimentally to give the best conversion results.
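The grouping idea can be sketched as a lookup from the registered (default) particle to the particles allowed in its place. Only the は group below comes from the 鳥 and 鳴く example in the text; the table layout and the names `SUBSTITUTABLE`, `CO_OCCURRENCE`, and `match_kind` are assumptions, and the actual classification is the one in Table 1.

```python
# Particles that may stand in for the registered (default) particle.
# The は group is taken from the 鳥 - 鳴く example; the real groups are
# defined by Table 1, so this layout is only an assumed illustration.
SUBSTITUTABLE = {
    "は": {"が", "の", "も", "まで"},
}

# Hypothetical co-occurrence entries: word pair -> default particle.
CO_OCCURRENCE = {("鳥", "鳴く"): "は"}


def match_kind(first, particle, second):
    """Classify a candidate as 'exact', 'substitutable', or None."""
    default = CO_OCCURRENCE.get((first, second))
    if default is None:
        return None
    if particle == default:
        return "exact"
    if particle in SUBSTITUTABLE.get(default, set()):
        return "substitutable"
    return None


print(match_kind("鳥", "は", "鳴く"))    # exact
print(match_kind("鳥", "まで", "鳴く"))  # substitutable
print(match_kind("鳥", "で", "鳴く"))    # None
```

A single registered entry now covers five particles, while a particle outside the group still blocks the co-occurrence.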

When words in a co-occurrence relation are registered, information indicating the classification in Table 1 is attached as the connection information between them. When the document creation device of this embodiment converts reading information using co-occurrence, a candidate whose two words and connection relation (particle) match a registered entry exactly is output preferentially according to the co-occurrence relation, as before. In addition, a candidate whose pattern matches one of the permitted substitute connection relations (particles) is also given a certain degree of priority and output. Table 2 shows an example of the priorities used in this embodiment.

In this embodiment, several co-occurrence relations may act on the same reading information (conversion example). This is why the priorities in Table 2 are applied. The priority is computed when the co-occurrence relations are checked, the co-occurrence relation with the higher priority score is chosen, and the corresponding word is selected from the homophones and output as the candidate. In the example of Table 2, priority is set higher for an exact match of the particle (connection form) pattern and for a shorter segment distance over which the co-occurrence acts. The example distinguishes only adjacent from non-adjacent segments, but the scores may be subdivided further according to the number of segments lying between the segments to which the co-occurrence relation applies.
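The priority rule can be sketched as a small scoring table. The numeric scores below are hypothetical, since Table 2 is not reproduced here; only the direction (exact above substitutable, adjacent above non-adjacent) comes from the text, and ranking a non-adjacent exact match above an adjacent substitutable match is a further assumption. The names `SCORES`, `priority`, and `best_hit` are illustrative.

```python
# Hypothetical scores keyed by (match kind, segments are adjacent).
SCORES = {
    ("exact", True): 4,
    ("exact", False): 3,          # assumed to outrank the entry below
    ("substitutable", True): 2,
    ("substitutable", False): 1,
}


def priority(match_kind, gap):
    """Score one hit; gap is the number of segments lying between the
    two words' segments (0 means the segments are adjacent)."""
    return SCORES[(match_kind, gap == 0)]


def best_hit(hits):
    """Pick the highest-scoring hit from (pair, match_kind, gap) tuples."""
    return max(hits, key=lambda hit: priority(hit[1], hit[2]))


hits = [
    (("貴社", "汽車"), "exact", 1),
    (("貴社", "記者"), "exact", 0),
    (("記者", "帰社する"), "substitutable", 0),
]
print(best_hit(hits)[0])  # ('貴社', '記者')
```

Subdividing by the number of intervening segments, as the text allows, would mean replacing the boolean key with the gap itself.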

FIG. 1 is a functional block diagram of a document creation device to which this embodiment is applied.

The processing section (in practice implemented by a CPU, interface hardware, and the programs that control them) comprises: an input control unit 11 that handles input of data (key code data and command data) from the keyboard 110; a kana-kanji conversion unit 13 that converts the reading information received by the input control unit (a kana string in this embodiment) into text containing kanji and katakana; an editing unit 15 that proofreads and edits the document built up from the conversion results (changing or selecting homophones, and deleting, moving, or copying character strings); and a display control unit 17 that controls display of the created document or of the conversion results (candidates).

A reading input buffer 112 is provided under the control of the input control unit 11 and the kana-kanji conversion unit 13. The reading input buffer 112 stores the kana string of reading information received by the input control unit 11; the stored string is analyzed by the kana-kanji conversion unit 13 and converted into text containing kanji and katakana.

Under the control of the screen display unit 17 is a display memory 172 that stores the pattern data shown on the display device (display) 170. The display memory 172 is connected to the CPU, and the dot patterns corresponding to character codes are written into it.

When converting reading information, the kana-kanji conversion unit 13 uses a dictionary 30, a co-occurrence table 31, a particle classification pattern table 32, and a priority rule table 33.

The dictionary 30 consists of (1) an independent-word dictionary that is indexed by reading and stores the independent words (words containing kanji, and katakana words) corresponding to each reading, (2) an attached-word dictionary used when dividing a reading entered without breaks into segments, which are the units of conversion, and (3) a grammar dictionary used to check the grammatical validity of connections between an extracted word and an attached word and between divided segments. Each word in the independent-word dictionary carries additional information such as grammatical information (part of speech, conjugation, and so on, used in the grammar check) and usage frequency. The structure of the dictionary and the basic kana-kanji conversion method for obtaining conversion candidates (homophones) from reading information can follow, for example, the methods described in Nikkei Electronics, 1983.8.29 issue, pp. 180 to 215, the section "Japanese processing" (日本語処理), and its references, so a detailed description is omitted.

The co-occurrence table 31 registers information on two words in a co-occurrence relation (the dictionary number of each word) and information indicating the connection relation between them (the most appropriate connection relation: a particle, concatenation, modification, and so on). The contents of the co-occurrence table may instead be stored in the dictionary 30, by storing with each word in the dictionary 30 a list of the words that co-occur with it and the associated connection information. The information registered in the co-occurrence table 31 may also be information on categories in a co-occurrence relation. For example, 猫 (cat), 犬 (dog), 猿 (monkey), and so on are grouped into the category 動物 (animal), and the entry 『動物−鳴（く）−が』 is registered. In that case each word stored in the dictionary 30 carries information on the category it belongs to, for example 猫−動物, 犬−動物, 猿−動物, 私−人間 (I, human), 彼女−人間 (she, human).

The particle classification pattern table 32 registers the connection relations that can be substituted for those registered in the co-occurrence table, that is, the co-occurrence patterns shown in Table 1 together with their lists of permitted connections. The priority rule table 33 stores how much priority to give to two words found to be in a co-occurrence relation, that is, the scoring information shown in Table 2. The dictionary 30, co-occurrence table 31, particle classification pattern table 32, and priority rule table 33 are stored in a ROM connected to the CPU. If entries may be added later, they may instead be stored in a readable and writable external storage device (magnetic disk).
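The relationship between dictionary entries, categories, and category-level co-occurrence entries can be sketched with two small record types. The field names, the sample words, and the use of written forms instead of dictionary numbers are all assumptions made for readability; the text states only that the table holds dictionary numbers and a connection relation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Word:
    word_id: int                    # dictionary number
    reading: str
    surface: str                    # written form
    pos: str                        # part of speech
    category: Optional[str] = None  # e.g. 動物, 人間


@dataclass(frozen=True)
class CoOccurrence:
    first: str        # a written form or a category name (assumed)
    second: str
    connection: str   # default particle, or concatenation/modification


WORDS = [
    Word(1, "ねこ", "猫", "noun", "動物"),
    Word(2, "いぬ", "犬", "noun", "動物"),
    Word(3, "わたし", "私", "noun", "人間"),
]
TABLE = [CoOccurrence("動物", "鳴く", "が")]


def entry_applies(entry, first_word, second_surface):
    """An entry applies if it names the first word itself or its category."""
    return (entry.second == second_surface
            and entry.first in (first_word.surface, first_word.category))


print(entry_applies(TABLE[0], WORDS[0], "鳴く"))  # True  (猫 is 動物)
print(entry_applies(TABLE[0], WORDS[2], "鳴く"))  # False (私 is 人間)
```

One category-level entry then serves every word tagged with that category, which is the saving the 『動物−鳴（く）−が』 example describes.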

The kana-kanji conversion unit 13 uses a co-occurrence pattern expansion table 130 and a homophone buffer 132 to store conversion results. When the kana-kanji conversion unit 13, using the contents of the co-occurrence table 31, particle classification pattern table 32, and priority rule table 33, selects from the conversion candidates (a plurality of homonyms) those to which a co-occurrence relation can be applied, the co-occurring word pair and its priority score are stored in the co-occurrence pattern expansion table 130. The homophone buffer 132 stores the homonyms that the kana-kanji conversion unit 13 outputs as the conversion result for the reading information in the reading input buffer 112, together with their readings. The conversion result contains both uniquely determined words (those with no homonyms) and words for which homonyms exist. For a uniquely determined word, the kana-kanji conversion unit 13 writes the character code of the word read from the dictionary 30 into the document buffer 180. For a word with homonyms, it writes into the document buffer 180 a code indicating "unconfirmed" and the address in the homophone buffer 132 at which the word's homonyms are stored.
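The split between the document buffer and the homophone buffer can be sketched as follows. The marker value, the list-based buffers, and the function name `emit` are assumptions; the text specifies only that an unconfirmed code and a homophone buffer address are written for words that have homonyms.

```python
UNCONFIRMED = "<unconfirmed>"  # hypothetical marker code

homophone_buffer = []  # each slot holds one word's homonym candidates
document_buffer = []   # confirmed strings, or (UNCONFIRMED, address)


def emit(candidates):
    """Write one converted word into the document buffer."""
    if len(candidates) == 1:
        # Uniquely determined: store the word itself.
        document_buffer.append(candidates[0])
    else:
        # Homonyms exist: store them separately and keep only a pointer.
        homophone_buffer.append(list(candidates))
        document_buffer.append((UNCONFIRMED, len(homophone_buffer) - 1))


emit(["その"])
emit(["記者", "汽車", "貴社", "帰社"])
print(document_buffer)  # ['その', ('<unconfirmed>', 0)]
```

Keeping only an address in the document buffer lets the displayed candidate change later without rewriting the document text around it.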

The editing unit 15 is a program that proofreads and edits the contents of the document buffer 180 in response to command input from the keyboard 110 (codes entered by pressing function keys such as move, copy, and character decoration). To display the part of the document being edited, the editing unit 15 reads a character code string from the document buffer 180 and passes it to the screen display unit 17. When it reads the unconfirmed code from the document buffer 180, it uses the accompanying address to access the homophone buffer 132, selects one of the homonyms, and passes that character string's codes to the screen display unit 17. Each entry in the homophone buffer 132 has a bit indicating the word to display; within a word's group of homonyms, this bit is on for the word to be displayed as the candidate. The editing unit 15 checks this bit to choose the word to display. When the editing unit 15 passes the character codes of a candidate word to the screen display unit 17, it also attaches information that the word is unconfirmed and instructs the screen display unit 17 to display it differently from confirmed words (for example overlined, in reverse video, or blinking). Further, when the keyboard 110 issues an instruction to display the next homonym candidate (the [Next candidate] key is pressed), the editing unit 15 displays the next candidate for the word at the cursor (the editing unit manages the row and column position of the cursor). It does so by turning off the bit that is currently on in the homophone buffer 132, turning on the bit of the homonym next in rank, and repeating the display processing described above.
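The next-candidate operation on the display bits can be sketched as below. Wrapping around to the first candidate after the last one is an assumption, since the text does not say what happens at the end of the list; the function name `next_candidate` is also illustrative.

```python
def next_candidate(display_bits):
    """Turn off the bit that is on, turn on the next one, and return
    the index of the candidate that should now be displayed."""
    current = display_bits.index(True)
    display_bits[current] = False
    following = (current + 1) % len(display_bits)  # assumed wrap-around
    display_bits[following] = True
    return following


candidates = ["記者", "汽車", "貴社", "帰社"]
bits = [True, False, False, False]
shown = candidates[next_candidate(bits)]
print(shown)  # 汽車
```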

The screen display unit 17 generates character patterns from the character codes sent by the editing unit 15 and the decoration instructions attached to them (underline, overline, reverse video, enlargement, and so on) and stores them in the display memory 172. The basic character patterns are held in a ROM (not shown) and can be read from the ROM by code. The contents of the display memory 172 are shown on the display 170.

Next, the operation of the kana-kanji conversion unit 13 when it uses co-occurrence relations is described with reference to the flowchart of FIG. 2.

Suppose, for example, that the unsegmented kana string 「きしゃのきしゃがきしゃにきしゃした」 is entered as reading information and conversion is requested (by pressing the [Convert] key on the keyboard 110) (step 10). As described above, the reading string is stored in the buffer 112.

On receiving the conversion request, the kana-kanji conversion unit 13 divides the reading string into segments according to the contents of the dictionary 30 and converts it into text containing kanji (step 20). This conversion is the same as in conventional devices, so a detailed description is omitted. Here the string is divided as 「きしゃの(1)／きしゃが(2)／きしゃに(3)／きしゃした(4)」 (／ marks a segment boundary; the numbers are added to distinguish the words in this explanation), and for words (1) to (4) the homonyms are read from the independent-word dictionary in the dictionary 30 and stacked in the homophone buffer 132. For the independent word きしゃ in each of (1) to (3), the noun homonyms 「帰社、貴社、記者、汽車」 (returning to the office, your company, reporter, train) are stacked as provisional conversion results. For 「きしゃした」 in (4), the stacked homonyms are 「帰社（し）（た）」, the combination of 帰社し (the continuative form of the suru-verb 帰社する) and the past-tense auxiliary た, and the compounds 「貴社下、帰社下、記者下、汽車下」 formed with the noun きしゃ when した is analyzed as the noun 下.

次に、かな漢字変換部13は、共起テーブル31を検索
し、共起テーブル31に登録されている共起関係単語のペ
アが前記変換結果の中に存在するか否かをチェックする
（ステップ30）。共起テーブル31には、イ）貴社（名詞）「の」記者（名詞）ロ）貴社（名詞）「の」汽車（名詞）ハ）貴社（名詞）「に」帰社（する）（自動詞）ニ）記者（名詞）「が」帰社（する）（自動詞）ホ）汽車（名詞）「で」帰社（する）（自動詞）の共起関係が登録されていたとする。同音異義語の中に
存在する共起関係にある単語ペアは、 1.貴社と記者、2.貴社と記者、 3.貴社と記者、4.貴社と汽車、 5.貴社と汽車、6.貴社と汽車、 7.貴社と帰社（し）、 8.記者と帰社（し）、 9.汽車と帰社（し）、 10.貴社と記者、11.貴社と記者、 12.貴社と汽車、13.貴社と汽車、 14.貴社と帰社（し）、 15.記者と帰社（し）、 16.汽車と帰社（し）、 17.貴社と記者、18.貴社と汽車、 19.貴社と帰社（し）、 20.記者と帰社（し）、 21.汽車と帰社（し）、の21とおりが存在するので、これらを共起パターン展開
テーブル130中に格納していく。この手順は以下のよう
にかな漢字変換部180により実行されていく。まず、第
１の文節中の単語の同音異義語中に共起ペアの前方の単
語として登録されているものがあるかどうかをチェック
する。登録されているものがあった場合、それ以降の文
節中（本実施例では、候補が膨大にならないように３つ
後ろの文節まで）の単語の同音異義語に対して、前記前
方の単語とペアをなす単語が存在するかどうかチェック
する。Next, the kana-kanji conversion unit 13 searches the co-occurrence table 31 and checks whether any pair of co-occurring words registered in the co-occurrence table 31 is present in the conversion result (step 30). Suppose that the following co-occurrence relations are registered in the co-occurrence table 31:
a) 貴社 (your company; noun) 「の」 記者 (reporter; noun)
b) 貴社 (noun) 「の」 汽車 (train; noun)
c) 貴社 (noun) 「に」 帰社（する） (to return to the office; intransitive verb)
d) 記者 (noun) 「が」 帰社（する） (intransitive verb)
e) 汽車 (noun) 「で」 帰社（する） (intransitive verb)
Among the homonyms there are 21 word pairs that stand in a co-occurrence relation. The same two words appear more than once because each entry pairs a different combination of word positions in the sentence: 1. 貴社 and 記者, 2. 貴社 and 記者, 3. 貴社 and 記者, 4. 貴社 and 汽車, 5. 貴社 and 汽車, 6. 貴社 and 汽車, 7. 貴社 and 帰社（し）, 8. 記者 and 帰社（し）, 9. 汽車 and 帰社（し）, 10. 貴社 and 記者, 11. 貴社 and 記者, 12. 貴社 and 汽車, 13. 貴社 and 汽車, 14. 貴社 and 帰社（し）, 15. 記者 and 帰社（し）, 16. 汽車 and 帰社（し）, 17. 貴社 and 記者, 18. 貴社 and 汽車, 19. 貴社 and 帰社（し）, 20. 記者 and 帰社（し）, 21. 汽車 and 帰社（し）. These pairs are stored one by one in the co-occurrence pattern development table 130. The kana-kanji conversion unit 180 carries out this procedure as follows. First, it checks whether any homonym of the word in the first phrase is registered as the front word of a co-occurrence pair. If one is, it checks whether the homonyms of the words in the subsequent phrases (in this embodiment, up to three phrases later, so that the number of candidates does not become enormous) include a word that forms a pair with that front word.
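The pair search described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the device's implementation: the data layout, the part-of-speech tags, and the names `PHRASES`, `COOCCURRENCE_TABLE`, and `enumerate_pairs` are hypothetical, and word positions are counted from 1.

```python
# Candidate homonyms per phrase as (surface, part of speech). Layout is hypothetical.
NOUNS = [("帰社", "noun"), ("貴社", "noun"), ("記者", "noun"), ("汽車", "noun")]
PHRASES = [
    NOUNS,  # きしゃ(の)
    NOUNS,  # きしゃ(が)
    NOUNS,  # きしゃ(に)
    # きしゃした: the verb reading plus the noun readings found in the "...下" compounds
    [("帰社", "verb"), ("貴社", "noun"), ("帰社", "noun"), ("記者", "noun"), ("汽車", "noun")],
]

# Entries a) to e): (front noun, registered particle, back word, back part of speech).
COOCCURRENCE_TABLE = [
    ("貴社", "の", "記者", "noun"),
    ("貴社", "の", "汽車", "noun"),
    ("貴社", "に", "帰社", "verb"),
    ("記者", "が", "帰社", "verb"),
    ("汽車", "で", "帰社", "verb"),
]


def enumerate_pairs(phrases, table, window=3):
    """Return (front_pos, front, back_pos, back, registered_particle) tuples."""
    pairs = []
    for i, candidates in enumerate(phrases):
        for front, particle, back, back_pos_tag in table:
            # The front word must be one of the homonyms of phrase i.
            if (front, "noun") not in candidates:
                continue
            # Look at most `window` phrases ahead to keep the candidate set small.
            last = min(i + window, len(phrases) - 1)
            for j in range(i + 1, last + 1):
                if (back, back_pos_tag) in phrases[j]:
                    pairs.append((i + 1, front, j + 1, back, particle))
    return pairs


pairs = enumerate_pairs(PHRASES, COOCCURRENCE_TABLE)
print(len(pairs))
```

With these inputs the function finds the same number of pairs as the example, 21, in the same order as the listing.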

これらペア（21とおり）の各々を共起パターン展開テ
ーブル130に格納する際に、ペアをなす単語の間の「つ
なぎの関係」が一致しているかどうかがチェックされる
（ステップ40）。「つなぎの関係」が一致しているもの
については“1"のフラグ情報が付加される（ステップ4
1）。「つなぎの関係」が一致しないペアについては助
詞分類パターンテーブル32をアクセスして（ステップ4
3）、「許容できる接続関係」になっているかどうかが
チェックされる（ステップ45）。「許容できる接続関
係」である場合は“2"のフラグを付し（ステップ47）、
そうでなければ“3"のフラグを付し（ステップ49）、当
該ペアをに付加される。そして、かな漢字変換部13は、
都度、前記フラグの数値に従って優先規則テーブル33を
アクセス（ステップ50）し、規定された優先規則に基づ
いて優先点数をつけていく（ステップ60）。この結果を
共起パターン展開テーブル130に格納する（ステップ8
0）。以上の動作をし、存在すればこれを共起パターン
展開テーブル130に格納する。そして、第２の文節中の
単語、第３の文節中の単語と順番に上記動作を繰り返し
て（ステップ90、ステップ100）、全ての組み合わせ
（上記例では21とおり）を共起パターン展開テーブル13
0に格納する。文末まで解析が終了したとすると、上記
例では、結果として、 1.貴社と記者１−70/完一、体言、隣接、 2.貴社と記者１−50/完一、体言、不隣、 3.貴社と記者１−50/完一、体言、不隣、 4.貴社と汽車１−70/完一、体言、不隣、 5.貴社と汽車１−50/完一、体言、不隣、 6.貴社と汽車１−50/完一、体言、不隣、 7.貴社と帰社（し）３−10/不一、用言、不隣、 8.記者と帰社（し）３−10/不一、用言、不隣、 9.汽車と帰社（し）３−10/不一、用言、不隣、 10.貴社と記者３−20/不一、体言、隣接、 11.貴社と記者３−10/不一、体言、不隣、 12.貴社と汽車３−10/不一、体言、不隣、 13.貴社と汽車３−10/不一、体言、不隣、 14.貴社と帰社（し）２−40/許容、用言、不隣、 15.記者と帰社（し）１−50/完一、用言、不隣、 16.汽車と帰社（し）３−10/不一、用言、不隣、 17.貴社と記者３−20/不一、体言、隣接、 18.貴社と汽車３−20/不一、体言、隣接、 19.貴社と帰社（し）１−70/許容、用言、隣接、 20.記者と帰社（し）３−20/不一、用言、隣接、 21.汽車と帰社（し）３−20/不一、用言、隣接、という21とおりの共起関係の同音異義語ペアが優先度点
数とともに展開テーブル130に登録されたことになる。When each of these 21 pairs is stored in the co-occurrence pattern development table 130, it is checked whether the connection between the two words of the pair matches the registered one (step 40). Pairs whose connection matches are given the flag "1" (step 41). For pairs whose connection does not match, the particle classification pattern table 32 is accessed (step 43) and it is checked whether the connection is an acceptable one (step 45). If it is acceptable, the flag "2" is attached (step 47); otherwise the flag "3" is attached (step 49). Each time, the kana-kanji conversion unit 13 then accesses the priority rule table 33 according to the flag value (step 50) and assigns a priority score based on the prescribed priority rules (step 60). The result is stored in the co-occurrence pattern development table 130 (step 80). This operation is performed for the word in the first phrase, and any pair found is stored in the co-occurrence pattern development table 130. The operation is then repeated in turn for the word in the second phrase and the word in the third phrase (steps 90 and 100) until all combinations (21 in the above example) are stored in the co-occurrence pattern development table 130. Once the analysis has reached the end of the sentence, the result in the above example is as follows. Each entry reads flag-score / connection match, word class, adjacency, where 完一 is an exact match, 許容 is acceptable, 不一 is a mismatch, 体言 is a nominal, 用言 is a predicate, 隣接 is adjacent, and 不隣 is non-adjacent:
1. 貴社 and 記者: 1-70 / exact match, nominal, adjacent
2. 貴社 and 記者: 1-50 / exact match, nominal, non-adjacent
3. 貴社 and 記者: 1-50 / exact match, nominal, non-adjacent
4. 貴社 and 汽車: 1-70 / exact match, nominal, non-adjacent
5. 貴社 and 汽車: 1-50 / exact match, nominal, non-adjacent
6. 貴社 and 汽車: 1-50 / exact match, nominal, non-adjacent
7. 貴社 and 帰社（し）: 3-10 / mismatch, predicate, non-adjacent
8. 記者 and 帰社（し）: 3-10 / mismatch, predicate, non-adjacent
9. 汽車 and 帰社（し）: 3-10 / mismatch, predicate, non-adjacent
10. 貴社 and 記者: 3-20 / mismatch, nominal, adjacent
11. 貴社 and 記者: 3-10 / mismatch, nominal, non-adjacent
12. 貴社 and 汽車: 3-10 / mismatch, nominal, non-adjacent
13. 貴社 and 汽車: 3-10 / mismatch, nominal, non-adjacent
14. 貴社 and 帰社（し）: 2-40 / acceptable, predicate, non-adjacent
15. 記者 and 帰社（し）: 1-50 / exact match, predicate, non-adjacent
16. 汽車 and 帰社（し）: 3-10 / mismatch, predicate, non-adjacent
17. 貴社 and 記者: 3-20 / mismatch, nominal, adjacent
18. 貴社 and 汽車: 3-20 / mismatch, nominal, adjacent
19. 貴社 and 帰社（し）: 1-70 / acceptable, predicate, adjacent
20. 記者 and 帰社（し）: 3-20 / mismatch, predicate, adjacent
21. 汽車 and 帰社（し）: 3-20 / mismatch, predicate, adjacent
In this way, 21 homonym pairs in co-occurrence relations are registered in the development table 130 together with their priority scores.
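The flagging and scoring of steps 40 to 60 can be sketched in Python as follows. The contents of the particle classification pattern table 32 and the priority rule table 33 are not reproduced here, so `ACCEPTABLE` and `PRIORITY` are hypothetical stand-ins whose values are inferred from the worked examples in this description (the 60-point entry comes from the 先生が発表した example). The sketch also ignores the nominal/predicate distinction.

```python
# Hypothetical stand-in for table 32: registered particle -> acceptable substitutes.
ACCEPTABLE = {"に": {"が"}, "は": {"が"}}

# Hypothetical stand-in for table 33: (flag, adjacent?) -> priority score.
PRIORITY = {
    (1, True): 70, (1, False): 50,   # connection matches exactly
    (2, True): 60, (2, False): 40,   # connection is acceptable
    (3, True): 20, (3, False): 10,   # connection does not match
}


def connection_flag(registered, actual):
    """Return 1 for an exact match, 2 for an acceptable particle, 3 otherwise."""
    if actual == registered:
        return 1
    if actual in ACCEPTABLE.get(registered, set()):
        return 2
    return 3


def priority_score(registered, actual, front_pos, back_pos):
    """Return (flag, score) for a pair found at the given phrase positions."""
    flag = connection_flag(registered, actual)
    adjacent = (back_pos - front_pos) == 1
    return flag, PRIORITY[(flag, adjacent)]


# 貴社(の)記者 in phrases 1 and 2, registered with 「の」
print(priority_score("の", "の", 1, 2))
# 貴社(が) ... 帰社 in phrases 2 and 4, registered with 「に」
print(priority_score("に", "が", 2, 4))
```

Keeping the flag separate from the score mirrors the two table lookups in the flowchart, so either table could be replaced without touching the other.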

各単語の同音異義語毎にそれが該当しているペア（前
でも後ろでもかまわない）の点数を累積していくと、・単語について、「貴社」→350、「記者」→10、
「汽車」→10の得点、・単語について、「記者」→120、「汽車」→10 ・単語について、「貴社」→110、「記者」→90、
「汽車」→80 ・単独について、「帰社（し）」→240、「記者」→8
0、「汽車」→80 の結果が得られ、各々の単語の同音異義語の中で最も優
先度の高ものが第１候補として選択される。即ち、同音
語バッファ132に同音異義語群をスタックする際、選択
された候補のビットがオンされ、１番目に表示がなされ
るようにされる（ステップ110。尚、この場合、最初に
スタックされた同音異義語群の順番を優先度順で並べ変
えても良い。結果、「貴社の記者が貴社に帰社した」と
変換できる。この結果は、文法的にも、意味的にも確か
らしいものとなっており、オペレータが不要な［次候
補］キーの操作をして所望の変換結果を得なくとも良い
ことを示している。Accumulating, for each homonym of each word, the scores of the pairs it belongs to (whether as the front or the back word) gives the following results: for the first word, 貴社 → 350, 記者 → 10, 汽車 → 10; for the second word, 記者 → 120, 汽車 → 10; for the third word, 貴社 → 110, 記者 → 90, 汽車 → 80; for the fourth word, 帰社（し） → 240, 記者 → 80, 汽車 → 80. For each word, the homonym with the highest priority is selected as the first candidate. That is, when the homonym group is stacked in the homophone buffer 132, the bit of the selected candidate is turned on so that it is displayed first (step 110). Alternatively, the homonyms initially stacked may be re-sorted in order of priority. As a result, the string is converted to 「貴社の記者が貴社に帰社した」 ("Your company's reporter returned to your company"). This result is plausible both grammatically and semantically, which shows that the operator does not need to press the [次候補] (next candidate) key unnecessarily to obtain the desired conversion result.
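The accumulation and first-candidate selection can be sketched in Python as follows. The scores are copied from the 21-entry listing; the word positions attached to each pair are a reconstruction from the search order (an assumption, since the listing itself does not show them), and the names `SCORED_PAIRS`, `accumulate`, and `first_candidates` are hypothetical. Totals are computed directly from the listed pair scores, so they need not agree with every total printed in the text.

```python
from collections import defaultdict

# (front position, front word, back position, back word, score); positions are assumed.
SCORED_PAIRS = [
    (1, "貴社", 2, "記者", 70), (1, "貴社", 3, "記者", 50), (1, "貴社", 4, "記者", 50),
    (1, "貴社", 2, "汽車", 70), (1, "貴社", 3, "汽車", 50), (1, "貴社", 4, "汽車", 50),
    (1, "貴社", 4, "帰社", 10), (1, "記者", 4, "帰社", 10), (1, "汽車", 4, "帰社", 10),
    (2, "貴社", 3, "記者", 20), (2, "貴社", 4, "記者", 10),
    (2, "貴社", 3, "汽車", 10), (2, "貴社", 4, "汽車", 10),
    (2, "貴社", 4, "帰社", 40), (2, "記者", 4, "帰社", 50), (2, "汽車", 4, "帰社", 10),
    (3, "貴社", 4, "記者", 20), (3, "貴社", 4, "汽車", 20),
    (3, "貴社", 4, "帰社", 70), (3, "記者", 4, "帰社", 20), (3, "汽車", 4, "帰社", 20),
]


def accumulate(pairs):
    """Add each pair's score to both of its members, keyed by (position, word)."""
    totals = defaultdict(int)
    for front_pos, front, back_pos, back, score in pairs:
        totals[(front_pos, front)] += score
        totals[(back_pos, back)] += score
    return totals


def first_candidates(totals):
    """Pick, for every word position, the homonym with the highest total."""
    best = {}
    for (pos, word), total in totals.items():
        if pos not in best or total > best[pos][1]:
            best[pos] = (word, total)
    return {pos: word for pos, (word, _) in best.items()}


totals = accumulate(SCORED_PAIRS)
chosen = first_candidates(totals)
print("".join(chosen[p] + s for p, s in zip((1, 2, 3, 4), ("の", "が", "に", "した"))))
```

Under these assumptions the selected candidates spell out 貴社の記者が貴社に帰社した, the conversion result given in the text.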

さらに、別の例で本実施例を検証する。 Further, the present embodiment will be verified with another example.

「せんせいがはっぴょうした」という読み文字列が入
力され、「せんせい（が）／はっぴょう（し）
（た）」と変換されたものとする。また、共起テーブル
31には、先生「は」発表（する）−自動詞の共起関係のみが登録されているとする。Suppose that the reading character string 「せんせいがはっぴょうした」 (sensei ga happyou shita) is input and is segmented as 「せんせい（が）／はっぴょう（し）（た）」. Suppose also that the only co-occurrence relation registered in the co-occurrence table 31 is 先生 (teacher) 「は」 発表（する） (to present; intransitive verb).

先生＋が＋発表した「接続関係」は共起テーブルの助詞「は」と完全に一
致していないが、許容できる接続関係にグループに
『が』が入っているので、隣接していて助詞も許容され
ていると判断され、優先点数は、用言であるので、60点
が付けられる。今回は、他に共起テーブルを用いて優先
点数をつける組み合わせは生じないので、他の同音異義
語には全く優先点数が付けられていない。The connection in 先生＋が＋発表した does not exactly match the particle 「は」 registered in the co-occurrence table, but 「が」 belongs to the group of acceptable connections. The pair is therefore judged to be adjacent with an acceptable particle, and because it involves a predicate it is given a priority score of 60 points. In this case no other combination receives a priority score from the co-occurrence table, so the other homonyms have no priority score at all.

従って、先制＋が＋発表した専制＋が＋発表した宣誓＋が＋発表したよりも優先して、『先生＋が＋発表した』と出力され
る。これも文法的、意味的に確からしい変換結果であ
る。Accordingly, 『先生＋が＋発表した』 is output in preference to 先制 (preemption) ＋が＋発表した, 専制 (despotism) ＋が＋発表した, and 宣誓 (oath) ＋が＋発表した. This too is a grammatically and semantically plausible conversion result.

前記実施例において、辞書30中において、読みに対応
して登録されている単語各々に、この単語と対をなす単
語とその接続情報とを加えて記憶させておけば、前記共
起テーブル31を省略することもできる。In the above embodiment, the co-occurrence table 31 can also be omitted if each word registered in the dictionary 30 under its reading is stored together with the words it pairs with and their connection information.

尚、優先度の点数が同じ候補が出た場合には、使用頻
度等別の尺度を加味して順序を決めても良い。また、あ
る点数以下の候補は排除するようにすることも可能であ
る。この場合は、オペレータが点数のしきい値を設定で
きるようにしても良い。When several candidates have the same priority score, their order may be decided by also taking another measure into account, such as frequency of use. Candidates at or below a certain score may also be excluded. In that case, the operator may be allowed to set the score threshold.
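A minimal Python sketch of this tie-breaking and threshold filtering follows. The function name `rank_candidates`, its parameters, and the frequency counts are hypothetical; the description only says that a measure such as frequency of use may be taken into account.

```python
def rank_candidates(scores, frequency, threshold=None):
    """Order candidates by priority score, breaking ties by frequency of use.

    Candidates whose score is at or below `threshold` are dropped.
    """
    kept = {
        word: score
        for word, score in scores.items()
        if threshold is None or score > threshold
    }
    # Higher score first; among equal scores, higher frequency first.
    return sorted(kept, key=lambda w: (-kept[w], -frequency.get(w, 0)))


scores = {"帰社": 240, "記者": 80, "汽車": 80}
frequency = {"記者": 5, "汽車": 9}  # illustrative usage counts, not from the source

print(rank_candidates(scores, frequency))
print(rank_candidates(scores, frequency, threshold=80))
```

Passing the threshold as a parameter corresponds to letting the operator set it.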

更に、単語が属するカテゴリ間で共起関係を活用する
場合には、ステップ20の段階で同音異義語として出力さ
れた各単語の分類コードを辞書30から得れば良い。Furthermore, when co-occurrence relations are applied between the categories to which words belong, the classification code of each word output as a homonym in step 20 can be obtained from the dictionary 30.

また、優先度点数の付与を更に細分化し、文節の離れ
た具合、用語に対する体言の深層格（主格、所有格、目
的格等）への適合性、「接続関係」の類似（許容）の度
合いを加味して点数付けを変えても良い。The assignment of priority scores may also be made finer, varying the score according to how far apart the phrases are, how well the nominal fits a deep case (nominative, possessive, objective, etc.) with respect to the term, and the degree of similarity (acceptability) of the connection.

［発明の効果］本発明によれば、登録された共起関係の単語間につい
て、当該単語間の接続関係の類似性を加味してその共起
関係を適用するので、実際に登録されている共起関係の
数よりも多くのパターンについて最適な変換結果を得る
ことができる。[Effects of the Invention] According to the present invention, a registered co-occurrence relation between words is applied while taking into account the similarity of the connection between those words, so optimal conversion results can be obtained for more patterns than the number of co-occurrence relations actually registered.

また、共起関係の適用に優先順位を付けているので、
複数の共起関係が同時に変換結果に影響を及ぼしたと
き、その中で最適なものを変換結果として出力すること
が可能となる。In addition, because the application of co-occurrence relations is prioritized, when several co-occurrence relations affect the conversion result at the same time, the most suitable of them can be output as the conversion result.

また、隣接した単語間だけでなく、離れた単語間にも
共起関係が適用できるので、対象単語間に修飾語等が存
在しても適格に共起関係を適用することができる。Further, since the co-occurrence relation can be applied not only between adjacent words but also between distant words, the co-occurrence relation can be appropriately applied even if a modifier or the like exists between the target words.

[Brief description of the drawings]

第１図は本実施例を適用した文書作成装置の機能ブロッ
ク図であり、第２図は実施例におけるかな漢字変換部動
作を示すフローチャートである。 13……かな漢字変換部 31……共起テーブル 32……助詞分類パターンテーブル 33……優先規則テーブル 130……共起パターン展開テーブルFIG. 1 is a functional block diagram of a document creation device to which the embodiment is applied, and FIG. 2 is a flowchart showing the operation of the kana-kanji conversion unit in the embodiment. 13: kana-kanji conversion unit; 31: co-occurrence table; 32: particle classification pattern table; 33: priority rule table; 130: co-occurrence pattern development table.

フロントページの続き (72)発明者宮間俊雄東京都青梅市末広町２丁目９番地株式会社東芝青梅工場内 (72)発明者岩木雅汎東京都青梅市末広町２丁目９番地株式会社東芝青梅工場内 (56)参考文献特開平１−229367（ＪＰ，Ａ) 特開昭59−2125（ＪＰ，Ａ) 特開昭60−124774（ＪＰ，Ａ) 特開昭61−75467（ＪＰ，Ａ) 特開昭63−98068（ＪＰ，Ａ) (58)調査した分野(Int.Cl.⁶，ＤＢ名) G06F 17/21Continued on the front page (72) Inventor Toshio Miyama 2-9-9 Suehirocho, Ome City, Tokyo Inside the Toshiba Ome Plant Co., Ltd. (72) Inventor Masanori Iwaki 2-9-9 Suehirocho Ome City, Tokyo Toshiba Ome Plant Co., Ltd. (56) References JP-A 1-229367 (JP, A) JP-A 59-2125 (JP, A) JP-A 60-124774 (JP, A) JP-A 61-75467 (JP, A) JP-A-63-98068 (JP, A) (58) Fields investigated (Int. Cl. ⁶ , DB name) G06F 17/21

Claims

(57) [Claims]

1. A document creation device comprising: input means for inputting reading information; dictionary storage means in which word character strings are registered so as to be searchable by their readings; and conversion processing means for obtaining a desired word string by searching the dictionary storage means based on the reading information input from the input means, the device further comprising storage means for storing information indicating a pair consisting of a given word and another word, a default connection pattern between the two words when they appear as a pair in a character string, and connection patterns that can substitute for the default connection pattern, wherein, when a search of the dictionary storage means for the reading information shows that a plurality of homonyms exist, the conversion processing means checks whether the search output contains a word pair registered in the storage means that has the default connection pattern or a connection pattern that can substitute for it, and, when such a pair exists, outputs the two words of the pair in preference to their other homonyms.

2. A document creation device comprising: input means for inputting reading information; dictionary storage means in which word character strings are registered so as to be searchable by their readings; and conversion processing means for obtaining a desired word string by searching the dictionary storage means based on the reading information input from the input means, the device further comprising storage means for storing information indicating a pair consisting of a given word and another word, a default connection pattern between the two words when they appear as a pair in a character string, connection patterns that can substitute for the default connection pattern, and priority information for the plurality of patterns in which the word pair can appear, wherein, when a search of the dictionary storage means for the reading information shows that a plurality of homonyms exist, the conversion processing means checks whether the search output contains a word pair registered in the storage means that has the default connection pattern or a connection pattern that can substitute for it, and, when the check finds a plurality of such word pairs, determines the word pair with the higher priority according to the priority information and outputs that word pair in preference to the other homonyms of its words.