JPH08180058A

JPH08180058A - Compound word retrieval device

Info

Publication number: JPH08180058A
Application number: JP6322938A
Authority: JP
Inventors: Katsumi Tokuda; 克己徳田; Shoichi Aoyama; 昇一青山; Ryuichi Shiomi; 隆一塩見
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1994-12-26
Filing date: 1994-12-26
Publication date: 1996-07-12

Abstract

PURPOSE: To provide a compound word retrieval device that, when two or more of the words constituting a compound word in a dictionary match words in an input sentence, outputs the multiple retrieval results in descending order of certainty according to a prescribed procedure. CONSTITUTION: A word retrieval part 105 divides an input sentence into words and obtains, for each word, the compound word numbers of the compound words that contain it. When two or more words share the same compound word number, a compound word retrieval part 107 outputs the corresponding compound word to a compound word tentative storage part 108 as a candidate. A compound word aligning part 109 rearranges the compound words stored in the compound word tentative storage part 108 in order of certainty by the prescribed procedure, and an output part 110 outputs the compound words and their translations in that order.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a compound word retrieval device that retrieves compound words, each composed of a plurality of words, from a character string.

[0002]

[Prior Art] In translating English sentences into Japanese, the proper translation of compound words is an important issue. Here, a compound word means a set phrase or idiom composed of two or more words. Known compound word retrieval devices include the following. The first prior art is disclosed in Japanese Patent Laid-Open No. 63-89975. It assigns priorities according to fixed rules when a plurality of retrieved compound words share the same word. The second prior art is disclosed in Japanese Patent Laid-Open No. 1-266671. It classifies compound words by the leading character or pronunciation of their constituent words, thereby narrowing down the candidates at retrieval time to speed up the search.

[0003]

[Problems to Be Solved by the Invention] In the two compound word retrieval devices described above, a particular compound word can be retrieved only if all of the words that constitute it, as registered in the dictionary, are contained in the input sentence. Consequently, if, for example, the compound word "in one's idiomatic fashion" (Japanese translation: 「独特の流儀で」, "in one's own distinctive manner") is registered in the dictionary and the input sentence contains "in his idiomatic fashion", the compound word cannot be retrieved because "one's" and "his" do not match. Moreover, bilingual example sentences in generally published dictionaries are, by their nature as examples, organized in units of whole sentences rather than in units of sentence fragments such as compound words. When compound word retrieval is performed with reference to such bilingual example sentences, the devices described above are therefore of almost no use and are markedly inconvenient.

[0004] Furthermore, performing compound word retrieval by partial word matching with reference to bilingual example sentences requires a technique for selecting the best candidate quickly, but no such technique currently exists. In view of the above problems, an object of the present invention is to provide a compound word retrieval device that, when two or more of the words constituting a compound word registered in a dictionary match words contained in an input sentence, allows that compound word to be used efficiently for its translation.

[0005]

[Means for Solving the Problems] To solve the above problems, the invention according to claim 1 is a compound word retrieval device for retrieving a compound word composed of a plurality of words in an input sentence, comprising: input means for inputting a sentence; sentence temporary storage means for temporarily storing the sentence input through the input means; word storage means that stores in advance, in association with each other, headwords in base form and the identifiers of the compound words containing each headword or one of its inflected forms; compound word storage means that stores in advance, in association with each other, an identifier identifying each compound word, the compound word specified by that identifier, and its translation; word dividing means for dividing the sentence stored in the sentence temporary storage means into words and converting any divided word that is not in base form into its base form; word detecting means for detecting the identifier of the corresponding compound word when a word obtained by the word dividing means matches a headword stored in the word storage means; word temporary storage means for storing one set per word in the order of division, the set consisting, when the word detecting means detects a compound word identifier, of that identifier together with either the word as divided and its base form or the word as divided alone, and, when it detects none, of either the word as divided and its base form or the word as divided alone; compound word retrieval means for retrieving from the compound word storage means the compound word identified by an identifier, together with its translation, when that identifier matches across a plurality of the sets stored in the word temporary storage means; compound word temporary storage means for temporarily storing, compound word by compound word, all the compound words and translations retrieved by the compound word retrieval means; rank determining means for determining, by a predetermined procedure, the optimum ranking of the compound words stored in the compound word temporary storage means; and output means for outputting the compound words and their translations in descending order of the rank determined by the rank determining means.

[0006] In the invention according to claim 2, the rank determining means includes a word transposition counting unit that counts the number of transpositions between the order of the words stored in the word temporary storage means and the order of the words stored in the compound word temporary storage means, and ranks compound words with fewer transpositions higher. In the invention according to claim 3, the rank determining means includes an inserted word counting unit that counts how many words stored in the word temporary storage means that do not belong to the compound word are inserted between the first and last words constituting a compound word stored in the compound word temporary storage means, and ranks compound words with smaller counts higher.

[0007] In the invention according to claim 4, the rank determining means includes a word match rate calculating unit that calculates the match rate between the words constituting a compound word as stored in the word temporary storage means and the words constituting that compound word as stored in the compound word temporary storage means, and ranks compound words with higher match rates higher. In the invention according to claim 5, the rank determining means includes a word match counting unit that counts the number of words stored in the word temporary storage means that match the words constituting a given compound word stored in the compound word temporary storage means, and ranks compound words with larger counts higher.

[0008] In the invention according to claim 6, the rank determining means comprises: a word transposition counting unit that counts the number of transpositions between the order of the words stored in the word temporary storage means and the order of the words stored in the compound word temporary storage means, and ranks compound words with fewer transpositions higher; an inserted word counting unit that, when compound words are tied in rank after the word transposition counting unit has run, counts how many words stored in the word temporary storage means that do not belong to the compound word are inserted between the first and last words constituting the compound word stored in the compound word temporary storage means, and ranks compound words with smaller counts higher; a word match rate calculating unit that, when compound words are tied in rank after the inserted word counting unit has run, calculates the match rate between the words constituting the compound word as stored in the word temporary storage means and the words constituting it as stored in the compound word temporary storage means, and ranks compound words with higher match rates higher; and a word match counting unit that, when compound words are tied in rank after the word match rate calculating unit has run, counts the number of words stored in the word temporary storage means that match the words constituting a given compound word stored in the compound word temporary storage means, and ranks compound words with larger counts higher.

[0009]

[Operation] With the above configuration, in the invention of claim 1, the word dividing means divides the sentence that was input through the input means and stored in the sentence temporary storage means into words, and converts any divided word that is not in base form (that is, an inflected form) into its base form. To speed up lookup, the word storage means holds only base-form words as headwords. When an obtained word matches a headword stored there, the word detecting means detects the identifier of the corresponding compound word. When it detects a compound word identifier, it forms a set from that identifier and the word as divided (the word as it appears in the input sentence), plus the base-form word where there is one. When it detects no identifier, it forms a set from the word as divided, plus the base-form word where there is one. In either case the sets are temporarily stored in the word temporary storage means in the order of division (from the beginning of the input sentence). When an identifier stored in the word temporary storage means matches across a plurality of sets, the compound word retrieval means treats the input sentence as containing a compound word, retrieves the compound word identified by the matching identifier and its translation from the compound word storage means, and stores them one compound word at a time in the compound word temporary storage means. The rank determining means determines, by a predetermined procedure, the ranking that is optimum for compound words in the input sentence (that is, appropriate for use in translation). The output means outputs the compound words and their translations in descending order of rank (for example, by showing them on a display screen).

[0010] In the invention of claim 2, in addition to the operation of the invention of claim 1, the word transposition counting unit of the rank determining means, in order to determine the optimum ranking of compound words in the input sentence, counts the number of transpositions between the word order of the compound word stored in the compound word temporary storage means and the word order stored in the word temporary storage means (that is, between the word order of the compound word in the compound word storage means and the word order of the compound word in the input sentence), and ranks compound words with fewer transpositions higher.

[0011] In the invention of claim 3, in addition to the operation of the invention of claim 1, the inserted word counting unit of the rank determining means, in order to determine the optimum ranking of compound words in the input sentence, counts how many words stored in the word temporary storage means that do not belong to the compound word are inserted between the first and last words constituting the compound word stored in the compound word temporary storage means, and ranks compound words with smaller counts higher.

[0012] In the invention of claim 4, in addition to the operation of the invention of claim 1, the word match rate calculating unit of the rank determining means, in order to determine the optimum ranking of compound words in the input sentence, calculates the match rate against the words constituting the compound word stored in the word temporary storage means, and ranks compound words with higher match rates higher. In the invention of claim 5, in addition to the operation of the invention of claim 1, the word match counting unit of the rank determining means, in order to determine the optimum ranking of compound words in the input sentence, counts the number of words stored in the word temporary storage means that match the words constituting a given compound word stored in the compound word temporary storage means, and ranks compound words with larger counts higher.

[0013] In the invention of claim 6, in addition to the operation of the invention of claim 1, when the rank determining means finds compound words tied at the same rank while determining the appropriate ranking, it applies the word transposition counting unit, the inserted word counting unit, the word match rate calculating unit, and the word match counting unit in that order to determine the optimum ranking.
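The successive tie-breaking described for claim 6 behaves like a lexicographic sort over the four measures. The following is a minimal Python sketch under that reading, assuming each candidate already carries its four values; the `Candidate` class, its field names, and the sample values are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    number: int         # compound word number
    swaps: int          # word transposition count (smaller ranks higher)
    inserted: int       # inserted word count (smaller ranks higher)
    match_rate: float   # word match rate (larger ranks higher)
    match_count: int    # word match count (larger ranks higher)

def rank_candidates(cands):
    # A tuple key lets each later measure break only the ties
    # left by the earlier ones.
    return sorted(
        cands,
        key=lambda c: (c.swaps, c.inserted, -c.match_rate, -c.match_count),
    )

candidates = [
    Candidate(3, swaps=0, inserted=2, match_rate=0.5, match_count=2),
    Candidate(7, swaps=0, inserted=0, match_rate=0.75, match_count=3),
    Candidate(9, swaps=1, inserted=0, match_rate=1.0, match_count=2),
]
ranked = rank_candidates(candidates)
print([c.number for c in ranked])  # [7, 3, 9]
```

Candidate 9 has a perfect match rate but still ranks last, because the transposition count is consulted first.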

[0014]

[Embodiment] The compound word retrieval device according to the present invention is described below on the basis of an embodiment. FIG. 1 is a block diagram of one embodiment of the compound word retrieval device according to the present invention. The device comprises an input unit 101, a sentence temporary storage unit 102, a word dictionary 103, a compound word dictionary 104, a word retrieval unit 105, a word temporary storage unit 106, a compound word retrieval unit 107, a compound word temporary storage unit 108, a compound word aligning unit 109, and an output unit 110.

[0015] The input unit 101 accepts a sentence from the operator, stores the accepted sentence in the sentence temporary storage unit 102, and issues a start instruction to the word retrieval unit 105. The sentence temporary storage unit 102 temporarily stores the sentence input from the input unit 101. As shown in FIG. 2, the word dictionary 103 registers word headings together with compound word numbers, which are the identifiers of the compound words containing each word. Only base-form (uninflected) words are registered as word headings, but the compound word numbers registered under a heading cover not only compound words containing the base form but also compound words containing its inflected forms. Here, a compound word means a set phrase or idiom composed of two or more words.
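The word dictionary described here is essentially a mapping from a base-form heading to a list of compound word numbers. A small Python sketch follows; FIG. 2 is not reproduced in this text, so the headings and numbers below are invented for illustration only.

```python
# Hypothetical stand-in for the word dictionary 103: base-form heading ->
# numbers of the compound words that contain the word or an inflected form.
WORD_DICTIONARY = {
    "take": [12, 15],
    "care": [12],
    "of": [12, 15],
    "advantage": [15],
}

def compound_numbers_for(base_form):
    """Return the compound word numbers registered under a base-form heading."""
    return WORD_DICTIONARY.get(base_form, [])

print(compound_numbers_for("take"))    # [12, 15]
print(compound_numbers_for("banana"))  # []
```

Because only base forms are headings, an inflected input word such as "took" would have to be reduced to "take" before this lookup.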

[0016] As shown in FIG. 3, the compound word dictionary 104 registers, in order of compound word number, the compound word number identifying each compound word, the compound word itself, and its translation. The word retrieval unit 105 consists of a word dividing unit and a word detecting unit. On receiving the start instruction from the input unit 101, the word dividing unit reads the sentence stored in the sentence temporary storage unit 102 and divides it into words by a predetermined procedure. Specifically, for Western languages the sentence is divided at the spaces between character strings; for Japanese, Chinese, and similar languages it is divided by morphological analysis, the longest-match method, or the like.

[0017] The word dividing unit also contains a morphological analysis dictionary (not shown) and uses it to convert divided words into base form. When a word has been converted, the unit passes both the word as divided and the converted base form to the word detecting unit; when the word as divided is already in base form, it passes that word alone. When the entire sentence stored in the sentence temporary storage unit 102 has been read, the unit notifies the word detecting unit that reading is complete.
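The splitting and base-form conversion can be sketched for the space-delimited case as follows. This is an illustration under stated assumptions: `BASE_FORMS` is a toy table standing in for the morphological analysis dictionary, the punctuation stripping is a simplification added here, and a word missing from the table is treated as already being in base form.

```python
# Toy stand-in for the morphological analysis dictionary (hypothetical entries).
BASE_FORMS = {"took": "take", "takes": "take", "words": "word"}

PUNCTUATION = ".,;:!?\"'"

def split_words(sentence):
    # Split at spaces, as described for Western languages; stripping
    # punctuation from the ends of each token is an added simplification.
    tokens = (token.strip(PUNCTUATION) for token in sentence.split())
    return [token for token in tokens if token]

def to_pairs(sentence):
    """Return (surface, base) pairs in sentence order.

    base is None when the surface form is taken to be the base form already.
    """
    return [(word, BASE_FORMS.get(word.lower())) for word in split_words(sentence)]

pairs = to_pairs("He took care of it.")
print(pairs)
```

Keeping both the surface form and the base form mirrors the text: the base form drives the dictionary lookup, while the surface form preserves what the operator actually typed.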

[0018] On receiving a word from the word dividing unit, the word detecting unit searches the word dictionary 103 using the base-form word as the key. If the word is registered as a word heading, the unit extracts the corresponding compound word numbers and stores the word (either the word as divided together with its base form, or the word as divided alone) and the compound word numbers as one set in the word temporary storage unit 106. If the word is not registered as a word heading, only the word (in the same form) is stored as a set. When the unit has received the end-of-reading notification from the word dividing unit and has finished searching the word dictionary 103, it issues a start instruction to the compound word retrieval unit 107.
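The word detection step can be sketched as a loop that looks up each base form and records one set per word in sentence order. The dictionary layout, the `detect_words` name, and the sample data are assumptions made for this sketch, not details taken from the patent figures.

```python
def detect_words(pairs, word_dictionary):
    """Build the ordered sets kept in the word temporary storage unit.

    pairs: (surface, base) tuples in sentence order; base is None when the
    surface form is already the base form.
    """
    sets = []
    for surface, base in pairs:
        key = (base or surface).lower()      # the base form is the search key
        entry = {"surface": surface, "base": base}
        numbers = word_dictionary.get(key)
        if numbers:                          # heading found: attach its numbers
            entry["numbers"] = list(numbers)
        sets.append(entry)
    return sets

# Hypothetical dictionary contents and input.
word_dictionary = {"take": [12, 15], "care": [12], "of": [12, 15]}
pairs = [("He", None), ("took", "take"), ("care", None)]
stored = detect_words(pairs, word_dictionary)
print(stored)
```

Words without a heading, such as "He" here, are still stored so that later steps can count them as words inserted between the parts of a compound word.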

[0019] The word temporary storage unit 106 stores sets, each consisting of a word and compound word numbers or of a word alone, in order from the beginning of the input sentence. On receiving the start instruction from the word retrieval unit 105, the compound word retrieval unit 107 reads the compound word numbers of each set stored in the word temporary storage unit 106, detects the compound word numbers that are common to a plurality of sets, extracts the compound words and translations matching the detected numbers from the compound word dictionary 104, and stores them in the compound word temporary storage unit 108 in ascending order of compound word number, with every rank set to "1". When extraction from the compound word dictionary 104 is complete, it issues a start instruction to the compound word aligning unit 109.
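Detecting the compound word numbers shared by two or more sets amounts to counting occurrences across sets. A Python sketch under assumed data layouts follows; the dictionary entries, their translations, and the `find_candidates` name are hypothetical.

```python
from collections import Counter

def find_candidates(sets, compound_dictionary):
    # Count in how many sets each compound word number appears.
    seen = Counter()
    for entry in sets:
        for number in set(entry.get("numbers", [])):
            seen[number] += 1
    shared = sorted(n for n, count in seen.items() if count >= 2)
    # Every candidate starts at rank 1, in ascending order of number.
    return [
        {"rank": 1, "number": n,
         "compound": compound_dictionary[n][0],
         "translation": compound_dictionary[n][1]}
        for n in shared
    ]

# Hypothetical compound word dictionary: number -> (compound word, translation).
compound_dictionary = {
    12: ("take care of", "…の世話をする"),
    15: ("take advantage of", "…を利用する"),
    20: ("it goes without saying", "言うまでもない"),
}
sets = [
    {"surface": "took", "numbers": [12, 15]},
    {"surface": "care", "numbers": [12]},
    {"surface": "of", "numbers": [12, 15]},
    {"surface": "it", "numbers": [20]},
]
candidates = find_candidates(sets, compound_dictionary)
print([c["number"] for c in candidates])  # [12, 15]
```

Number 20 is dropped because only one word points to it, whereas 15 survives as a candidate even though "advantage" is absent from the input; ordering such partial matches is left to the later ranking steps.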

[0020] As shown in FIG. 4, the compound word temporary storage unit 108 consists of a compound word table 410 with columns for rank 401, compound word number 402, compound word 403, compound word translation 404, word transposition count 405, inserted word count 406, dictionary match rate 407, and word match count 408. The rank column 401 is set to the initial value "1" by the compound word retrieval unit 107 and is later rewritten by the compound word aligning unit 109. The compound word number 402, compound word 403, and compound word translation 404 columns are filled in by the compound word retrieval unit 107. For example, when the sentence shown in FIG. 5 is input from the input unit 101, the compound word retrieval unit 107 writes the retrieval results into the rank 401, compound word number 402, compound word 403, and compound word translation 404 columns of the compound word table 410, while the word transposition count 405, inserted word count 406, dictionary match rate 407, and word match count 408 columns remain blank. Here, rank means the position of a compound word when the candidates are ordered from the most suitable as a compound word formed from words contained in the input sentence.

[0021] The compound word aligning unit 109 comprises a transposition counting unit, an inserted word counting unit, a dictionary match rate calculating unit, a word match counting unit, and a counter unit. On receiving the start instruction from the compound word retrieval unit 107, the transposition counting unit sets the counter M of the counter unit to the initial value "1". It then checks whether an M-th compound word is stored in the compound word temporary storage unit 108. If so, it takes out that compound word and marks the words stored in the word temporary storage unit 106 that match the words constituting the compound word. The marks are assigned so that the position of each word within the compound word stored in the compound word temporary storage unit 108 can be identified. Since the word temporary storage unit 106 holds the word sets in order from the beginning of the sentence, the unit counts the number of words whose marks appear out of order (the word transposition count). It stores this count against the corresponding compound word in the compound word temporary storage unit 108, erases the marks from the words in the word temporary storage unit 106, and increments the counter M by 1. When it determines that there is no M-th compound word, it reorders the compound words stored in the compound word temporary storage unit 108 so that compound words with smaller word transposition counts come first. The reordering is applied only among compound words that have the same rank in the compound word temporary storage unit 108. When the reordering is complete, it issues a start instruction to the inserted word counting unit.
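The marking and counting step can be sketched as follows. The text does not spell out exactly how "out of order" words are counted, so this sketch adopts one possible reading as an explicit assumption: a marked word counts as transposed when its mark is lower than the mark of the marked word just before it. Function names and sample sentences are hypothetical.

```python
def mark_words(sentence_bases, compound_words):
    """Map sentence position -> 1-based position of that word in the compound."""
    marks = {}
    for order, word in enumerate(compound_words, start=1):
        for pos, base in enumerate(sentence_bases):
            if base == word and pos not in marks:
                marks[pos] = order   # the mark records the compound's word order
                break
    return marks

def transposition_count(sentence_bases, compound_words):
    marks = mark_words(sentence_bases, compound_words)
    sequence = [marks[pos] for pos in sorted(marks)]   # marks in sentence order
    # Assumed reading: count marks that are lower than the preceding mark.
    return sum(1 for prev, cur in zip(sequence, sequence[1:]) if cur < prev)

compound = ["take", "care", "of"]
print(transposition_count(["he", "take", "good", "care", "of", "it"], compound))  # 0
print(transposition_count(["care", "he", "take", "of"], compound))                # 1
```

Other readings, such as counting every inverted pair, would give different numbers for heavily scrambled input but the same ordering for the simple cases shown.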

[0022] On receiving the start instruction from the word transposition counting unit, the inserted word counting unit sets the counter M of the counter unit to the initial value "1". It checks whether an M-th compound word is stored in the compound word temporary storage unit 108, and if so, takes out that compound word and marks the words stored in the word temporary storage unit 106 that match the words constituting the compound word.

[0023] Scanning the words stored in the word temporary storage unit 106 from the beginning (considering only the words as divided), it counts the unmarked words lying between the first marked word and the last marked word (the inserted word count). Specifically, it counts the sets whose word carries no mark. It stores the inserted word count against the corresponding compound word in the compound word temporary storage unit 108, erases the marks from the words in the word temporary storage unit 106, and increments the counter M of the counter unit by 1.
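The inserted word count reduces to counting unmarked words between the first and last marked positions. In the sketch below, marking is simplified to a membership test on base-form words, which is an assumption of this example; the function name and sample sentence are hypothetical.

```python
def inserted_word_count(sentence_words, compound_words):
    # Simplified marking: a sentence word is marked if it occurs in the compound.
    marked = [word in compound_words for word in sentence_words]
    if True not in marked:
        return 0
    first = marked.index(True)
    last = len(marked) - 1 - marked[::-1].index(True)
    # Count the unmarked words between the first and last marked words.
    return sum(1 for flag in marked[first:last + 1] if not flag)

compound = ["take", "care", "of"]
sentence = ["he", "take", "very", "good", "care", "of", "it"]
print(inserted_word_count(sentence, compound))  # 2
```

"he" and "it" lie outside the span from "take" to "of", so only "very" and "good" are counted.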

[0024] When it determines that there is no M-th compound word, it reorders the compound words stored in the compound word temporary storage unit 108 so that compound words with fewer inserted words come first. The reordering is applied only among compound words that have the same rank in the compound word temporary storage unit 108. When the reordering is complete, it issues a start instruction to the dictionary match rate calculating unit. On receiving the start instruction from the inserted word counting unit, the dictionary match rate calculating unit sets the counter M of the counter unit to the initial value "1". It checks whether an M-th compound word is stored in the compound word temporary storage unit 108, and if so, takes out that compound word and marks the words stored in the word temporary storage unit 106 that match the words constituting the compound word.

[0025] It calculates the value obtained by dividing the number of marked words in the word temporary storage unit 106 by the number of words constituting the compound word in the compound word temporary storage unit 108 (the dictionary match rate). It stores the calculated dictionary match rate against the corresponding compound word in the compound word temporary storage unit 108, erases the marks from the words in the word temporary storage unit 106, and increments the counter M of the counter unit by 1. When it determines that there is no M-th compound word, it reorders the compound words stored in the compound word temporary storage unit 108 so that compound words with higher dictionary match rates come first. The reordering is applied only among compound words that have the same rank in the compound word temporary storage unit 108. When the reordering is complete, it issues a start instruction to the word match counting unit.
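The dictionary match rate is the number of marked sentence words divided by the number of words in the compound word. A short Python sketch follows, applied to the "in one's idiomatic fashion" example from the problem statement; the function name and the surrounding sentence are assumptions of this example, and matching is done on plain lowercase strings for simplicity.

```python
def dictionary_match_rate(sentence_words, compound_words):
    remaining = list(sentence_words)
    matched = 0
    for word in compound_words:
        if word in remaining:
            remaining.remove(word)   # each sentence word is marked at most once
            matched += 1
    return matched / len(compound_words)

sentence = "he did it in his idiomatic fashion".split()
compound = "in one's idiomatic fashion".split()
print(dictionary_match_rate(sentence, compound))  # 0.75
```

Three of the four dictionary words are found ("one's" is not), so the candidate is retained with a rate of 0.75 rather than rejected outright.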

[0026] On receiving the start instruction from the dictionary match rate calculating unit, the word match counting unit sets the counter M of the counter unit to the initial value "1". It checks whether an M-th compound word is stored in the compound word temporary storage unit 108, and if so, takes out that compound word and marks the words stored in the word temporary storage unit 106 that match the words constituting the compound word. It counts the number of marked words (the word match count); specifically, it counts the sets whose word carries a mark. It stores the word match count against the corresponding compound word in the compound word temporary storage unit 108, erases the marks from the words in the word temporary storage unit 106, and increments the counter M of the counter unit by 1. When it determines that there is no M-th compound word, it reorders the compound words stored in the compound word temporary storage unit 108 so that compound words with larger word match counts come first. The reordering is applied only among compound words that have the same rank in the compound word temporary storage unit 108.

【００２７】[0027]

When the reordering is complete, an activation instruction is given to the output unit 110. For example, when the compound word table 410 shown in FIG. 4 is stored in the compound word temporary storage unit 108, the compound word alignment unit 109 finally writes numerical values into the columns for the word replacement count 405, the inserted word count 406, the dictionary matching rate 407, and the word match count 408, as shown in FIG. 6, and creates the compound word table 601, in which the compound words are reordered starting from the highest rank 401.

【００２８】[0028]

Expressed in compound word numbers, the reordering by each part of the compound word alignment unit 109 is as follows: the word replacement number counting unit yields 1-2-3-4-5-7-6, the inserted word number counting unit yields 2-4-5-7-1-3-6, the dictionary matching rate calculation unit yields 2-4-7-5-1-3-6, and the word match number counting unit yields 4-2-7-5-1-3-6, as shown in FIG. 6.
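The four-stage reordering, in which each stage only rearranges compound words left tied by the earlier stages, can be sketched as a single sort on a composite key. This is an illustrative reading, not the apparatus itself: the `Candidate` class, its field names, and the sample values are hypothetical and are not the contents of FIG. 4 or FIG. 6.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    number: int         # compound word number
    replacements: int   # word replacement count (smaller ranks higher)
    insertions: int     # inserted word count (smaller ranks higher)
    match_rate: float   # dictionary matching rate (larger ranks higher)
    match_count: int    # word match count (larger ranks higher)


def rank_candidates(candidates):
    """Sort so that each later criterion only breaks ties left by earlier ones."""
    return sorted(
        candidates,
        key=lambda c: (c.replacements, c.insertions, -c.match_rate, -c.match_count),
    )


# Hypothetical candidates, not taken from the figures.
candidates = [
    Candidate(1, 0, 2, 0.5, 2),
    Candidate(2, 0, 0, 1.0, 2),
    Candidate(3, 1, 0, 1.0, 3),
    Candidate(4, 0, 0, 1.0, 3),
]
ranked_numbers = [c.number for c in rank_candidates(candidates)]
```

Here candidates 2 and 4 tie on the first three criteria, so the word match count decides between them, and candidate 3 ranks last because its word replacement count is the largest.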

【００２９】[0029]

On receiving the activation instruction from the compound word alignment unit 109, the output unit 110 outputs the compound words stored in the compound word temporary storage unit 108 in order, starting from the highest-ranked compound word. Next, the operation of this embodiment is described with reference to the flowcharts shown in FIGS. 7 to 11. When the input unit 101 accepts input of a sentence, it temporarily stores the sentence in the sentence temporary storage unit 102 (S702).

【００３０】[0030]

On receiving an activation instruction from the input unit 101, the word search unit 105 takes a word out of the sentence stored in the sentence temporary storage unit 102 and determines whether a word matching its basic form exists among the headwords of the word dictionary 103 (S706). If it does, the word (the word as divided together with its basic form, or the word as divided) and the compound word numbers of all compound words corresponding to the headword, taken from the word dictionary 103, are stored as one set in the word temporary storage unit 106 (S708). If it does not, the word alone is stored as one set in the word temporary storage unit 106 (S710). The unit then determines whether any words remain in the sentence temporary storage unit 102 (S712); if so, processing returns to S704, and if not, it proceeds to S714.
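The word lookup loop above can be sketched in Python as follows. Everything concrete here is an assumption for illustration: the dictionary contents, the `BASIC_FORMS` table standing in for morphological analysis, the lowercasing of input, and the dictionary-based layout of each set.

```python
# Hypothetical word dictionary 103: headword (basic form) -> compound word numbers.
WORD_DICTIONARY = {"before": [11], "i": [11], "be": [11, 12]}

# Hypothetical stand-in for morphological analysis: surface form -> basic form.
BASIC_FORMS = {"was": "be", "is": "be", "am": "be"}


def build_word_sets(sentence):
    """Return one set per word, in sentence order (a view of unit 106)."""
    word_sets = []
    for surface in sentence.lower().split():        # take out each word in turn
        basic = BASIC_FORMS.get(surface, surface)   # reduce to the basic form
        if basic in WORD_DICTIONARY:                # S706: is it a headword?
            numbers = list(WORD_DICTIONARY[basic])  # S708: word plus numbers
        else:
            numbers = []                            # S710: the word alone
        word_sets.append({"surface": surface, "basic": basic, "numbers": numbers})
    return word_sets


word_sets = build_word_sets("Before I was there")
```

Keeping the sets in sentence order matters because the later counting steps depend on word positions.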

【００３１】[0031]

In S714, the compound word search unit 107 searches for compound word numbers that are common to a plurality of the sets stored in the word temporary storage unit 106, takes the compound words identified by those numbers and their translations out of the compound word dictionary 104, and stores them in the compound word table of the compound word temporary storage unit 108 (S714). Next, the compound word alignment unit 109 sets counter M to "1" (S802). It determines whether an M-th compound word is stored in the compound word temporary storage unit 108 (S804) and, if so, reads the M-th compound word from the compound word temporary storage unit 108 (S806). Marks bearing the order of the words in the read compound word are attached to the matching words stored in the word temporary storage unit 106 (S808). The word replacement count, that is, the number of words whose order is interchanged relative to the marks attached in the word temporary storage unit 106, is counted (S810). The word replacement count is stored in the compound word table of the compound word temporary storage unit 108 (S812). The marks in the word temporary storage unit 106 are erased (S814), counter M is incremented by 1 (S816), and processing returns to S802.
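The two operations in this paragraph can be sketched in Python under explicit assumptions. The candidate search follows the text directly: a compound word number qualifies when it appears in two or more sets. The specification does not define precisely how interchanged words are counted, so `word_replacement_count` below counts inversions among the order marks, which is only one plausible reading. Both function names and the sample data are hypothetical, and the compound word is assumed to contain no repeated words.

```python
from collections import Counter


def find_candidate_numbers(number_lists):
    """Compound word numbers that appear in two or more word sets (S714)."""
    counts = Counter()
    for numbers in number_lists:
        counts.update(set(numbers))  # count each number once per set
    return sorted(n for n, c in counts.items() if c >= 2)


def word_replacement_count(compound_words, sentence_words):
    """Count order inversions among the marked words (an assumed reading)."""
    # S808: each mark carries the word's position within the compound word.
    position = {word: i for i, word in enumerate(compound_words)}
    marks = [position[w] for w in sentence_words if w in position]
    # S810 (assumption): a pair of marks in the wrong order is one interchange.
    return sum(
        1
        for i in range(len(marks))
        for j in range(i + 1, len(marks))
        if marks[i] > marks[j]
    )


candidates = find_candidate_numbers([[11], [11], [11, 12], []])
in_order = word_replacement_count(
    ["take", "care", "of"], ["take", "good", "care", "of", "him"]
)
swapped = word_replacement_count(["take", "care", "of"], ["care", "take", "of"])
```

Number 12 is excluded from `candidates` because it occurs in only one set, which mirrors the requirement that at least two constituent words be present in the sentence.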

【００３２】[0032]

When there is no M-th compound word in the compound word temporary storage unit 108, the compound words are ranked in ascending order of the word replacement count stored in the compound word table and reordered accordingly (S818). It is then determined whether any compound words share the same rank (S820); if so, processing proceeds to S902, and if not, to S1120. S902 to S906 are identical to S802 to S806 above, so their description is omitted.

【００３３】[0033]

Each word stored in the word temporary storage unit 106 that matches a word contained in the read compound word is marked (S908). The number of inserted words lying between the marked words is counted (S910). The inserted word count is stored in the compound word table of the compound word temporary storage unit 108 (S912). S914 and S916 are identical to S814 and S816 above, so their description is omitted.
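The inserted word count of S908 to S910 can be sketched as the number of unmarked words lying between the first and last marked words. The function name `inserted_word_count` and the sample sentences are hypothetical, and returning zero when fewer than two words are marked is an assumption made for the sketch.

```python
def inserted_word_count(compound_words, sentence_words):
    """Unmarked words between the first and last marked sentence words."""
    constituents = set(compound_words)
    # S908: mark the sentence words that match a constituent word.
    marked = [i for i, word in enumerate(sentence_words) if word in constituents]
    if len(marked) < 2:
        return 0  # assumption: nothing can be "between" fewer than two marks
    # S910: span covered by the marks, minus the marked words themselves.
    span = marked[-1] - marked[0] + 1
    return span - len(marked)


# Hypothetical example: "good" is inserted between "take" and "care".
with_gap = inserted_word_count(
    ["take", "care", "of"], ["she", "take", "good", "care", "of", "him"]
)
no_gap = inserted_word_count(["take", "care", "of"], ["take", "care", "of", "it"])
```

Words before the first mark or after the last mark, such as "she" and "him" here, do not count as inserted.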

【００３４】[0034]

When there is no M-th compound word in the compound word temporary storage unit 108, the compound words are ranked in ascending order of the inserted word count stored in the compound word table and reordered accordingly (S918). It is then determined whether any compound words share the same rank (S920); if so, S1002 to S1008, which are identical to S902 to S908 above, are performed, and if not, processing proceeds to S1120.

【００３５】[0035]

The dictionary matching rate is calculated with the number of words contained in the compound word as the denominator and the number of marked words stored in the word temporary storage unit 106 as the numerator (S1010). The calculated dictionary matching rate is stored in the compound word table 410 of the compound word temporary storage unit 108 (S1012), and S1014 and S1016, which are identical to S914 and S916 above, are performed. When there is no M-th compound word in the compound word temporary storage unit 108, the compound words are ranked in descending order of the dictionary matching rate and reordered accordingly (S1018). It is then determined whether any compound words share the same rank (S1020); if so, S1102 to S1108, which are identical to S1002 to S1008 above, are performed, and if not, processing proceeds to S1120.

【００３６】[0036]

The number of marked words stored in the word temporary storage unit 106 is counted as the word match count (S1110). The word match count is stored in the compound word table of the compound word temporary storage unit 108 (S1112), and S1114 and S1116, which are identical to S1014 and S1016 above, are performed. When there is no M-th compound word in the compound word temporary storage unit 108, the compound words are ranked in descending order of the word match count and reordered accordingly (S1118).

【００３７】[0037]

In S1120, the output unit 110 outputs the compound words and their translations in descending order of the rank stored in the compound word table 601. Next, the apparatus for creating the word dictionary 103 used in this embodiment is described. The word dictionary creation apparatus has substantially the same configuration as the word search unit 105 described in this embodiment.

【００３８】[0038]

FIG. 12 shows an example of the compound word dictionary 104. As described above, the compound word dictionary 104 stores in advance, in association with one another, a compound word number identifying each compound word, the compound word, and its translation. The word dictionary creation apparatus reads the compound word identified by compound word number "11", starting from its first word "before", performs morphological analysis, recognizes that "before" is a basic-form word, and stores the headword "before" in the word dictionary 103 in association with compound word number "11". The word "I" is handled in the same way. The apparatus then reads the word "was", recognizes through morphological analysis that its basic form is "be", and stores the headword "be" in the word dictionary 103 in association with compound word number "11". The same processing is repeated. When the word "be" of compound word number 12 is read and recognized through morphological analysis as a basic-form word, compound word number "12" is to be stored; because the same word "be" is already stored as a headword in the word dictionary 103, compound word number "12" is stored after the already stored compound word number "11". As a result of this processing, the word dictionary 103 becomes as shown in FIG. 13.
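The dictionary creation procedure amounts to building an inverted index from headwords to compound word numbers. The Python sketch below illustrates this under assumptions: the `BASIC_FORMS` table is a hypothetical stand-in for morphological analysis, and the two phrases are illustrative stand-ins rather than the actual entries of FIG. 12, of which the text gives only the first words.

```python
# Hypothetical stand-in for morphological analysis: surface form -> basic form.
BASIC_FORMS = {"was": "be", "is": "be"}


def build_word_dictionary(compound_dictionary):
    """Map each basic-form headword to the compound word numbers containing it."""
    word_dictionary = {}
    for number in sorted(compound_dictionary):
        for surface in compound_dictionary[number].split():
            headword = BASIC_FORMS.get(surface, surface)
            numbers = word_dictionary.setdefault(headword, [])
            if number not in numbers:
                numbers.append(number)  # append after numbers already stored
    return word_dictionary


# Illustrative phrases only; translations are omitted for brevity.
compound_dictionary = {11: "before I was", 12: "be able to"}
word_dictionary = build_word_dictionary(compound_dictionary)
```

As in the text, the headword "be" ends up with number 11 followed by number 12, because "was" in compound word 11 is reduced to "be" before compound word 12 is processed.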

【００３９】[0039]

The present invention has been described above on the basis of an embodiment, but it is of course not limited to that embodiment. Although the embodiment takes English-to-Japanese compound word translation as an example, any combination of languages can be handled by replacing the stored contents of the compound word dictionary and the word dictionary with those of other languages.

【００４０】[0040]

[Effects of the Invention] As described above, according to the invention of claim 1, bilingual example sentences from existing dictionaries and the like can be used for the compound word dictionary. When two or more words constituting a compound word are contained in the input sentence, the compound word and its translation are output in an appropriate order usable for translation, so the compound words in the sentence can be used quickly and without omission.

【００４１】[0041]

According to the invention of claim 2, in addition to the effect of the invention of claim 1, compound words whose word order in the input sentence differs least from the word order of the compound word in the compound word dictionary are output preferentially, so more appropriate compound words can be used efficiently. According to the invention of claim 3, in addition to the effect of the invention of claim 1, compound words with fewer non-constituent words inserted between their constituent words in the input sentence are output preferentially, so more appropriate compound words can be used efficiently.

【００４２】[0042]

According to the invention of claim 4, in addition to the effect of the invention of claim 1, compound words are output preferentially in descending order of the matching rate of their constituent words, so more appropriate compound words can be used efficiently. According to the invention of claim 5, in addition to the effect of the invention of claim 1, compound words with a larger number of constituent words are output preferentially, so more appropriate compound words can be used efficiently.

【００４３】[0043]

According to the invention of claim 6, the same effects as those of claims 1 to 5 are obtained.

[Brief description of drawings]

FIG. 1 is a configuration diagram of an embodiment of the compound word search device according to the present invention.

FIG. 2 is a diagram showing the contents of the word dictionary in the first embodiment.

FIG. 3 is a diagram showing the contents of the compound word dictionary in this embodiment.

FIG. 4 is a diagram showing an example of the stored contents of the compound word temporary storage unit in this embodiment.

FIG. 5 is a diagram showing an input sentence in this embodiment.

FIG. 6 is a diagram showing an example of the stored contents of the compound word temporary storage unit in this embodiment.

FIG. 7 is a flowchart for explaining the operation of this embodiment.

FIG. 8 is a flowchart for explaining the operation of this embodiment.

FIG. 9 is a flowchart for explaining the operation of this embodiment.

FIG. 10 is a flowchart for explaining the operation of this embodiment.

FIG. 11 is a flowchart for explaining the operation of this embodiment.

FIG. 12 is a diagram showing an example of the compound word dictionary.

FIG. 13 is a diagram showing the word dictionary created from the compound word dictionary shown in FIG. 12.

[Explanation of symbols]

101 Input unit
102 Sentence temporary storage unit
103 Word dictionary
104 Compound word dictionary
105 Word search unit
106 Word temporary storage unit
107 Compound word search unit
108 Compound word temporary storage unit
109 Compound word alignment unit
110 Output unit

Claims

[Claims]

1. A compound word search device for searching for compound words composed of a plurality of words in an input sentence, comprising: input means for inputting a sentence; sentence temporary storage means for temporarily storing the sentence input by the input means; word storage means for storing in advance, in association with each other, headwords in basic form and identifiers of the compound words that contain the headword or an inflected form thereof; compound word storage means for storing in advance, in association with one another, an identifier identifying each compound word, the compound word specified by the identifier, and its translation; word dividing means for dividing the sentence stored in the sentence temporary storage means into words and, when a divided word is not in basic form, converting it into a basic-form word; word detecting means for detecting the identifiers of the corresponding compound words when a word obtained by the word dividing means matches a headword stored in the word storage means; word temporary storage means for storing, as one set and in the order of division, the identifier together with the word as divided and its basic form, or the word as divided, when the word detecting means detects a compound word identifier, and the word as divided and its basic form, or the word as divided, when it does not; compound word search means for retrieving from the compound word storage means the compound word identified by an identifier, and its translation, when that identifier stored in the word temporary storage means matches across a plurality of sets; compound word temporary storage means for temporarily storing, for each compound word, all compound words and translations retrieved by the compound word search means; rank determining means for determining, by a predetermined procedure, the optimum rank of the compound words stored in the compound word temporary storage means; and output means for outputting the compound words and their translations in descending order of the rank determined by the rank determining means.

2. The compound word search device according to claim 1, wherein the rank determining means comprises a word replacement number counting unit that counts the number of interchanges between the order of the words stored in the word temporary storage means and the order of the words stored in the compound word temporary storage means, and raises the rank of compound words with fewer interchanges.

3. The compound word search device according to claim 1, wherein the rank determining means comprises a word insertion number counting unit that counts how many words stored in the word temporary storage means that do not constitute the compound word are inserted between the first and last words constituting the compound word stored in the compound word temporary storage means, and raises the rank of compound words with a smaller count.

4. The compound word search device according to claim 1, wherein the rank determining means comprises a word matching rate calculation unit that calculates the matching rate between the words constituting a compound word stored in the word temporary storage means and the words constituting the compound word stored in the compound word temporary storage means, and raises the rank of compound words with a higher matching rate.

5. The compound word search device according to claim 1, wherein the rank determining means comprises a word match number counting unit that counts the number of words stored in the word temporary storage means that match the words constituting one compound word stored in the compound word temporary storage means, and raises the rank of compound words with a larger count.

6. The compound word search device according to claim 1, wherein the rank determining means comprises: a word replacement number counting unit that counts the number of interchanges between the order of the words stored in the word temporary storage means and the order of the words stored in the compound word temporary storage means, and raises the rank of compound words with fewer interchanges; a word insertion number counting unit that, when the word replacement number counting unit leaves compound words of the same rank, counts how many words stored in the word temporary storage means that do not constitute the compound word are inserted between the first and last words constituting the compound word stored in the compound word temporary storage means, and raises the rank of compound words with a smaller count; a word matching rate calculation unit that, when the word insertion number counting unit leaves compound words of the same rank, calculates the matching rate between the words constituting a compound word stored in the word temporary storage means and the words constituting the compound word stored in the compound word temporary storage means, and raises the rank of compound words with a higher matching rate; and a word match number counting unit that, when the word matching rate calculation unit leaves compound words of the same rank, counts the number of words stored in the word temporary storage means that match the words constituting one compound word stored in the compound word temporary storage means, and raises the rank of compound words with a larger count.