JPH08137881A

JPH08137881A - Device and method for analyzing word relation

Info

Publication number: JPH08137881A
Application number: JP7038716A
Authority: JP
Inventors: Hiroaki Karasawa; 裕明唐沢; Shigeto Iwase; 成人岩瀬
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1994-09-12
Filing date: 1995-02-27
Publication date: 1996-05-31

Abstract

PURPOSE: To analyze, at high speed and with simple logic, the candidate meanings of a set of words and the meaning of each word in the set, by analyzing the meanings of words whose possible meanings are limited in number and of word strings composed of such words. CONSTITUTION: A related information dictionary 100 stores information that expresses the meaning of each word as bit information by indicating, for each meaning in a basic semantic category, the meanings in subordinate semantic categories. The related information acquisition means 210 of an analysis means 200 obtains the meaning of a selected word by referring to the related information dictionary 100 and recursively repeating logical operations on the bit-encoded related information, once for each input word. When the selected word is the last word of the input word string, a related information output means 220 performs a logical operation between the related information of the last word and that of all preceding words, and outputs the related information for the whole set of input words.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a word relation analysis device and a word relation analysis method, and in particular to a device and method that analyze the meaning expressed by a combination of input words and the meaning of each word within that combination.

[0002] More specifically, the invention relates to a word relation analysis device and method that, for a plurality of input words for which the intended sense of some or all of the words has not been specified, analyze the meaning of each word and of the word set as a whole on the basis of the combination of the input words.

[0003]

[Prior Art] Conventionally, in natural language processing, the meaning expressed by an arbitrary set of words, or the meaning of each word in that set, is obtained by interpreting the words according to natural-language semantic analysis rules, implemented as tables or as logic, on the basis of the meanings that a dictionary assigns to each word.

[0004] FIG. 22 shows the configuration of a conventional word analysis system. The system comprises a word input unit 10, a morphological analysis unit 20, an analysis unit 30, a result output unit 40, a dictionary 50, and semantic analysis rules 60. The morphological analysis unit 20 divides the word string received from the word input unit 10 into individual words, looks up each word in the dictionary 50, and assigns meanings to it.

[0005] The analysis unit 30 interprets the words to which the morphological analysis unit 20 has assigned meanings, in accordance with the semantic analysis rules 60. The following description takes as an example the case where each input word specifies part of an address, such as a municipality name.

[0006] Address information is one of the important input conditions for identifying a customer when customer data is managed and searched by computer. Address analysis is the process of determining what each part of a character string entered as an address denotes (a prefecture name, city name, town name, and so on) and converting it into the corresponding address code. It requires easy input, tolerance of ambiguity, and fast processing. One example is directory assistance, where address analysis is performed on information obtained from the inquirer and a result must be produced while the conversation is still in progress.

[0007] The addresses to be coded are divided into four levels: prefecture (to, do, fu, ken); city, ward, county, or town (shi, ku, gun, cho); town or oaza (large section); and aza (small section) or chome (block). Conventional address analysis uses methods such as (1) entering the address with a delimiter between address levels, and (2) entering the leading characters of each address level and confirming them through guidance prompts.

[0008] As an example of method (1), entry delimited by address level, the address is entered with a delimiter after each unit such as "市" (city) and "区" (ward), as in "姫路市△飾磨区△英賀春日町" (Himeji City, then Shikama Ward, then the town name). Here "△" denotes a delimiter (space).

[0009] As an example of method (2), entry of the leading characters of each address level, the prompt "県" (prefecture) is first displayed and the user enters, for example, "埼玉" (Saitama). The prompt "市" (city) is then displayed and the user enters "春日部" (Kasukabe). Address information is thus entered by following the guidance for each address level.

[0010] In this way, conventional techniques either narrow down the meaning through the inclusion relations among nouns delimited by suffixes (prefecture, city), or determine the relations among the meanings of the nouns from the content of the suffixes.

[0011]

[Problems to Be Solved by the Invention] However, with the conventional methods described above (entry delimited by address level, and entry of the leading characters of each level confirmed through guidance), the "town/oaza" and "aza" parts often consist of several words, so delimiters are frequently entered in the wrong place. Table 1 shows examples of address input errors.

[0012] There are also constraints on the input order. In interactive processing, it is difficult to handle input given in varying orders, such as a town name followed by a city name, and input in which it is unclear whether a word is a city name or a town name.

[0013]

[Table 1]

[0014] Even when input follows the guidance, if the address level is entered incorrectly, the intended address cannot be obtained even though the leading characters of each level are entered. Furthermore, when natural language processing is applied to address analysis, the division of roles between morphological analysis and semantic analysis becomes an issue. Because addresses are entered almost entirely as place names and suffixes (prefecture, city, ward, town, and so on), the grammar of ordinary natural-language sentences cannot be applied. Lower levels must therefore be narrowed down by the inclusion relations of the address (an inclusion relation is said to exist when town A is contained in city B). There are about 460,000 addresses nationwide down to the "aza" level (including alternative readings), and for short kana place names of one or two characters almost every combination of characters exists. If a complete inclusion check is performed with this dictionary during morphological analysis, the number of candidate solutions becomes too large to achieve a practical execution speed.

[0015] In addition, because conventional techniques match word strings including their suffixes, input cannot be recovered when a suffix is omitted. Moreover, because conventional word relation analysis uses natural-language semantic analysis rules implemented as tables or as logic, the processing logic becomes complex, which incurs substantial development cost and, owing to the processing time, substantial operating cost.

[0016] Furthermore, fast analysis requires fast access to the entire dictionary. Storing the dictionary in main memory, for example, gives fast access but is expensive. Conversely, if the dictionary is stored in a secondary storage device such as a disk device, the dictionary information must be read from the disk into main storage before it is processed, so fast access cannot be expected.

[0017] The present invention has been made in view of the above points. Its object is to solve the conventional problems described above and to provide a word relation analysis device and method that can analyze, at high speed and with simple logic, the candidate meanings of an arbitrary set of input words and the meaning of each word in the set, without requiring the sense (level) in which the words are used to be specified.

[0018] A further object of the invention is to provide a word relation analysis device and method that handle suffixes in the input word string flexibly: when a suffix is omitted, the meaning is interpreted from the information as given, and when a suffix is present, the interpretation reflects that suffix.

[0019] A further object of the invention is to provide a word relation analysis device and method that can reduce resource usage without lowering the access speed to the dictionary and similar data.

[0020]

[Means for Solving the Problems] FIG. 1 is a diagram of the first principle configuration of the invention. The word relation analysis device of the invention analyzes the meaning of input words and the meaning of a set of words. It comprises a related information dictionary 100, which stores information expressing the meaning of each word by indicating, for each meaning in a basic semantic category, the meanings in subordinate semantic categories, and analysis means 200, which refers to the related information dictionary 100 to analyze the meaning of words whose possible meanings are limited in number and of word strings composed of such words.

[0021] The related information dictionary 100 holds the related information for each word as bit information. The analysis means 200 comprises related information acquisition means 210 and related information output means 220. The related information acquisition means 210 obtains the meaning of a selected word by referring to the related information dictionary 100 and recursively repeating logical operations on the bit-encoded related information, once for each input word. When the selected word is the last word of the input word string, the related information output means 220 performs a logical operation between the related information of the last word and that of all preceding words, and outputs the related information for the entire set of input words.

[0022] The related information acquisition means 210 further includes mask means 211. When an input word contains a suffix that indicates the meaning or content of one semantic category, the mask means 211 masks the related information of the other semantic categories on the basis of that suffix.
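The masking described for mask means 211 can be illustrated with a minimal Python sketch. The suffix table `SUFFIX_LEVEL`, the level names `city`, `town`, and `block`, and both function names are assumptions introduced here for illustration; the patent text does not specify them.

```python
# Hypothetical suffix table (an assumption, not taken from the patent):
# each suffix names the address level it marks.
SUFFIX_LEVEL = {"市": "city", "区": "city", "町": "town", "丁目": "block"}


def split_suffix(word):
    """Return (stem, level) if the word ends in a known suffix, else (word, None)."""
    # Try longer suffixes first so that "丁目" is matched as a whole.
    for suffix in sorted(SUFFIX_LEVEL, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], SUFFIX_LEVEL[suffix]
    return word, None


def mask_by_suffix(info, level):
    """Zero the related information of every level other than the one the suffix names.

    `info` maps each level to a bitmask. With no suffix (level is None)
    the information is returned unchanged, so every level stays a candidate.
    """
    if level is None:
        return dict(info)
    return {lv: (mask if lv == level else 0) for lv, mask in info.items()}
```

A word such as "春日部市" would be split into the stem "春日部" and the level `city`, after which only the city-level bits of the stem's dictionary entry remain set. A word entered without a suffix keeps all of its candidates.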

[0023] FIG. 2 is a diagram of the second principle configuration of the invention. In this configuration, the word relation analysis device, which analyzes the meaning of input words and the meaning of a set of those words, comprises a related information dictionary 100, in which information expressing the meaning of each word by indicating its meanings in subordinate semantic categories is stored for each basic semantic category, and first analysis means 200, which refers to the related information dictionary 100 to analyze the meaning of words whose possible meanings are limited in number and of word strings composed of those words.

[0024] The related information dictionary 100 holds the related information for each word as coded information. The first analysis means 200 includes related information acquisition means and related information output means 220. Taking the ordered basic semantic categories in sequence, the related information acquisition means looks up the subordinate semantic categories of the selected word in the related information dictionary, uses the coded related information to interpret the subordinate semantic category within the scope of the higher basic semantic category, and then interprets the next basic semantic category within the scope of the subordinate semantic category already interpreted, repeating this recursively. The related information output means 220 outputs the related information for the entire set of input words.

[0025] The first analysis means 200 also includes second analysis means 212. When an input word contains a suffix that indicates the meaning or content of one semantic category, the second analysis means 212 fixes the basic semantic category on the basis of that suffix and performs the analysis within the scope of that basic semantic category.

[0026] The invention further comprises: storage order determining means that, when words and their information are stored in the related information dictionary, determines a storage order descending from the highest level to lower levels according to the level of each word's semantic category; first storage means, having a high access speed, in whose first related information dictionary the words from the highest level down to a predetermined level of the determined order are stored; and second storage means, having a slower access speed than the first storage means, in whose second related information dictionary the words below the predetermined level are stored.

[0027] The first storage means is a main storage device, and the second storage means is a disk device. The word relation analysis method of the invention analyzes the meaning of input words and the meaning of a set of words. Using related information that expresses the meaning of a word through a plurality of semantic categories, the method analyzes the meaning of words whose possible meanings are limited in number and of word strings composed of such words.

[0028] The invention also provides a word relation analysis method for analyzing the meaning of input words and the meaning of a set of those words. The method uses a related information dictionary in which information expressing the meaning of each word by indicating its meanings in subordinate semantic categories is stored for each basic semantic category, and refers to that dictionary to analyze the meaning of words whose possible meanings are limited in number and of word strings composed of those words.

[0029] In this analysis, the method refers to a related information dictionary in which the related information for each word is held as coded information. Taking the hierarchically ordered basic semantic categories in sequence, the method looks up the subordinate semantic categories of the selected word in the related information dictionary, uses the coded related information to interpret the subordinate semantic category within the scope of the higher basic semantic category, and then interprets the next basic semantic category within the scope of the subordinate semantic category already interpreted. This is repeated recursively, and the related information for the entire set of input words is output.

[0030] When the basic semantic category is interpreted, if an input word contains a suffix that indicates the meaning or content of one semantic category, the basic semantic category is fixed on the basis of that suffix and the analysis is performed within the scope of that basic semantic category.

[0031] The invention also determines a storage order descending from the highest level to lower levels according to the level of each word's semantic category. Words from the highest level down to a predetermined level of the determined order are stored in first storage means having a high access speed, and words below the predetermined level are stored in second storage means having a slower access speed than the first storage means.

[0032]

[Operation] FIG. 3 is a diagram for explaining the first principle of the invention. A related information dictionary is prepared in advance. For each word, it expresses the meaning in the other semantic categories (hereinafter, subordinate categories) by the presence or absence of a relation to each meaning in one particular semantic category (hereinafter, the basic category), and thereby specifies the meaning of the word.

[0033] Step 1) The meanings of the input words are read from the related information dictionary as dictionary related information.

Step 2) For the first input word, the meaning expressed by all of its subordinate categories is obtained (that is, the logical OR of the meanings of the subordinate categories is taken), and the result is designated B_1.

[0034] Step 3) The input word counter i is set to 1.

Step 4) The input word counter i is incremented.

Step 5) For each subordinate category of the second word, the content of result B_1 is reflected in its meaning (by taking the logical AND of the meaning of each subordinate category of the word and result B_1). The meaning of the second word expressed by all of its subordinate categories, with the meaning of the first word reflected, is then obtained and designated result B_2. For the third word, a logical operation with result B_2 likewise reflects the meanings of the first and second words and yields result B_3. The remaining input words are processed recursively in the same way.

[0035] Step 6) When the input word is the last word n, the result B_n is obtained for that word. The result B_n represents the meaning in the basic category that is common to all of the input words. (In the address example, if the basic category is the prefecture name, B_n indicates the prefectures in which the set of input words, that is, the city, ward, town, and village names, exists.)

[0036] Step 7) For each word, the semantic information that satisfies the result B_n is obtained (the logical AND of the result B_n and each result B_i (i = 1, 2, ..., n-1) is taken).

In this way, the meaning of the set of input words and the candidate meanings of each word can be narrowed down.
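Steps 1 to 7 can be sketched in Python as follows. The toy dictionary `RELATED_INFO`, its 8-bit masks (standing in for the 47 prefecture bits), the level names, and the function names are all hypothetical. Reading step 7 as a per-category AND with B_n is an interpretation made for this sketch, not something the text states explicitly.

```python
from functools import reduce

# Hypothetical toy dictionary: 8 "prefecture" bits instead of 47.
# For each word, each subordinate category (address level) has a bitmask
# whose set bits mark the prefectures where the word occurs at that level.
RELATED_INFO = {
    "kasuga": {"city": 0b00000110, "town": 0b00011000, "block": 0b00000000},
    "chuo":   {"city": 0b00010010, "town": 0b00000101, "block": 0b00001000},
    "honcho": {"city": 0b00000000, "town": 0b00010011, "block": 0b00000010},
}


def union(masks):
    """Logical OR of an iterable of bitmasks."""
    return reduce(lambda a, b: a | b, masks, 0)


def analyze(words, dictionary=RELATED_INFO):
    """Return (B_n, narrowed per-word information) for a list of input words."""
    if not words:
        raise ValueError("at least one word is required")
    # Step 1: read the related information of every input word.
    infos = [dictionary[w] for w in words]
    # Step 2: B_1 is the OR of all subordinate categories of the first word.
    running = union(infos[0].values())
    # Steps 3 to 5: AND each category of the next word with the previous
    # result, then OR the categories together to obtain the next result.
    for info in infos[1:]:
        running = union(mask & running for mask in info.values())
    # Step 6: `running` is now B_n, the prefectures common to all words.
    # Step 7 (interpreted per category): keep only the bits consistent with B_n.
    narrowed = [
        {level: mask & running for level, mask in info.items()}
        for info in infos
    ]
    return running, narrowed
```

Calling `analyze(["kasuga", "chuo", "honcho"])` returns the mask of prefectures in which all three words can occur, together with the levels at which each word remains a candidate. Only AND and OR on fixed-width integers are involved, which is the source of the speed and simplicity the text claims.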

[0037] FIG. 4 is a diagram for explaining the second principle of the invention.

Step 11) The input character string is entered in order of hierarchy level, from the higher levels to the lower levels.

Step 12) The input character string is divided into words, level by level, by referring to the related dictionary information.

[0038] Step 13) When one input character string has been divided at all levels, processing moves to step 14. If part of the string has not yet been divided, processing returns to step 12.

Step 14) The semantic category of the divided word at each level is analyzed. Taking the hierarchical basic semantic categories in order of hierarchy, the semantic category of the selected word is looked up in the related information dictionary. On the basis of the coded related information, the subordinate semantic category is interpreted within the scope of the higher basic semantic category, and the directional relation between the semantic ranges is obtained.

[0039] Step 15) The levels are separated by semantic category. For example, when the input character string is address information, it is separated into the prefecture/city/ward level and the town/aza level.

Step 16) If the level separated in step 15 is at or above a predetermined level, processing moves to step 17. If the word lies below the predetermined level, processing moves to step 18.

[0040] Step 17) Words at or above the predetermined level are stored in storage means for high-speed access, such as the memory of a main storage device.

Step 18) Words below the predetermined level are stored in storage means for secondary access, such as a disk device.

[0041] As a result, in an address analysis method or the like that uses natural language processing, the input character string is divided by level. Information at or above the predetermined level is treated as frequently accessed upper-level information and is stored in fast-access storage means such as a main storage device. All other information is stored on a secondary-access device such as a disk device. For example, when address information must be entered down to the prefecture/city/ward level, the lower town/aza level is accessed using the already determined prefecture/city/ward-level word (code) as the key, so the number of accesses is not large. In other words, because the address word dictionary contains fewer words at the prefecture/city/ward level than at the lower levels, the number of dictionary file accesses stays small even when the number of input characters increases. Storing only the upper-level dictionary on a fast-access medium therefore reduces resource usage without lowering the access speed.
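The two-tier storage of steps 15 to 18 can be sketched as below. The class `TieredDictionary`, its level numbering (1 for prefecture, larger numbers for lower levels), and the address codes in the usage note are hypothetical. Plain Python dictionaries stand in for both main memory and the disk file, with a counter marking where a real disk read would occur.

```python
class TieredDictionary:
    """Upper-level words in fast storage, lower-level words in slow storage.

    Levels are numbered from 1 (highest) downward; this numbering and the
    use of in-memory dicts for both tiers are assumptions of the sketch.
    """

    def __init__(self, cutoff_level):
        self.cutoff_level = cutoff_level
        self.fast = {}        # stands in for the main storage device
        self.slow = {}        # stands in for the disk device
        self.slow_reads = 0   # counts simulated disk accesses

    def store(self, word, level, code, parent_code=None):
        """Steps 16 to 18: route the entry by its level."""
        if level <= self.cutoff_level:
            self.fast[word] = code
        else:
            # Lower-level entries are keyed by the upper-level code that
            # has already been determined, plus the word itself.
            self.slow[(parent_code, word)] = code

    def lookup_upper(self, word):
        """Look up an upper-level word without touching slow storage."""
        return self.fast.get(word)

    def lookup_lower(self, parent_code, word):
        """Look up a lower-level word using the determined upper-level code as key."""
        self.slow_reads += 1
        return self.slow.get((parent_code, word))
```

For example, with `cutoff_level=2`, storing a prefecture at level 1 and a city at level 2 places both in `fast`, while a town stored at level 3 under the city's code goes to `slow`. Resolving a full address then costs one slow read per lower-level word, which matches the text's point that lower-level access is keyed and infrequent.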

[0042]

[Embodiments] Embodiments of the invention are described below with reference to the drawings. The embodiments take address analysis as an example: the analysis determines which address a set of input words represents and at which level of the address hierarchy each word is used.

[0043] FIG. 5 is a block diagram of the word relation analysis device of the first embodiment of the invention. The device comprises: a related information dictionary 1; an input/output unit 2 that exchanges data with external devices or modules; a control unit 3 that controls the device as a whole; a temporary storage unit 4 that temporarily holds intermediate results of logical operations; a related information storage device 5 that accumulates the related information of each word input from the input/output unit 2 and of the word set; and a logical operation unit 6 that, under the control of the control unit 3, performs logical operations such as AND and OR between items of data in the related information storage device 5, or with data in the related information storage device 5, and outputs the result.

[0044] FIG. 6 shows the contents of the related information dictionary of the first embodiment, which has a field structure. The example shown uses the prefecture name as the basic category and the city/ward level, the oaza/town level, and the aza/chome level as the subordinate categories. The prefecture names of the basic category correspond to the bit positions of a 47-bit field. Setting a bit to "1" indicates that the word has a meaning in the prefecture at that bit position.

[0045] In this example, each word has three items of related information, for the city/ward level, the oaza/town level, and the aza/chome level, and each item is expressed as 47 bits representing the 47 prefectures. For example, "トウキョウ" (Tokyo) has "0011……………01" as its city/ward-level bit information, "1100……………00" as its oaza/town-level bit information, and "0000……………01" as its aza/chome-level bit information.

[0046] That is, in the prefectures corresponding to the 3rd, 4th, and 47th bits there is a city/ward-level address named "トウキョウ", and in the prefectures corresponding to the 1st and 2nd bits there is an oaza/town-level address named "トウキョウ" (the aza/chome level is read in the same way).

【００４７】In this way, the related information dictionary indicates, for the word representing each address name, whether or not it exists in each prefecture at the city/ward level, the oaza/town level, or the aza/chome level. Each of these levels is associated with bit information having one bit per prefecture (47 bits).
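As an illustration only, the dictionary entry described above can be sketched in Python as three 47-bit fields per word. The names `RelatedInfo`, `bits`, and `prefectures` are hypothetical, bit 1 is taken to be the leftmost bit as in the bit strings quoted above, and the bits elided by "……" in the "Tokyo" example are assumed to be zero.

```python
from dataclasses import dataclass

N_PREFECTURES = 47  # one bit per prefecture (the basic category)


@dataclass(frozen=True)
class RelatedInfo:
    """Related information of one word: a 47-bit field per address level."""
    city_ward: int
    oaza_town: int
    aza_chome: int


def bits(*positions: int) -> int:
    """Return a 47-bit field with the given 1-based positions set (bit 1 = leftmost)."""
    value = 0
    for p in positions:
        value |= 1 << (N_PREFECTURES - p)
    return value


def prefectures(field: int) -> list[int]:
    """List the 1-based bit positions (prefectures) that are set in a field."""
    return [p for p in range(1, N_PREFECTURES + 1)
            if field & (1 << (N_PREFECTURES - p))]


# "Tokyo" as in the example: city/ward bits 3, 4, 47; oaza/town bits 1, 2.
# Assumption: the elided middle bits are zero, so aza/chome has only bit 47.
tokyo = RelatedInfo(city_ward=bits(3, 4, 47),
                    oaza_town=bits(1, 2),
                    aza_chome=bits(47))

print(format(tokyo.city_ward, "047b"))
print(prefectures(tokyo.oaza_town))
```

Holding each level as an integer keeps the later steps (masking, AND, OR) down to single bitwise operations.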

【００４８】The operation of the above word relation analysis device is described with reference to FIG. 7. FIG. 7 is a flowchart showing the operation of the word relation analysis device of the first embodiment of the present invention. The following processing presupposes that the word string input from the input/output unit 2 has already been divided into words by a morphological analysis function.

【００４９】Step 101) When the segmented words are input to the word relation analysis device, the control unit 3 refers to the related information dictionary 1 for each word and stores the related information of each word, that is, its meaning candidates, in the related information storage device 4.
Step 102) The control unit 3 points to the first input word and stores that word's related information in the temporary storage unit 4.

【００５０】Step 103) The control unit 3 transfers the contents of the temporary storage unit 4 to the related information storage device 5, as related information in which the meaning candidates of the words processed so far are reflected in the meaning candidates of the word currently pointed to.
Step 104) The logical operation unit 6 applies, to the related information of the word pointed to by the control unit 3, the application-specific logical operation registered in the control unit 3 (for example, taking the logical AND of bit information representing word order with the bit information), and stores the result in the temporary storage unit 4.

【００５１】Step 105) The current related information of the word following the one pointed to by the control unit 3 and the contents of the temporary storage unit 4 are transferred to the logical operation unit 6, their logical AND is computed, and the result is transferred back to the temporary storage unit 4.
Step 106) The control unit 3 advances the pointed-to word position by one.

【００５２】Step 107) The control unit 3 checks whether the word currently pointed to is the last word in the word set. If it is, processing moves to step 108; if not, processing returns to step 103.
Step 108) The logical operation unit 6 takes the contents of the temporary storage unit 4 (corresponding to the related information of the last word), sequentially reads the related information of each word from the related information storage device 5, computes the logical AND of the two, and overwrites the related-information storage area of each word in the related information storage device 5 with the result.

【００５３】Step 109) The control unit 3 sequentially reads the related information of all words stored in the related information storage device 5. Using the temporary storage unit 4 as a buffer for intermediate results, the logical operation unit 6 computes their logical AND, and the result is stored in the area of the related information storage device 5 that holds the related information of the word set as a whole.

【００５４】As a result, the input/output unit 2 outputs, from the related information storage device 5, the candidate related information represented by the word string together with the related information of each word at that point. FIG. 8 is a flowchart showing the application-specific processing of the first embodiment of the present invention.

【００５５】Step 201) The control unit 3 selects the words to be processed one at a time, starting from the head of the input word string.
Step 202) The logical operation unit 6 takes the logical AND of the dictionary related information of the current word and result B of the preceding word. If the current word is the first word, the initial value of the preceding result is taken to be all ones ("ALL1"), so the operation is equivalent to doing nothing.

【００５６】Step 203) The control unit 3 determines whether the word currently indicated by the pointer has a suffix and branches accordingly. If the word has a suffix, processing moves to step 204; if not, processing moves to step 205.

【００５７】Step 204) If the word has a suffix, the contents of the related information of the current word are masked according to the suffix. For example, when the suffix "shi" or "ku" is present, the bit information of every level other than the city/ward level is set to all zeros ("ALL0"). The result of this masking is referred to below as result A. That is, if the dictionary related information is "1010,1111,1110" and the word has a city/ward-level suffix, the word cannot have an oaza/town-level or aza/chome-level meaning, so masking turns it into "1010,0000,0000".

【００５８】Step 205) The logical OR of the blocks of the related information of the word indicated by the pointer is taken. That is, the 4 bits of the city/ward level, the 4 bits of the oaza/town level, and the 4 bits of the aza/chome level are ORed together; the result is referred to as result B. For example, the logical OR of "1010", "1111", and "1110" gives result B = "1111". When masking has been applied as above, giving "1010,0000,0000", result B = "1010".
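Steps 204 and 205 can be illustrated with a small Python sketch that uses the 4-bit values quoted above. The function names `mask_levels`, `fold_or`, and `show` are hypothetical, and representing a word as a tuple of three integers is an assumption made for the example.

```python
WIDTH = 4       # simplified: 4 prefectures instead of 47
CITY_WARD = 0   # index of the city/ward block in (city/ward, oaza/town, aza/chome)


def mask_levels(info, keep):
    """Step 204: zero every level block whose index is not in `keep`."""
    return tuple(block if i in keep else 0 for i, block in enumerate(info))


def fold_or(info):
    """Step 205: OR the level blocks together to obtain result B."""
    result = 0
    for block in info:
        result |= block
    return result


def show(info):
    """Format the blocks as comma-separated bit strings."""
    return ",".join(format(block, f"0{WIDTH}b") for block in info)


info = (0b1010, 0b1111, 0b1110)            # dictionary related information
result_a = mask_levels(info, {CITY_WARD})  # word with a city/ward-level suffix

print(show(info), "->", format(fold_or(info), "04b"))          # no suffix
print(show(result_a), "->", format(fold_or(result_a), "04b"))  # after masking
```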

【００５９】Step 206) The control unit 3 refers to the pointer to determine whether the current word is the last word. If it is, processing moves to step 207. If it is not, the pointer is incremented and processing returns to step 201.

【００６０】Step 207) The logical operation unit 6 takes the logical AND of result B of the last word with result A of each input word, and uses each result as the output for that word.
Step 208) The logical operation unit 6 takes the logical AND of result B of all input words and uses it as the output for the word combination.

【００６１】The above processing is now described concretely using the data flows of FIGS. 9 to 13, which show the data flow of the word relation analysis of the present invention. For simplicity, FIGS. 9 to 13 show every level (city/ward, oaza/town, and aza/chome) with 4 bits (4 prefectures), where there would actually be 47 bits for the 47 prefectures.

【００６２】The following takes as an example the case where the input words are word 1 "Minatoku", word 2 "Roppongi", and word 3 "Tokyo". In the data flow of FIG. 9, looking these words up in the related information dictionary 1 of FIG. 6 yields related information written as bit information; for example, the address-name word "Minato" exists at the city/ward level in "Hokkaido, Iwate" and at the oaza/town level in "Hokkaido, Aomori, Iwate, ..., Okinawa". The control unit 3 stores this related information in the related information storage device 5 (FIG. 7: step 102). Here, as shown in FIG. 9, the word pointer is set to word 1 (the first word) and the application-specific processing of this embodiment is performed. When a word is input with a suffix annotation, or when the control unit 3 determines that it has a suffix (FIG. 8: step 203), as with the suffix "ku" in word 1 "Minatoku", the other levels are masked so that only the city/ward-level part of the dictionary related information remains valid; that is, the oaza/town-level and aza/chome-level information is dropped (FIG. 8: step 204).

【００６３】That is, the oaza/town-level and aza/chome-level parts become "0000" and "0000", and 1010,0000,0000 is stored in the related information storage device 5.

【００６４】Next, in the data flow of FIG. 10, the logical OR of the information at the city/ward, oaza/town, and aza/chome levels is taken. In this example the result is 1010,1010,1010: related information in which a bit is set for each prefecture where the address-name word may exist. This result is held temporarily in the temporary storage unit 4 (FIG. 7: step 104; FIG. 8: step 205).

【００６５】The current contents of the temporary storage unit 4 correspond to result B of step 205 in FIG. 8. As shown in FIG. 11, the logical AND of these contents, "1010,1010,1010", with the related information of the next word, word 2 "Roppongi", is "0000,0010,0010", which is stored in the related information storage device 5 via the temporary storage unit 4 (FIG. 7: step 105; FIG. 8: step 202).

【００６６】This stored information is related information whose set bits indicate the candidate meanings of the word at the current word pointer (2), taking into account its relation to the preceding word (1). It corresponds to result B of step 205 in the flowchart of FIG. 8.

【００６７】Next, the word pointer is advanced to word 2 (FIG. 7: step 106). Word 2 has no suffix, so processing continues without masking. The same operation is then repeated recursively. When, as shown in FIG. 12, the word pointer reaches word 3 (the last word to be processed), the recursive processing ends (FIG. 7: step 107). Taking the logical AND of the contents of the temporary storage unit 4 at that point (corresponding to the related information of word 3, the last word: "0010") with the current related information of each processed word yields the related information of each word shown in the output result (FIG. 7: step 108; FIG. 8: step 207).

【００６８】The following shows the result of taking the logical AND of the final word's result, replicated as the same bit string across the city/ward, town, and aza levels, with the information stored in the related information storage device for each word whose processing has finished. Taking the logical AND of the related information of word 3, "0010,0010,0010", with the information stored in the related information storage device for word 1, "1010,0000,0000", gives the output result (related information of word 1) "0010,0000,0000".

【００６９】Taking the logical AND of the related information of word 3, "0010,0010,0010", with the information stored in the related information storage device for word 2, "0000,0010,0010", gives the output result (related information of word 2) "0000,0010,0010".

【００７０】Taking the logical AND of the related information of word 3, "0010,0010,0010", with the information stored in the related information storage device for word 3, "0010,0000,0000", gives the output result (related information of word 3) "0010,0000,0000".

【００７１】Finally, as shown in FIG. 13, the logical AND of the related information of all words is taken and can be output as the related information of the word set (FIG. 7: step 109; FIG. 8: step 208). The result, word-set related information "0010", indicates that word 1 "Minatoku", word 2 "Roppongi", and word 3 "Tokyo" all exist in the prefecture indicated by the third bit. The related information of each word shows at which address level (city/ward, oaza/town, or aza/chome) that word exists: for example, word 1 "Minatoku" is at the city/ward level, and word 2 "Roppongi" is at the oaza/town level or the aza/chome level.

【００７２】That is, in the output result of FIG. 13, the output for word 1 "Minatoku" is "0010,0000,0000". A bit is set at the city/ward level and none is set at the oaza/town level or the aza/chome level, which shows that the level (meaning) of "Minatoku" is neither oaza/town nor aza/chome.

【００７３】Likewise, in the output result of FIG. 13, the output for word 2 "Roppongi" is "0000,0010,0010". Bits are set at the oaza/town level and the aza/chome level, which shows that the level of "Roppongi" is either oaza/town or aza/chome.

【００７４】Furthermore, in the output result of FIG. 13, the output for word 3 "Tokyo" is "0010,0000,0000". A bit is set at the city/ward level and none is set at the oaza/town level or the aza/chome level, which shows that the level (meaning) of "Tokyo" is not oaza/town or aza/chome but city/ward.
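The walk-through of FIGS. 9 to 13 can be traced with the following Python sketch of steps 201 to 208. It is a minimal illustration, not the device itself: the function `analyze` and the tuple representation are hypothetical. The dictionary value for "Minatoku" reuses the 4-bit example given for step 204, while the dictionary values for "Roppongi" and "Tokyo" are assumed so that the intermediate results match those quoted above.

```python
WIDTH = 4                      # simplified to 4 prefectures, as in FIGS. 9 to 13
ALL1 = (1 << WIDTH) - 1        # initial "previous result B" for the first word
CITY_WARD, OAZA_TOWN, AZA_CHOME = 0, 1, 2


def analyze(words):
    """Run steps 201-208 over `words`.

    Each item is (blocks, keep): `blocks` is a 3-tuple of 4-bit ints
    (city/ward, oaza/town, aza/chome) and `keep` is the set of level
    indices left by the suffix mask, or None when the word has no suffix.
    """
    prev_b = ALL1
    results_a, results_b = [], []
    for blocks, keep in words:                               # step 201
        a = tuple(block & prev_b for block in blocks)        # step 202
        if keep is not None:                                 # step 203
            a = tuple(blk if i in keep else 0                # step 204
                      for i, blk in enumerate(a))
        b_result = 0
        for blk in a:                                        # step 205
            b_result |= blk
        results_a.append(a)
        results_b.append(b_result)
        prev_b = b_result                                    # step 206
    last_b = results_b[-1]
    per_word = [tuple(blk & last_b for blk in a)             # step 207
                for a in results_a]
    combined = ALL1
    for b_result in results_b:                               # step 208
        combined &= b_result
    return per_word, combined


def show(blocks):
    return ",".join(format(blk, "04b") for blk in blocks)


words = [
    ((0b1010, 0b1111, 0b1110), {CITY_WARD}),  # "Minatoku": suffix "ku" keeps city/ward
    ((0b0000, 0b0010, 0b0010), None),         # "Roppongi" (assumed dictionary value)
    ((0b0010, 0b0000, 0b0000), None),         # "Tokyo" (assumed dictionary value)
]
per_word, combined = analyze(words)
for blocks in per_word:
    print(show(blocks))
print(format(combined, "04b"))
```

With these inputs the per-word outputs are 0010,0000,0000, 0000,0010,0010, and 0010,0000,0000, and the word-set output is 0010, in line with the values described for FIG. 13.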

【００７５】As described above, in this embodiment the related information dictionary is built for a limited, predetermined number of meanings, as is the case with address information, and the meaning of an input word is analyzed by combining the set of prefectures to which the word belongs (47) with the levels to which it belongs (city, ward, town, and village levels). Thus "Minato" is at the ward level when the prefecture is Tokyo, and is at the town level when the prefecture is "○○ Prefecture" or "×× Prefecture". In that case, bits are set at several places in the bit information. The meaning of a word consists of several categories, namely its level in the prefecture/city/ward/town/village hierarchy and the prefecture to which it belongs, and the meaning of an input word differs according to the combination.

【００７６】For example, the meaning (level) of "Minato" differs between the input set "Minato", "Tokyo", "Roppongi" and the input set "Minato", "Aomori", "Hachinohe". This is determined from the result of the logical operations on the bit information of each word.

【００７７】Next, the case where an inclusion relation exists between semantic categories is described. To perform an inclusion check during word segmentation, address codes must be brought into the word dictionary. However, a single word corresponds to many address codes; for example, "Aioi" occurs 70 times nationwide counting cities and towns alone. To keep the dictionary from becoming huge, the addresses to which a word belongs are therefore held as bit strings at the prefecture level. For example, "Aioi" is held as flags like the following: bit strings indicating that it exists in Hyogo Prefecture and Tokushima Prefecture at the prefecture/city/ward/county level, and in Hokkaido, Yamagata Prefecture, Kanagawa Prefecture, and so on at the town/oaza level.

【００７８】[0078]

【表２】 [Table 2]

【００７９】Holding the address codes as flags limits the growth of the dictionary to 18 bytes per word (47 bits × 3). This dictionary structure also brings the following advantages. First, the inclusion check can be implemented easily with bit operations alone. Second, because the dictionary carries information down to the prefecture, the address table needed for address candidate extraction can be split by prefecture, which reduces the number of entries to search and speeds up table access.
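The 18-byte figure follows from rounding 47 × 3 = 141 bits up to whole bytes. The Python sketch below checks that arithmetic and shows one possible packing; the layout (city/ward field first, big-endian) and the names `pack` and `unpack` are assumptions made for illustration, not a format given in the description.

```python
N_PREFECTURES = 47
N_LEVELS = 3

TOTAL_BITS = N_PREFECTURES * N_LEVELS   # 141 bits of flags per word
TOTAL_BYTES = (TOTAL_BITS + 7) // 8     # rounded up to whole bytes
FIELD_MASK = (1 << N_PREFECTURES) - 1


def pack(city_ward: int, oaza_town: int, aza_chome: int) -> bytes:
    """Pack three 47-bit fields into one fixed-size byte string (assumed layout)."""
    combined = ((city_ward << (2 * N_PREFECTURES))
                | (oaza_town << N_PREFECTURES)
                | aza_chome)
    return combined.to_bytes(TOTAL_BYTES, "big")


def unpack(data: bytes) -> tuple[int, int, int]:
    """Recover the three 47-bit fields from a packed byte string."""
    combined = int.from_bytes(data, "big")
    return ((combined >> (2 * N_PREFECTURES)) & FIELD_MASK,
            (combined >> N_PREFECTURES) & FIELD_MASK,
            combined & FIELD_MASK)


print(TOTAL_BITS, TOTAL_BYTES)
```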

【００８０】In the above example, the semantic categories were described as two categories, the prefecture/city/ward/town/village hierarchy and the prefecture to which a word belongs, but the invention is not limited to this example and n kinds of categories may be used.
[Second Embodiment] Next, in a second embodiment of the present invention, in order to store the dictionary contents efficiently on fast-access devices, frequently accessed upper-level information is stored in the main storage device and lower-level information is stored on a disk device.

【００８１】An address word dictionary is used to cope with reordering of address levels, omitted address levels, and the like, but because it must hold every word from prefectures, cities/wards, oaza, and towns down to aza and chome, it becomes very large. Storing the entire dictionary on a fast-access device is therefore costly. The present invention addresses this by storing upper-level words, which are accessed frequently and are few in number, in the main storage device, and the remaining lower-level words on a disk device.

【００８２】FIG. 14 shows an example of the prefecture/city/ward-level related information dictionary of the second embodiment of the present invention. The figure shows a table that lists, for each address-name word at the prefecture/city/ward level, the suffix that attaches to it and its address code. In this embodiment, "Tokyo" takes the suffix "to" (都) and has the address code "130000", while "Fuchu" takes the suffix "shi" (市) and exists in two places, with address codes "130206" and "340302".

【００８３】FIG. 15 shows an example of the town/aza-level related information dictionary of the second embodiment of the present invention. The figure is a table that lists the address codes for each address-name word at the town/aza level. In this embodiment, "Midori" and "Midoricho" both correspond to the address code "1302030012", and "Honmachi" exists in two places, with address codes "1302060030" and "3403020018".

【００８４】FIG. 16 is a block diagram of the word relation analysis device of the second embodiment of the present invention. In the figure, parts identical to those in FIG. 5 carry the same reference numerals. The word analysis device shown comprises: the related information dictionary 1; an input/output unit 2 that exchanges data with external devices or modules; a control unit 3 that controls the whole device implementing the word relation analysis method; a temporary storage unit 4 that temporarily holds intermediate results of logical operations; a related information storage device 5 that stores the related information of each word input from the input/output unit 2 and of the word set; and an inclusion check unit 7 that, under the control of the control unit 3, checks the inclusion relation between hierarchically coded data items in the related information storage device 5, or between the data in the temporary storage unit 4 and the data in the related information storage device 5.

【００８５】FIG. 17 is a flowchart showing the word relation analysis method of the second embodiment of the present invention.
Step 301) First, the set of words input from the input/output unit 2 is segmented into words by referring to the prefecture/city/ward-level word dictionary of the related information dictionary 1, and the related information of each word is obtained.

【００８６】Step 302) The related information of each segmented word (word segmentation solution) is stored in the temporary storage unit 4.
Step 303) The inclusion check unit 7 performs a mutual inclusion check, for each word segmentation solution, on the contents of the temporary storage unit 4, and transfers to the related information storage device 5 the word string from the head up to the word position where inclusion holds, together with the address code that the word string represents.

【００８７】Step 304) Each pairing of a word string for which inclusion held in step 303 with its related information is treated as one record, and the control unit 3 sets the record pointer to the first record.
Step 305) The input word string that follows the word string of the record pointed to by the control unit 3 is segmented by referring to the town/aza-level word dictionary. Only the address codes for which inclusion holds are written over the record in the related information storage device 5; a record for which inclusion does not hold is deleted from the related information storage device 5.

【００８８】Step 306) The control unit 3 advances the pointed-to record position by one.
Step 307) The control unit 3 checks whether the record pointed to is the last record in the related information storage device 5. If it is, processing moves to step 308; if not, processing returns to step 305 and repeats until the pointed-to record is the last record.
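The inclusion check used in steps 303 and 305 can be sketched in Python with the address codes quoted for FIGS. 14 and 15. The description does not spell out the comparison rule, so the rule below is an assumption: a six-digit code ending in "0000" is treated as a prefecture code identified by its first two digits, and any other code must be a literal prefix of the lower-level code. The names `significant_prefix`, `includes`, and `surviving_codes` are hypothetical.

```python
# Excerpts built from the codes quoted for FIG. 14 and FIG. 15.
CITY_DICT = {"Tokyo": ["130000"], "Fuchu": ["130206", "340302"]}
TOWN_DICT = {"Honmachi": ["1302060030", "3403020018"]}


def significant_prefix(code: str) -> str:
    """Return the part of an upper-level code that a lower-level code must start with.

    Assumption: a 6-digit code ending in "0000" is a prefecture code whose
    first two digits identify the prefecture; any other code is used whole.
    """
    if len(code) == 6 and code.endswith("0000"):
        return code[:2]
    return code


def includes(upper: str, lower: str) -> bool:
    """True if `lower` lies under `upper` in the address hierarchy (assumed rule)."""
    return lower.startswith(significant_prefix(upper))


def surviving_codes(upper_codes, lower_codes):
    """Keep only the lower-level codes included in at least one upper-level code."""
    return [low for low in lower_codes
            if any(includes(up, low) for up in upper_codes)]


# "Tokyo" + "Fuchu" + "Honmachi": each level narrows the candidates of the next.
cities = surviving_codes(CITY_DICT["Tokyo"], CITY_DICT["Fuchu"])
towns = surviving_codes(cities, TOWN_DICT["Honmachi"])
print(cities, towns)
```

Without the word "Tokyo", both "Fuchu" codes remain and both "Honmachi" codes pass the check, so the ambiguity is kept until a later narrowing stage.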

【００８９】With this method, the candidate related information represented by the word set, together with the related information of each word at that point, can be output from the related information storage device 5 via the input/output unit 2. Next, an example of using the information analyzed in this way is shown.

【００９０】FIG. 18 shows a usage example of the second embodiment of the present invention. First, as an input condition, the address information entered from the input/output unit 2 must always include the words down to a predetermined level of the address hierarchy; this is defined, for example, as "input of a city/ward-level word is required". In addition, the input must not reverse the order of upper and lower levels of the address hierarchy; for example, the prefecture level and the city level must not be swapped as in "Fuchu-shi Tokyo-to" (府中市東京都).

【００９１】The dictionary is constructed by splitting a dictionary that combines the address table with address codes into words down to a predetermined level of the address hierarchy (words down to the prefecture/city/ward level) and all other words, producing the prefecture/city/ward address dictionary and the town/aza address dictionary shown in FIGS. 14 and 15.

【００９２】In FIG. 18, in accordance with the above input conditions, word segmentation is performed by searching from the front of the input address string for strings that match words in the prefecture/city/ward address dictionary 11. At this point, candidates are enumerated in which the part from the front of the input address up to some position is treated as the city/ward level. Next, for each candidate, the remainder of the input address string is segmented using the town/aza address dictionary, the resulting address code is compared with the candidate's city/ward-level address code, and if an inclusion relation is found, the candidate is kept as a final candidate. After that, an actual application performs narrowing (intention understanding), which determines how candidates are presented to the user, and outputs the result.

【００９３】For example, for the input "ツシマ" (Tsushima), the prefecture/city/ward address dictionary gives the following candidates.
(1) ツ／シマ → 津市 (Tsu City) + シマ
(2) ツ｜シ／マ → 津市 (Tsu City) + マ
(3) ツシマ → 津島市 (Tsushima City)
Next, the remainder is looked up in the town/aza address dictionary 12. Since neither "シマ" nor "マ" exists under Tsu City, candidate (3), Tsushima City, is selected in this case.
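The "ツシマ" example can be traced with a short Python sketch. The two dictionaries below are hypothetical stand-ins holding only what the example needs, the town readings are placeholders rather than real dictionary data, and the remainder is matched as a single town word instead of being segmented further. The names `candidates` and `resolve` are likewise hypothetical.

```python
# Reading -> city name; only the two entries needed for the example.
CITY_DICT = {"ツ": "津市", "ツシマ": "津島市"}
CITY_SUFFIXES = ("シ",)  # reading of the optional suffix 市

# Placeholder town readings per city; neither "シマ" nor "マ" is under 津市.
TOWN_DICT = {"津市": {"ナカ"}, "津島市": {"ミナミ"}}


def candidates(text: str):
    """Enumerate (city, remainder) splits of `text` against the city dictionary."""
    found = []
    for reading, city in CITY_DICT.items():
        if not text.startswith(reading):
            continue
        rest = text[len(reading):]
        found.append((city, rest))                 # city word without suffix
        for suffix in CITY_SUFFIXES:
            if rest.startswith(suffix):            # city word followed by suffix
                found.append((city, rest[len(suffix):]))
    return found


def resolve(text: str):
    """Keep candidates whose remainder is empty or is a town under that city."""
    return [city for city, rest in candidates(text)
            if rest == "" or rest in TOWN_DICT[city]]


print(candidates("ツシマ"))
print(resolve("ツシマ"))
```

The first call lists the three candidates in the same order as above, and the second leaves only 津島市 because the remainders "シマ" and "マ" are not found under 津市.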

【００９４】With the above dictionary structure, storing the relatively small word dictionary that covers the address hierarchy down to a predetermined level (for example, the prefecture/city/ward address dictionary for the upper levels) on a fast device makes it possible to speed up the analysis efficiently. Input condition (1) is a natural condition that imposes almost no practical restriction.

[0095] A specific example is described below. FIG. 19 is a diagram showing a specific example of the second embodiment of the present invention. In this example, the dictionary generation unit 8 encodes the address information input from the input/output unit 2 and divides it into an upper level and a lower level; the upper-level address codes are registered in the prefecture/city/ward address dictionary in the main storage device 9, and the lower-level addresses are registered in the town address dictionary in the disk device 11. Suppose that the following addresses are input:
(1) 11 Honmachi, Fuchu-shi, Tokyo (東京都府中市本町１１)
(2) 1-1-1 Roppongi, Minato-ku, Tokyo (東京都港区六本木１−１−１)
(3) 20 Midori-cho, Musashino-shi, Tokyo (東京都武蔵野市緑町２０)
(4) 40 Honcho, Mishima-shi, Shizuoka (静岡県三島市本町４０)
(5) 2 Shimanouchi, Chuo-ku, Osaka-shi, Osaka (大阪府大阪市中央区島之内２)
The following address codes are then obtained by referring to the related information dictionary 1:
(1) 130000/130206/1302030012
(2) 130000/136000/1360001111
(3) 130000/131000/1302030012
(4) 100000/110090/1100900800
(5) 700000/700100/7001100100
The address codes (1) to (5) are stored in the temporary storage unit 4. Because each address code is ordered from the upper level to the lower level, the leading parts, namely "130000/130206" for (1), "130000/136000" for (2), "130000/131000" for (3), "100000/110090" for (4), and "700000/700100" for (5), are registered in the memory of the main storage device 9 as the prefecture/city/ward address dictionary.

[0096] The remaining parts of the address codes, namely "1302030012" for (1), "1360001111" for (2), "1302030012" for (3), "1100900800" for (4), and "7001100100" for (5), are registered as the town address dictionary in the disk device 11. The address dictionaries are thus generated from the input address data. Taking address input for telephone number inquiries as an example, inputs of 10 to 16 characters are the most common in practice, although strings as long as 30 to 40 characters also occur. When the coded address data is divided into, for example, a prefecture/city/ward part and a town part and distributed between the dictionary in the main storage device 9 and the dictionary in the disk device 11 as in the above example, telephone number guidance accesses the fast main storage device 9 for the prefecture/city/ward part and the disk device 11 for the town part. In other words, placing the word dictionary for the relatively small upper part of the hierarchy in memory with a short access time keeps the cost increase small, and even for long inputs it reduces the number of accesses to storage with a relatively long access time, such as a disk device, so the word analysis time can be reduced.
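The split of each address code between a fast in-memory dictionary and a slower on-disk dictionary can be sketched as below. This is only an illustration under stated assumptions: the function names, the use of a Python set for the in-memory dictionary, and the JSON file standing in for the disk device are all hypothetical; the address codes are the five given in the example.

```python
import json
import os
import tempfile

# Address codes from the example, ordered upper level to lower level.
ADDRESS_CODES = [
    "130000/130206/1302030012",
    "130000/136000/1360001111",
    "130000/131000/1302030012",
    "100000/110090/1100900800",
    "700000/700100/7001100100",
]


def split_code(code):
    """Split one code into its prefecture/city/ward part and its town part."""
    pref, city, town = code.split("/")
    return f"{pref}/{city}", town


def build_dictionaries(codes, disk_path):
    """Keep upper-level codes in memory; write town-level codes to a file."""
    city_dict = set()    # stands in for the dictionary in main storage
    town_codes = set()   # written to the file that stands in for the disk
    for code in codes:
        upper, lower = split_code(code)
        city_dict.add(upper)
        town_codes.add(lower)
    with open(disk_path, "w", encoding="utf-8") as f:
        json.dump(sorted(town_codes), f)
    return city_dict


def town_registered(disk_path, town_code):
    """Look up a town-level code; each call models one disk access."""
    with open(disk_path, encoding="utf-8") as f:
        return town_code in json.load(f)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "town_dict.json")
    cities = build_dictionaries(ADDRESS_CODES, path)
    print(sorted(cities))
    print(town_registered(path, "1360001111"))
```

The point of the arrangement is that most of the lookup work is done against `city_dict` without touching the file, so the slow medium is consulted only for the town-level part.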

[0097] The present invention is not limited to the above embodiments. Various modifications and applications are possible within the scope of the claims, and the invention is widely applicable to information services such as editing of address information and inquiries about place names and addresses.

[0098]

[Effects of the Invention] As described above, according to the word relation analysis device and the word relation analysis method of the present invention, for an arbitrary word string, the candidate meanings represented by the combination of words and the meaning of each word can be obtained from the bit-encoded related information of each word simply by performing logical operations, without specifying in which sense each word is used.
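As a rough illustration of resolving meanings by logical operations alone, the sketch below assigns each word a bitmask and combines the masks with a bitwise AND. It is a deliberately simplified assumption: one bit per prefecture, with word entries taken only from the sample addresses in the second embodiment. It does not reproduce the actual bit layout of the related information dictionary, and all names in the code are hypothetical.

```python
from functools import reduce
from operator import and_

# One bit per prefecture (an assumed, simplified encoding).
TOKYO, SHIZUOKA, OSAKA = 0b001, 0b010, 0b100
ALL_BITS = TOKYO | SHIZUOKA | OSAKA

# Hypothetical related information: word -> prefectures it can belong to,
# covering only the sample addresses used in this document.
RELATED_BITS = {
    "府中市": TOKYO,
    "三島市": SHIZUOKA,
    "本町": TOKYO | SHIZUOKA,
    "島之内": OSAKA,
}


def combine(words):
    """AND the bitmasks of all words; surviving bits are consistent readings."""
    return reduce(and_, (RELATED_BITS[w] for w in words), ALL_BITS)


def decode(bits):
    """Turn a bitmask back into prefecture names."""
    names = {TOKYO: "Tokyo", SHIZUOKA: "Shizuoka", OSAKA: "Osaka"}
    return [name for bit, name in names.items() if bits & bit]


if __name__ == "__main__":
    print(decode(combine(["本町"])))
    print(decode(combine(["府中市", "本町"])))
    print(decode(combine(["三島市", "島之内"])))
```

In this toy setting the ambiguous word "本町" is narrowed to a single prefecture once it is combined with "府中市", without the caller stating which sense was intended; a result of zero bits means the combination has no consistent interpretation.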

[0099] The address table contains about 400,000 entries when entries down to the aza level are included, and the word dictionary contains 160,000 words. When 10,000 sample inquiries were analyzed, the average number of word-segmentation solutions was 2, and, as shown in FIG. 20, 99% of the inquiries yielded 8 or fewer solutions. Since the average input length was 8 characters, about 200 candidate solutions would be output without the inclusion check, so the check narrows the candidates to about 1/100.

[0100] Therefore, according to the present invention, word relation analysis can be performed at high speed with simple logic. FIG. 21 is a diagram for explaining the effect of the second embodiment of the present invention. It shows simulation results under the condition that address language analysis and high-speed address analysis can use the same small amount of memory (high-speed medium). The vertical axis shows the number of dictionary file accesses, and the horizontal axis shows the number of input characters. The dotted line a shows the number of dictionary file accesses for address language analysis, and the solid line b shows the number of file accesses for high-speed address analysis. The two configurations are as follows:
a: Address language analysis. Only the next key index is registered in memory; the address dictionary is registered in a file on the disk device.
b: High-speed address analysis. Only the prefecture/city/ward address dictionary is registered in memory (corresponding to the upper basic categories); the town address dictionary is registered in a file on the disk device.

[0101] The longer the input character string corresponding to the word string to be analyzed, the more markedly the number of dictionary file accesses increases in address language analysis. In high-speed address analysis, however, the number of dictionary file accesses does not increase much even when the input string becomes long. Moreover, within the range of input lengths seen in address input, the number of dictionary file accesses is negligible, so addresses can be analyzed at a practical execution speed.

[0102] As described above, according to the present invention, for an arbitrary word string, the candidate meanings represented by the combination of words and the meaning of each word in that combination can be analyzed from the hierarchically coded related information of each word, with few resources, at high speed, and with a simple hierarchical inclusion check, even when the input does not specify in which sense each word is used.

[Brief description of drawings]

FIG. 1 is a first principle configuration diagram of the present invention.

FIG. 2 is a second principle configuration diagram of the present invention.

FIG. 3 is a diagram illustrating the first principle of the present invention.

FIG. 4 is a diagram illustrating the second principle of the present invention.

FIG. 5 is a block diagram of the word relation analysis device according to the first embodiment of the present invention.

FIG. 6 is a diagram showing an example of the related information dictionary according to the first embodiment of the present invention.

FIG. 7 is a flowchart showing the operation of the word relation analysis device according to the first embodiment of the present invention.

FIG. 8 is a flowchart showing processing specific to the application program according to the first embodiment of the present invention.

FIG. 9 is a data flow (part 1) of the word relation analysis according to the first embodiment of the present invention.

FIG. 10 is a data flow (part 2) of the word relation analysis according to the first embodiment of the present invention.

FIG. 11 is a data flow (part 3) of the word relation analysis according to the first embodiment of the present invention.

FIG. 12 is a data flow (part 4) of the word relation analysis according to the first embodiment of the present invention.

FIG. 13 is a data flow (part 5) of the word relation analysis according to the first embodiment of the present invention.

FIG. 14 is a diagram showing an example of the prefecture/city/ward-level address dictionary according to the second embodiment of the present invention.

FIG. 15 is a diagram showing an example of the town-level address dictionary according to the second embodiment of the present invention.

FIG. 16 is a block diagram of the word relation analysis device according to the second embodiment of the present invention.

FIG. 17 is a flowchart showing the word relation analysis method according to the second embodiment of the present invention.

FIG. 18 is a diagram showing a usage example of the second embodiment of the present invention.

FIG. 19 is a diagram showing a specific example of the second embodiment of the present invention.

FIG. 20 is a diagram for explaining the first effect of the present invention.

FIG. 21 is a diagram for explaining the second effect of the present invention.

FIG. 22 is a configuration diagram of a conventional word analysis system.

[Explanation of symbols]

1 Related information dictionary
2 Input/output unit
3 Control unit
4 Temporary storage unit
5 Related information storage device
6 Logical operation unit
7 Inclusion check unit
8 Dictionary generation unit
9 Main storage device
10 Disk device
11 Prefecture/city/ward address dictionary
12 Town address dictionary
100 Related information dictionary
200 Analysis means
210 Related information acquisition means
211 Masking means
212 Second analysis means
220 Related information output means

Continuation of front page (51) Int. Cl.⁶ Identification code Office reference number FI Technical display location 8420-5L 15/38 M

Claims

[Claims]

1. A word relation analysis device that analyzes the meaning of an input word and the meaning of a set of said words, comprising: a related information dictionary that stores information expressing the meaning of each word by indicating, for each meaning of a basic meaning category, the meaning of each dependent meaning category; and analysis means for analyzing, by referring to the related information dictionary, the meaning of a word having a meaning limited to a significant number and the meaning of a word string consisting of such words.

2. The word relation analysis device according to claim 1, wherein the related information dictionary holds the related information for each word as bit information.

3. The word relation analysis device according to claim 1, wherein the analysis means comprises: related information acquisition means for acquiring the meaning of a selected word by referring to the related information dictionary and recursively repeating a logical operation, using the bit-encoded related information, as many times as there are input words; and related information output means for, when the selected word is the final word of the input word string, performing a logical operation between the related information of the final word and that of all words before the final word, and outputting related information of the entire set of input words.

4. The word relation analysis device according to claim 3, wherein the related information acquisition means further comprises masking means for masking, when an input word includes a suffix indicating the meaning or content of one of the meaning categories, the related information of the other meaning categories based on the content of the suffix.

5. A word relation analysis device for analyzing the meaning of an input word and the meaning of a set of said words, comprising: a related information dictionary that stores information expressing the meaning of each word by indicating, for each basic meaning category, the meaning in a subordinate meaning category; and first analysis means for analyzing, by referring to the related information dictionary, the meaning of a word having a meaning limited to a significant number and the meaning of a word string consisting of such words.

6. The word relation analysis apparatus according to claim 5, wherein the related information dictionary holds the related information for each word as coded information.

7. The word relation analysis device according to claim 5, wherein the first analysis means comprises: related information acquisition means for referring to the related information dictionary for the subordinate meaning categories of words selected in the hierarchical order of the ordered basic meaning categories, interpreting, by using the related information, the subordinate meaning category within the range of the higher-ranked basic meaning category, and recursively repeating acquisition of the interpretation of the next basic meaning category within the range of the interpreted subordinate meaning category; and related information output means for outputting related information of the entire set of words.

8. The word relation analysis device according to claim 7, wherein the first analysis means further comprises second analysis means for, when an input word includes a suffix indicating the meaning or content of one of the meaning categories, determining the basic meaning category based on the content of the suffix and performing analysis within the range of that basic meaning category.

9. The word relation analysis device according to claim 5, further comprising: storage order determining means for determining a storage order, descending from the highest level to lower levels, based on the level of the meaning category of each word; first storing means for storing, in the order determined by the storage order determining means, words from the highest level down to a predetermined level in a first related information dictionary in first storage means having a high access speed; and second storing means for storing words at or below the predetermined level in a second related information dictionary in second storage means having an access speed slower than that of the first storage means.

10. The word relation analysis device according to claim 9, wherein the first storage means is a main storage device and the second storage means is a disk device.

11. A word relation analysis method for analyzing the meaning of an input word or the meaning of a set of said words, characterized by using related information that expresses the meaning of a word by a plurality of meaning categories to analyze the meaning of a word having a meaning limited to a significant number and the meaning of a word string composed of such words.

12. The word relation analysis method according to claim 11, wherein the related information of each word is held as bit information, a logical operation is performed on the bit information between words, the result of the logical operation is held for each word, and the logical operation is performed recursively on the bit information of all words.

13. The word relation analysis method according to claim 11, wherein related information constituted by bit information is given to each word of the word string; for the first word, a logical operation is performed, with the bit information of one meaning category as the reference, on the bit information of the other meaning categories; the logical operation is then recursively repeated in sequence between the result of the logical operation for the preceding word and the bit information of the next word; and a logical operation is performed between the result of the logical operation for the last word and the bit information of all words, whereby the meaning of each word in the word string is specified.

14. The word relation analysis method according to claim 11, wherein, when the input word includes a suffix for one of the meaning categories, the related information of the other meaning categories is masked based on the content of the suffix.

15. A word relation analysis method for analyzing the meaning of an input word or the meaning of a set of said words, characterized by: providing a related information dictionary that stores information expressing the meaning of each word by indicating, for each basic meaning category, the meaning in a subordinate meaning category; and analyzing, by referring to the related information dictionary, the meaning of a word having a meaning limited to a significant number and the meaning of a word string consisting of such words.

16. The word relation analysis method according to claim 15, wherein the analysis is performed by referring to the related information dictionary in which the related information for each word is held as coded information.

17. The word relation analysis method according to claim 15, wherein, when the analysis is performed, the related information dictionary is referred to for the subordinate meaning categories of words selected in the order of the ordered basic meaning categories; the subordinate meaning category is interpreted, by using the coded related information, within the range of the higher-ranked basic meaning category; acquisition of the interpretation of the next basic meaning category within the range of the interpreted subordinate meaning category is recursively repeated; and related information of the entire set of input words is output.

18. The word relation analysis method according to claim 17, wherein, when the basic meaning category is interpreted and the input word includes a suffix indicating the meaning or content of one of the meaning categories, the basic meaning category is determined based on the content of the suffix and analysis is performed within the range of that basic meaning category.

19. The word relation analysis method according to claim 11, wherein a storage order, descending from the highest level to lower levels, is determined based on the level of the meaning category of each word; words from the highest level down to a predetermined level in the determined order are stored in first storage means having a high access speed; and words at or below the predetermined level are stored in second storage means having an access speed slower than that of the first storage means.