JP2002092017A

JP2002092017A - Concept dictionary extending method and its device and recording medium with concept dictionary extending program recorded thereon

Info

Publication number: JP2002092017A
Application number: JP2000278108A
Authority: JP
Inventors: Toshiaki Makino; 俊朗牧野; Masayuki Sugizaki; 正之杉崎; Hiroto Inagaki; 博人稲垣
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2000-09-13
Filing date: 2000-09-13
Publication date: 2002-03-29
Anticipated expiration: 2020-09-13
Also published as: JP3614765B2

Abstract

PROBLEM TO BE SOLVED: To calculate attribute vectors for proper nouns and new words that do not exist in a concept dictionary, and to add those attribute vectors to the concept dictionary. SOLUTION: A relevance calculation unit 4 acquires user IDs (or terminal IDs), search words, and search times from the search log in a search log database 3, calculates the relevance between search words, and stores the resulting related word data in a related word database 5. A new word vector calculation unit 6 acquires, from the related word database 5, the related word data for each word read from a new word list 1. Based on that data, it calculates the attribute vector of the new word using the word attribute vectors in a concept dictionary 2 and a temporary storage vector database 7, and outputs the result to the temporary storage vector database 7 or a new word dictionary 8.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a concept dictionary extension device that adds new words to a concept dictionary used for semantic processing of natural-language sentences on a computer.

[0002]

[Prior Art] With the recent growth of the Internet and similar developments, large numbers of digitized documents now exist, and demand for searching and classifying them is increasing. Methods for searching and classifying digitized documents include one that uses document vectors based on word occurrence frequencies, and one that uses a concept dictionary, in which the meaning of each word is expressed as an attribute vector, and represents a document by the sum of the attribute vectors of the words appearing in it.

[0003] The method based on word occurrence frequencies has the advantage of requiring no dictionary, but because it depends only on surface notation, words written differently are treated as entirely different words, so it cannot express how close two words are in meaning. In contrast, the method using a concept dictionary can judge words whose attribute vectors are close to be similar words, and can therefore handle semantic similarity between words.
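The concept-dictionary approach described above can be illustrated with a minimal sketch. The dictionary contents and the use of cosine similarity as the closeness measure are assumptions made for illustration only; the text does not specify how closeness between attribute vectors is measured.

```python
import math

# Hypothetical toy concept dictionary: word -> attribute vector (values invented).
CONCEPT_DICT = {
    "telephone": [1.0, 0.0, 2.0],
    "phone": [1.0, 0.0, 1.0],
    "restaurant": [0.0, 3.0, 0.0],
}


def document_vector(words, concept_dict):
    """Represent a document as the sum of the attribute vectors of its words."""
    dim = len(next(iter(concept_dict.values())))
    total = [0.0] * dim
    for word in words:
        vec = concept_dict.get(word)
        if vec is None:
            continue  # words missing from the dictionary contribute nothing
        total = [a + b for a, b in zip(total, vec)]
    return total


def cosine(u, v):
    """Cosine similarity (an assumed closeness measure, not taken from the source)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)
```

In this toy data, "telephone" and "phone" are written differently but have nearby vectors, so a vector-based comparison can treat them as similar, whereas a purely notation-based comparison could not.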

[0004] One method of creating a concept dictionary uses an existing Japanese-language dictionary or the like: for each headword, the words appearing in the definition sentence that follows it are taken as attributes, the number of occurrences of each such word is taken as the value of that attribute, and the headword's attribute vector is defined accordingly.
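As a rough sketch of this dictionary-based construction, assuming the definition sentences have already been split into words (for Japanese this would require a morphological analyzer, which is outside the scope of this example), the attribute values can be obtained by simple counting. The function name and the sample entry are hypothetical.

```python
from collections import Counter


def build_concept_dictionary(definitions):
    """Build attribute vectors from dictionary definitions.

    definitions: mapping of headword -> list of words in its definition
    sentence (already tokenized; tokenization is assumed to be done elsewhere).
    Returns a mapping of headword -> Counter, where each key is an attribute
    (a word in the definition) and each value is its occurrence count.
    """
    return {head: Counter(tokens) for head, tokens in definitions.items()}


# Hypothetical example entry (invented for illustration).
sample = build_concept_dictionary(
    {"telephone": ["device", "transmit", "voice", "device"]}
)
```

A sparse mapping such as `Counter` is used here because most headwords would have a zero value for most attributes.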

[0005]

[Problems to Be Solved by the Invention] However, the method that uses dictionary definition sentences cannot define attribute vectors for words other than those listed in the dictionary. As a result, it cannot handle proper nouns and new words that appear on WWW pages and elsewhere on the Internet.

[0006] An object of the present invention is to provide a concept dictionary extension method and device, and a recording medium on which a concept dictionary extension program is recorded, that calculate attribute vectors for proper nouns and new words not present in a concept dictionary and add them to the concept dictionary, thereby extending it.

[0007]

[Means for Solving the Problems] The concept dictionary extension device of the present invention has a concept dictionary, a search log database, a related word database, a new word list, a temporary storage vector database, a new word dictionary, a relevance calculation unit, and a new word vector calculation unit.

[0008] The relevance calculation unit calculates the relevance between each pair of search words using information, obtained from the search log, on the time interval between a search user's uses of the two search words, and creates related word data containing the search words and their relevance.

[0009] The new word vector calculation unit reads one word from the new word list, which is the list of new words to which attribute vectors are to be added, and receives the related words for that word, together with their relevance, from the related word database.

[0010] Next, for those related words that already exist in the concept dictionary, it obtains their attribute vectors from the concept dictionary, weights them by relevance, and sums them to give the attribute vector of the new word. This is done for each word in the new word list, and the results are stored in the temporary storage vector database.

[0011] Next, the new word vector calculation unit again reads one word from the new word list and, as before, obtains its related words and their relevance from the related word database.

[0012] For related words that exist in the concept dictionary, the attribute vectors are obtained from the concept dictionary; for related words that exist in the temporary storage vector database, they are obtained from the temporary storage vector database. These vectors are weighted by relevance and summed to give a new attribute vector for the new word, which is recorded in the temporary storage vector database.

[0013] The operation of obtaining related word data and calculating new attribute vectors from the attribute vector data in the concept dictionary and the temporary storage vector database is repeated either a predetermined number of times or until the sum of the differences between the previous and the new attribute vectors falls below a predetermined threshold. The result finally obtained is output to the new word dictionary.

[0014] As described above, the concept dictionary can be extended by calculating the attribute vectors of new words from the search log and the concept dictionary and adding the resulting new word dictionary to the concept dictionary.

[0015]

[Embodiments of the Invention] Next, an embodiment of the present invention will be described with reference to the drawings.

[0016] Referring to FIG. 1, the concept dictionary extension device according to an embodiment of the present invention comprises a new word list 1, a concept dictionary 2, a search log database 3, a relevance calculation unit 4, a related word database 5, a new word vector calculation unit 6, a temporary storage vector database 7, and a new word dictionary 8.

[0017] The new word list 1 stores the list of words to which attribute vectors are to be added. The concept dictionary 2 is a dictionary in which the meaning of each word is expressed as an attribute vector. The search log database 3 stores the search log of a WWW search engine or of a database. The relevance calculation unit 4 acquires user IDs (or terminal IDs), search words, and search times from the search log in the search log database 3, calculates the relevance between search words, and outputs related word data containing the search words and their relevance to the related word database 5. The related word database 5 stores the related word data output by the relevance calculation unit 4. The new word vector calculation unit 6 acquires, from the related word database 5, the related word data for a word read from the new word list 1, calculates the attribute vector of the new word from that data using the attribute vectors of words in the concept dictionary 2 and the temporary storage vector database 7, and outputs the result to the temporary storage vector database 7 or the new word dictionary 8. The temporary storage vector database 7 temporarily stores the intermediate results of the new word attribute vectors calculated by the new word vector calculation unit 6. The new word dictionary 8 stores the final new word attribute vectors calculated by the new word vector calculation unit 6.

[0018] FIG. 2 shows an example of the word list in the new word list 1. The words to be newly added to the concept dictionary 2 are written one word per line.

[0019] Table 1 shows an example of the dictionary data in the concept dictionary 2. In the table, "telephone", "restaurant", "graph", and so on are words, and "A", "B", "C", ..., "ZZZ" are attribute names. A value is defined for each attribute of each word, so that each word is expressed as an attribute vector. This data is created and supplied in advance. The data in the temporary storage vector database 7 and the new word dictionary 8 have the same format.

[0020]

[Table 1]

[0021] Table 2 shows an example of the search log in the search log database 3. Each entry records a user ID identifying a user or terminal, a word entered by that user, and the time at which the word was entered.

[0022]

[Table 2]

[0023] In this example, the time is expressed as the number of seconds elapsed from a certain reference point. This log format is only an example; any format may be used as long as it contains the user ID, the search time, and the search word.
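Since the log format is left open, the following is only a sketch of how such a log might be read. It assumes a hypothetical layout of one entry per line with tab-separated fields in the order user ID, search word, time in seconds; neither this layout nor the names `LogEntry` and `parse_log` come from the source.

```python
from typing import NamedTuple


class LogEntry(NamedTuple):
    user_id: str  # user ID or terminal ID
    word: str     # search word entered
    time: int     # seconds elapsed since a reference point


def parse_log(lines):
    """Parse log lines in the assumed 'user_id<TAB>word<TAB>seconds' layout."""
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, word, seconds = line.split("\t")
        entries.append(LogEntry(user_id, word, int(seconds)))
    return entries


# Hypothetical sample data (invented for illustration).
sample_entries = parse_log(["u001\ttelephone\t120", "", "u001\trestaurant\t185"])
```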

[0024] FIG. 3 shows an example of the related word data in the related word database 5. Each entry lists two words and their relevance. The larger this value, the more closely the two words are related.

[0025] Next, the operation of the concept dictionary extension device will be described with reference to the flowchart shown in FIG. 4.

[0026] In step 101, the relevance calculation unit 4 reads the search log in the search log database 3, calculates the relevance, creates related word data containing the related words and their relevance, and stores it in the related word database 5. The relevance V_jk between search words w_j and w_k is obtained, for example, by the following equation.

[0027]

(Equation 1)

[0028]-[0031] Here, i denotes a user who has used both search words w_j and w_k, and the quantity shown as [External character 1] is given by the following equation.

(Equation 2)

[0032] Here, t_ij is the time at which user i used search word w_j.

[0033] The function f(x) is a function that returns a smaller value as x becomes larger.

[0034] The relevance between search words w_j and w_k becomes larger as the time interval between a user i's uses of w_j and w_k becomes smaller, and also becomes larger as the number of users who used both w_j and w_k becomes larger.
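The images for Equations 1 and 2 are not reproduced in this text. One form consistent with the definitions in paragraphs [0026] to [0034] is sketched below; the intermediate symbol u_{ijk} is an assumed name for the per-user term, and the absolute value is an assumption that makes the time interval independent of the order in which the two words were used.

```latex
V_{jk} = \sum_{i} u_{ijk},
\qquad
u_{ijk} = f\bigl(\lvert t_{ij} - t_{ik} \rvert\bigr)
```

With any decreasing f, for example f(x) = 1/(1+x) (an illustrative choice, not taken from the source), each user's contribution grows as the interval shrinks, and the sum over i grows with the number of users who used both words, which matches the two properties stated in [0034].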

[0035] Using the above method, the relevance is calculated for every combination of search words, related word data in the format shown in FIG. 3 is created, and the data is stored in the related word database 5.
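Step 101 can be sketched as follows. The decay function 1/(1+x), the use of the absolute time difference, and the rule of keeping only a user's earliest use of a repeated word are all assumptions for illustration; the source only requires that f decrease as the interval grows.

```python
from collections import defaultdict
from itertools import combinations


def decay(x):
    """Assumed example of f(x): smaller for larger time intervals x."""
    return 1.0 / (1.0 + x)


def compute_relevance(log, f=decay):
    """Compute relevance for every pair of search words.

    log: iterable of (user_id, word, time_in_seconds) tuples.
    Returns {(w_j, w_k): V_jk} with w_j < w_k in sort order.
    """
    # Per user, remember when each word was used.
    # Assumption: if a user repeats a word, keep the earliest time.
    times = defaultdict(dict)
    for user, word, t in log:
        if word not in times[user] or t < times[user][word]:
            times[user][word] = t

    relevance = defaultdict(float)
    for user_words in times.values():
        # Only users who used both words contribute to a pair.
        for wj, wk in combinations(sorted(user_words), 2):
            relevance[(wj, wk)] += f(abs(user_words[wj] - user_words[wk]))
    return dict(relevance)


# Hypothetical log data (invented for illustration).
demo_relevance = compute_relevance([
    ("u1", "a", 0), ("u1", "b", 1),
    ("u2", "a", 10), ("u2", "b", 13), ("u2", "c", 13),
])
```

Pairs never used together by any user simply do not appear in the result, which corresponds to having no entry in the related word data.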

[0036] In step 102, the new word vector calculation unit 6 takes a word t from the list L in the new word list 1.

[0037] In step 103, the new word vector calculation unit 6 acquires the related word data R for the word t from the related word database 5. The related word data R is a set of pairs (r_i, v_i), each consisting of a related word r_i of the word t and the relevance v_i of that related word to t. Here, i is the index of the related word.

[0038]-[0040] In step 104, for each related word r_i in the related word data R that exists in the concept dictionary 2, the new word vector calculation unit 6 obtains the attribute vector of r_i (shown as [External character 2]) from the concept dictionary 2.

[0041]-[0043] In step 105, for each related word in the related word data R that exists in the temporary storage vector database 7, the new word vector calculation unit 6 obtains its attribute vector (shown as [External character 3]) from the temporary storage vector database 7. In the initial state, the temporary storage vector database 7 contains no data.

[0044]-[0050] In step 106, assuming that words with high relevance are also closely related in meaning, the new word vector calculation unit 6 calculates the attribute vector of the word t (shown as [External character 5]) by the following equation, using the attribute vector data obtained in step 104 or 105 (shown as [External character 4]) and the relevance values v_i obtained in step 103.

(Equation 3)

[0051]-[0053] Here, the subscript 1 indicates that this is the result of the first calculation of the attribute vector of the word t. In general, the result of the n-th calculation of the attribute vector of the word t is denoted by [External character 6].
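The image for Equation 3 is not reproduced in this text. Based on the description in [0010] and step 106 (attribute vectors weighted by relevance and summed), one consistent form is sketched below. The vector notation \vec{a} is an assumed stand-in for the external characters, and no normalization term is shown because the text does not mention one.

```latex
\vec{a}_{t}^{(1)} = \sum_{i} v_i \, \vec{a}_{r_i},
\qquad
\vec{a}_{t}^{(n)} = \sum_{i} v_i \, \vec{a}_{r_i}^{(n-1)} \quad (n \ge 2)
```

Here the sum runs over the related words r_i for which an attribute vector is available. For a related word in the concept dictionary 2 the vector is fixed across iterations; for a related word that is itself a new word, the vector from the previous pass stored in the temporary storage vector database 7 is used.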

[0054] In step 107, it is determined whether any unprocessed word remains in the new word list L. If one remains, the process returns to step 102; if all words have been processed, the process proceeds to step 108. At this point, one round of attribute vector calculation has been completed for every word in the new word list L.
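Steps 102 to 107 amount to one pass over the new word list. A minimal sketch follows, assuming attribute vectors are fixed-length lists of floats and that the concept dictionary takes precedence when a word appears in both stores (the source does not address that case). All function and parameter names are hypothetical.

```python
def weighted_sum(related, lookup, dim):
    """Sum attribute vectors of related words, each weighted by its relevance.

    related: list of (related_word, relevance) pairs.
    lookup:  function returning a word's attribute vector, or None if unknown.
    """
    total = [0.0] * dim
    for word, relevance in related:
        vec = lookup(word)
        if vec is None:
            continue  # no vector available yet for this related word
        for idx, value in enumerate(vec):
            total[idx] += relevance * value
    return total


def one_pass(new_words, related_db, concept_dict, temp_db, dim):
    """Compute one round of attribute vectors for every word in the list."""
    def lookup(word):
        if word in concept_dict:   # step 104: concept dictionary
            return concept_dict[word]
        return temp_db.get(word)   # step 105: temporary storage vectors

    return {
        t: weighted_sum(related_db.get(t, []), lookup, dim)  # steps 103, 106
        for t in new_words                                   # steps 102, 107
    }


# Hypothetical data (invented for illustration).
demo_pass = one_pass(
    ["smartphone"],
    {"smartphone": [("phone", 0.5), ("mail", 0.25), ("tablet", 0.9)]},
    {"phone": [1.0, 0.0], "mail": [0.0, 2.0]},
    {},
    dim=2,
)
```

In the sample data, "tablet" has no vector in either store, so it is skipped on this pass even though its relevance is high.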

[0055] In step 108, the new word vector calculation unit 6 checks the termination condition. Possible termination conditions include whether a preset number of calculations has been reached, and whether the sum D of the differences between each word's attribute vector and its previous calculation result is smaller than a preset threshold. D is defined by the following equation.

[0056]

(Equation 4)

[0057] If the termination condition is satisfied, the process proceeds to step 110; if not, it proceeds to step 109.
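The termination check of step 108 might look like the following sketch. The image for Equation 4 is not reproduced in this text, so the use of the sum of absolute component differences (L1 distance) for D, the treatment of a missing previous vector as all zeros, and the default parameter values are assumptions.

```python
def total_difference(current, previous):
    """D: sum over all words of the difference from the previous result.

    Assumption: the difference between two vectors is measured as the sum
    of absolute component differences; a word with no previous vector is
    compared against a zero vector.
    """
    total = 0.0
    for word, vec in current.items():
        prev = previous.get(word, [0.0] * len(vec))
        total += sum(abs(a - b) for a, b in zip(vec, prev))
    return total


def should_stop(iteration, current, previous, max_iterations=10, threshold=1e-3):
    """Either termination condition from step 108 ends the loop."""
    if iteration >= max_iterations:
        return True
    return total_difference(current, previous) < threshold
```

Combining both conditions, as here, guards against the case where the difference never falls below the threshold.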

[0058] In step 109, the new word vector calculation unit 6 overwrites the temporary storage vector database 7 with the attribute vectors of the words calculated in this round, and returns to step 102.

[0059] In step 110, the new word vector calculation unit 6 writes the attribute vectors of the words calculated in this round to the new word dictionary 8.
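Putting steps 102 to 110 together, the overall loop can be sketched as below. This is an illustrative reading of the flowchart, not the patented implementation: the function name, the L1-style difference, the default limits, and the precedence of the concept dictionary over the temporary store are assumptions.

```python
def extend_dictionary(new_words, related_db, concept_dict, dim,
                      max_iterations=20, threshold=1e-6):
    """Iteratively compute attribute vectors for new words.

    related_db:   word -> list of (related_word, relevance) pairs.
    concept_dict: word -> attribute vector (list of floats of length dim).
    Returns the mapping to be written to the new word dictionary.
    """
    temp_db = {}   # temporary storage vectors; empty in the initial state
    current = {}
    for _ in range(max_iterations):
        current = {}
        for t in new_words:                          # steps 102 and 107
            vec = [0.0] * dim
            for r, v in related_db.get(t, []):       # step 103
                src = concept_dict.get(r)            # step 104
                if src is None:
                    src = temp_db.get(r)             # step 105
                if src is None:
                    continue
                vec = [a + v * b for a, b in zip(vec, src)]  # step 106
            current[t] = vec
        # Step 108: assumed L1-style sum of differences from the last pass.
        diff = sum(
            abs(a - b)
            for t in new_words
            for a, b in zip(current[t], temp_db.get(t, [0.0] * dim))
        )
        if diff < threshold:
            break
        temp_db = current                            # step 109
    return current                                   # step 110


# Hypothetical data: "y" is related only to another new word, "x".
demo_result = extend_dictionary(
    ["x", "y"],
    {"x": [("phone", 0.5)], "y": [("x", 0.5)]},
    {"phone": [1.0, 0.0]},
    dim=2,
)
```

The sample shows why the iteration matters: "y" gets a zero vector on the first pass because "x" has no vector yet, and only acquires a meaningful vector on the second pass. When new words refer to each other with large relevance values the vectors may keep growing, so the iteration cap is kept alongside the threshold.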

[0060] According to this embodiment, the attribute vectors of new words can be calculated automatically simply by preparing the existing concept dictionary 2 and a search log.

[0061] The processing of FIG. 4 described above can be recorded as a concept dictionary extension program on a recording medium such as a floppy disk, CD-ROM, or magneto-optical disk, and executed on a computer such as a personal computer.

[0062]

[Effects of the Invention] As described above, the present invention acquires, from the search log of an Internet search engine or a database, the search words, the times at which they were used, and the ID information of the users or terminals that used them. From these it calculates a relevance value expressing how closely search words are related, and then automatically calculates the attribute vectors of new words using this relevance and the attribute vectors of words defined in the concept dictionary. This has the effect that a concept dictionary covering new words and proper nouns can be built easily.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a concept dictionary extension device according to an embodiment of the present invention.

FIG. 2 is part of an example of the word list in the new word list 1 shown in FIG. 1.

FIG. 3 is part of an example of the related word data generated by the relevance calculation unit 4 shown in FIG. 1 and stored in the related word database 5.

FIG. 4 is a flowchart showing the operation of the concept dictionary extension device of FIG. 1.

[Description of Reference Numerals] 1: new word list; 2: concept dictionary; 3: search log database; 4: relevance calculation unit; 5: related word database; 6: new word vector calculation unit; 7: temporary storage vector database; 8: new word dictionary; 101 to 110: steps

Continuation of front page: (72) Inventor: Hiroto Inagaki, c/o Nippon Telegraph and Telephone Corporation, 2-3-1 Otemachi, Chiyoda-ku, Tokyo. F-term (reference): 5B075 ND03 NK46 NR12 PR03 PR10 QM07 UU01

Claims

1. A concept dictionary extension method comprising: a first step of calculating the relevance between each pair of search words using information, obtained from a search log, on the time interval between a search user's uses of the two search words, creating related word data including the search words and their relevance, and storing the related word data in a related word database; a second step of, for every word in a new word list, reading one word from the new word list, acquiring related words for that word together with their relevance from the related word database, acquiring from a concept dictionary the attribute vectors of those acquired related words that already exist in the concept dictionary, acquiring from a temporary storage vector database the attribute vectors of those acquired related words that exist in the temporary storage vector database, and calculating the attribute vector of the word by weighting the acquired attribute vectors by the relevance and summing them; a third step of determining whether a predetermined termination condition is satisfied; a fourth step of, when the predetermined termination condition is not satisfied, rewriting the temporary storage vector database with the attribute vectors of the words calculated this time and returning to the second step; and a fifth step of, when the predetermined termination condition is satisfied, writing the attribute vectors of the words calculated this time to a new word dictionary.

2. The concept dictionary extension method according to claim 1, wherein the termination condition is whether the number of calculations has reached a preset number.

3. The concept dictionary extension method according to claim 1, wherein the termination condition is whether the sum of the differences between the attribute vector of each word and its previous calculation result is smaller than a preset threshold.

4. A concept dictionary extension device comprising: a concept dictionary in which the meanings of words are expressed as vectors; a search log database that holds a search log recording users' searches; a related word database that temporarily stores related word data including search words and their relevance; a new word list, which is a list of new words to which attribute vectors are to be newly added; a temporary storage vector database that temporarily stores attribute vectors; a new word dictionary that holds the final calculated attribute vectors of the new words; a relevance calculation unit that reads the search log in the search log database, calculates the relevance between each pair of search words using information, obtained from the search log, on the time interval between a search user's uses of the two search words, and stores related word data including the search words and their relevance in the related word database; and a new word vector calculation unit that, for each word in the new word list, acquires related words for that word together with their relevance from the related word database, acquires from the concept dictionary the attribute vectors of those acquired related words that already exist in the concept dictionary, acquires from the temporary storage vector database the attribute vectors of those acquired related words that exist in the temporary storage vector database, and calculates the attribute vector of the word by weighting the acquired attribute vectors by the relevance and summing them, and that then determines whether a predetermined termination condition is satisfied, rewrites the temporary storage vector database with the attribute vectors of the words calculated this time and returns to the calculation of the attribute vectors when the condition is not satisfied, and writes the attribute vectors of the words calculated this time to the new word dictionary when the condition is satisfied.

5. A recording medium on which is recorded a concept dictionary extension program for causing a computer to execute: a first process of calculating the relevance between each pair of search words using information, obtained from a search log, on the time interval between a search user's uses of the two search words, creating related word data including the search words and their relevance, and storing the related word data in a related word database; a second process of, for every word in a new word list, reading one word from the new word list, acquiring related words for that word together with their relevance from the related word database, acquiring from a concept dictionary the attribute vectors of those acquired related words that already exist in the concept dictionary, acquiring from a temporary storage vector database the attribute vectors of those acquired related words that exist in the temporary storage vector database, and calculating the attribute vector of the word by weighting the acquired attribute vectors by the relevance and summing them; a third process of determining whether a predetermined termination condition is satisfied; a fourth process of, when the predetermined termination condition is not satisfied, rewriting the temporary storage vector database with the attribute vectors of the words calculated this time and returning to the second process; and a fifth process of, when the predetermined termination condition is satisfied, writing the attribute vectors of the words calculated this time to a new word dictionary.