JP3472032B2

JP3472032B2 - Information filter device and information filter method

Info

Publication number: JP3472032B2
Application number: JP10265596A
Authority: JP
Inventors: 信宏下郡
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1995-04-24
Filing date: 1996-04-24
Publication date: 2003-12-02
Anticipated expiration: 2016-04-24
Also published as: JPH0916627A

Description

Detailed Description of the Invention

[0001]

[Technical Field of the Invention] The present invention relates to an information filter device and an information filter method that determine, on behalf of a user, whether an acquired document is worth reading, and that present to the user only those documents judged worth reading.

[0002]

[Prior Art] In recent years, as information devices and information storage media have grown in capacity and fallen in price, an enormous amount of electronic information has come to be distributed on various media. As the volume of information grows in this way, a user cannot read all of it and must select the information that is worth reading. However, selecting only the necessary information from a large volume has itself become difficult given the limited ability and time of ordinary users.

[0003] Accordingly, various information filters have been proposed that evaluate the content of newly arrived documents in advance on the user's behalf and present to the user only those considered worth reading. Such conventional information filters select or evaluate documents using keywords related to the document content. Known types include (1) those in which rules such as search expressions are registered in advance and (2) those that use a neural network.

[0004] However, with a conventional rule-description information filter, the user must prepare rules, such as search expressions, written explicitly in terms of keywords. This is inconvenient because, whenever the user's interests change, the filter cannot follow unless the user redefines the rules. Moreover, writing rules that adequately satisfy the user's requirements takes a certain amount of experience and trial and error, so appropriate rules cannot be written easily.

[0005] With a conventional information filter using a neural network, the neural network learns the user's preferences and selects information accordingly, which has the advantage that the user needs neither the effort nor the experience of writing rules. On the other hand, the neural network requires an enormous amount of computation to learn the user's preferences, and if the computation is curtailed, a sufficient learning effect cannot be obtained.

[0006] As described above, among conventional information filters that select the information a user needs from a large volume of information and present it to that user, the rule-description type requires the user to write rules explicitly and to redefine them whenever the user's interests change, which makes it inconvenient to use.

[0007] Meanwhile, a conventional information filter using a neural network requires an enormous amount of computation to learn the user's preferences. If the computation is curtailed, a sufficient learning effect is not obtained and the necessary information cannot be selected accurately.

[0008]

[Problems to Be Solved by the Invention] As described above, among conventional information filters that select the information a user needs from a large volume of information and present it to that user, the rule-description type is inconvenient: the user must write the rules explicitly and must redefine them whenever the user's interests change.

[0009] Meanwhile, a conventional information filter using a neural network requires an enormous amount of computation to learn the user's preferences, and there was the problem that, if the computation is curtailed, a sufficient learning effect is not obtained and the necessary information cannot be selected accurately.

[0010] Thus neither type of information filter is easy to use. Both need substantial improvement and are of little practical use. The development of a more practical information filter is therefore eagerly awaited.

[0011] Accordingly, an object of the present invention is to provide a highly practical information filter device and information filter method that, in selecting the information a user needs from a large volume of information, can reflect changes in the user's preferences without unnecessary effort, achieve good estimation accuracy, and perform inference and learning at high speed with a small amount of computation.

[0012]

[Means for Solving the Problems] The present invention provides a highly practical information filter device for selecting the information a user needs from a large volume of information, and in particular aims to provide an information filter with the following properties.

[0013] (1) Changes in the user's preferences can be reflected without unnecessary effort. (2) Estimation accuracy is good. (3) Inference and learning can be performed at high speed with a small amount of computation.

To achieve the above object, the present invention is configured as follows. It comprises: keyword extraction means for extracting a plurality of keywords; learning means that generates keyword combinations each having one or more of the plurality of keywords as elements, determines whether each keyword included in a keyword combination appears in a document to be learned, assigns to the keyword combination a first identifier if the keyword appears and a second identifier if it does not, counts the first identifiers, generates a spectrum by assigning to the keyword combination a third identifier if the count is even and a fourth identifier if it is odd, calculates a coefficient for the keyword combination based on the spectrum and on an evaluation value associated with the document, and learns by storing the coefficient in association with the keyword combination; and inference means for inferring the importance of a document to be inferred, based on the coefficients and the spectrum for those keyword combinations that are present in that document.
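
The spectrum construction described in this paragraph can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `max_size` limit on combination size, and the use of +1 and -1 for the third and fourth identifiers are all choices made here for the example.

```python
from itertools import combinations


def keyword_combinations(keywords, max_size):
    """Yield every combination of 1..max_size keywords (max_size is an assumed limit)."""
    for size in range(1, max_size + 1):
        yield from combinations(keywords, size)


def spectrum_value(combination, document_keywords):
    """Count the combination's keywords that appear in the document and
    return +1 for an even count, -1 for an odd count (assumed encoding)."""
    present = sum(1 for keyword in combination if keyword in document_keywords)
    return 1 if present % 2 == 0 else -1


def spectrum(keywords, document_keywords, max_size=2):
    """Map each keyword combination to its spectrum value for one document."""
    return {
        combo: spectrum_value(combo, document_keywords)
        for combo in keyword_combinations(keywords, max_size)
    }
```

Limiting the combination size keeps the number of stored coefficients small, which is one plausible way to obtain the low computation cost the paragraph aims for.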

[0014]

[0015]

[0016] Preferably, the learning means further comprises keyword addition means that, when a new keyword other than the keywords extracted by the keyword extraction means is extracted, adds the new keyword to the past learning results of the learning means.

[0017] Preferably, the learning means learns, during the learning, only over the range affected by the keywords extracted from the document. The inference means holds in advance the importance that a document to be inferred would have if none of the keywords extracted by the extraction means were present among the keywords extracted from it. When inferring the importance of an input document, if keywords extracted by the extraction means are present among the keywords extracted from the document, the inference means obtains the value by which those keywords change the held importance and corrects the held importance based on this value, thereby obtaining the importance of the input document.

[0018]

[0019] Preferably, when the proportion of the extracted keywords among all keywords extracted from the document is less than a predetermined value, the document is displayed to the user without inference by the inference means.
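
The ratio check in this paragraph might look like the sketch below. The function name, the parameter names, and the handling of a document with no extracted keywords are assumptions made for illustration. The source states only that inference is skipped when the proportion falls below a predetermined value.

```python
def should_skip_inference(extracted_keywords, known_keywords, min_ratio):
    """Return True when too few of the document's keywords are known,
    so the document is shown to the user without inference."""
    if not extracted_keywords:
        # Assumption: a document with no keywords gives the filter nothing to judge.
        return True
    known = set(known_keywords)
    hits = sum(1 for keyword in extracted_keywords if keyword in known)
    return hits / len(extracted_keywords) < min_ratio
```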

[0020] In the present invention, keywords are extracted from an input document, and the importance of the input document is inferred based on two things: information, stored in storage means, on the relationship between predetermined keyword combinations and evaluation values concerning document importance; and predetermined keywords (for example, keywords registered in a table) among the keywords extracted from the input document. Documents judged important by the inference means are then presented to the user.

[0021] This inference is performed based on spectrum theory (fast spectrum theory). The present invention also concerns learning in an information filter that infers the importance of a document from keywords extracted from the input document and displays the document when it is judged important. The information held in the storage means for use in inference, namely the relationship between predetermined keyword combinations and evaluation values concerning document importance, is learned using predetermined keywords (for example, keywords registered in a table) among the keywords extracted from the document to be learned, together with the evaluation value obtained. This learning is performed based on spectrum theory.

[0022] According to the present invention, inference and learning are based solely on the relationship between the keyword combinations extracted from the document under evaluation and the evaluation value of document importance determined by those combinations. An information filter device can therefore be obtained that maintains good estimation accuracy while reducing the amount of computation and performing judgment and learning at high speed. Consequently, the user no longer needs to read documents that are clearly of no interest.

[0023] Because the present invention can be distributed, for example, by storing it on a portable storage medium as application software that a computer can read and execute, the following forms also fall within the scope of the invention.

[0024] [1] A computer-readable storage medium, used together with an information display device such as a computer, storing a processing program that displays predetermined documents by filtering information from input documents, the processing program comprising: program code means for extracting keywords from an input document; program code means for storing the relationship between predetermined keyword combinations and evaluation values given to documents; and program code means for inferring the importance of the input document based on the stored relationship and on predetermined keywords among the keywords extracted from the input document.

[0025] [2] The readable storage medium according to [1] above, storing a processing program that further comprises: program code means for judging whether the importance obtained by the inference satisfies a predetermined condition; and program code means for displaying to the user predetermined information about the input document when the judgment finds that the predetermined condition is satisfied.

[0026] [3] The readable storage medium according to [2] above, storing a processing program that further comprises: program code means for inputting an evaluation value for the user based on the displayed predetermined information; and program code means for learning information on the relationship between the keyword combinations and the evaluation values, based on the keywords extracted from the document and on the input evaluation value.

[0027] [4] The readable storage medium according to [3] above, storing a processing program in which the program code means for learning further includes program code means for adding a new keyword when a new keyword other than the predetermined keywords is extracted.

[0028] [5] The readable storage medium according to [3] above, storing a processing program in which: the program code means for learning includes program code means that, when learning the relationship between the keyword combinations and the evaluation values, learns over the range affected by the keywords extracted from the document; and the program code means for inferring includes program code means that holds in advance the importance a document would have if none of the predetermined keywords were present among the keywords extracted from it and that, when inferring the importance of the input document, if a predetermined keyword is present among the keywords extracted from the document, obtains the value by which that keyword changes the held importance and corrects the held importance based on this value, thereby obtaining the importance of the input document.

[0029]

[Embodiments of the Invention] Specific examples of the present invention are described below with reference to the drawings.

[0030] (First Specific Example) FIG. 1 shows the configuration of an information filter device according to a first specific example of the present invention. The information filter device of the first specific example comprises a document input unit 1, a document storage unit 2, a keyword extraction unit 3, a data storage unit 4, an inference unit 5, a display unit 6, an evaluation data input unit 7, and a learning unit 8.

[0031] The document input unit 1 inputs electronic document data (hereinafter simply "documents") from outside. Depending on how documents are delivered, any suitable device can be used, such as a network connection device, a wireless receiver, a magnetic disk or tape reader, or a CD-ROM reader.

[0032] The document storage unit 2 temporarily stores documents input from outside. Any suitable device can be used, such as a magnetic disk device, a magnetic tape device, an optical disk device, or semiconductor memory.

[0033] The keyword extraction unit 3 extracts predetermined keywords from a new document temporarily stored in the document storage unit 2. Which keywords are targeted for extraction is decided as follows.

[0034] Documents may be sent either with keywords attached in advance or without them, so keyword extraction may use, for example, the following methods.

[0035] When no keywords are attached to the document in advance, (1) keywords are extracted from the document using a known keyword extraction means (for example, the keyword extraction means disclosed in Iwao Ishikawa et al., "Subject index creation support system based on document analysis processing", Transactions of the Information Processing Society of Japan, Vol. 132, 1991).

[0036] For English text, word stems are extracted.

[0037] When keywords are attached to the document in advance, one of the following is used.

[0038] [1] Use the keyword extraction means of (1) above. [2] Take out the keywords attached to the document and use them as the keywords.

[0039] [3] Use both [1] and [2] above together.

[0040] In this way, the required keywords are determined, and keywords matching them are searched for and extracted from the document.

[0041] The data storage unit 4 stores and holds data: the coefficients (αs, described later) used in inference based on spectrum theory, which is detailed later; a keyword table such as that shown in FIG. 2, used to create input vectors and case vectors; and/or the learning results of the learning unit 8.

[0042] A keyword table such as that shown in FIG. 2 registers words (keywords) such as "word processor", "dictionary", "induction", "learning", "information", "filter", "optics", and so on, each together with a keyword number. Specifically, for example, the word "word processor" is registered with keyword number "1", the word "dictionary" with keyword number "2", and the word "induction" with keyword number "3".
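
A keyword table of this kind can be pictured as a simple mapping from keyword to keyword number. The sketch below is illustrative only. It contains just the three number assignments stated in the text, and the helper names and the rule of assigning the next unused number to a new keyword are assumptions, not details taken from FIG. 2.

```python
# Only the three assignments given in the text are included here.
keyword_table = {"word processor": 1, "dictionary": 2, "induction": 3}


def keyword_number(table, keyword):
    """Return the registered keyword number, or None if the keyword is not registered."""
    return table.get(keyword)


def add_keyword(table, keyword):
    """Register a new keyword under the next unused number (assumed numbering rule)."""
    if keyword not in table:
        table[keyword] = max(table.values(), default=0) + 1
    return table[keyword]
```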

[0043] The inference unit 5 receives from the keyword extraction unit 3 the group of keywords extracted from the input document, uses the keyword table stored in the data storage unit 4 to obtain the keyword numbers of the received keywords, generates an input vector, and then performs inference based on spectrum theory using the current coefficients αs stored in the data storage unit 4. The output of this inference is information indicating whether the document should be presented to the user. For example, "1" is output when the document is judged to be one that should be presented, and "-1" otherwise.

[0044] The display unit 6 has a control function unit 6a and an output unit 6b. The control function unit 6a serves to present to the user the content of documents that the inference unit 5 has judged should be presented: it reads the document designated by the inference unit 5 from the document storage unit 2 and controls its output. The output unit 6b displays or prints the document under the output control of the control function unit 6a. Examples include a display device, a printer, and an audio output device.

[0045] The evaluation data input unit 7 inputs evaluation data for a presented document. Evaluation data is, for example, information indicating whether the document is worth reading. The evaluation value "1" may be entered directly when the document was worth reading and "-1" when it was not. Alternatively, information in another form (for example "O (circle), × (cross)" or "true, false") may be entered by key or by selection and converted to the evaluation value "1" or "-1" inside the evaluation data input unit 7.
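
The conversion performed inside the evaluation data input unit can be sketched as below. The accepted spellings and the function name are assumptions chosen to cover the forms mentioned in the text (1 and -1, circle and cross, true and false). An actual device could accept different inputs.

```python
_POSITIVE = {"1", "+1", "o", "true"}
_NEGATIVE = {"-1", "\u22121", "x", "\u00d7", "false"}  # includes the minus sign and the cross mark


def to_evaluation_value(raw):
    """Normalize a user's rating to the evaluation value +1 or -1."""
    token = str(raw).strip().lower()
    if token in _POSITIVE:
        return 1
    if token in _NEGATIVE:
        return -1
    raise ValueError(f"unrecognized evaluation input: {raw!r}")
```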

[0046] The learning unit 8 learns the coefficients αs based on the evaluation value ("+1" or "-1") obtained from the input to the evaluation data input unit 7.

[0047] The operation of the information filter device of this specific example is outlined below. It divides broadly into two processes: an inference process, which covers the actual information filtering and the selective presentation of its results, and a learning process for obtaining good inference results.

[0048] In the inference process, a document newly input from the document input unit 1 is temporarily stored in the document storage unit 2, and the keyword extraction unit 3 extracts keywords from it. The inference unit 5 receives the group of keywords extracted for the document, uses a keyword table such as that shown in FIG. 2 to obtain the keyword numbers of the keywords that appeared, generates an input vector, and performs inference by spectrum theory using the current coefficients αs.

[0049] When the inference unit 5 judges, as a result of inference by spectrum theory, that the document should be presented to the user, it instructs the display unit 6 to display the document. On receiving this instruction, the display unit 6 reads the content of that document from the document storage unit 2, outputs it to the output unit 6b, and presents it to the user. The user then reads the document output from the output unit 6b.

[0050] The learning process is performed as follows.

[0051] The user enters, through the evaluation data input unit 7, an evaluation of the document presented on the display unit 6 by the inference process. The evaluation data is passed to the learning unit 8, which learns the coefficients αs from the given evaluation value based on spectrum theory.

[0052] This learning process takes place both during initial learning based on teaching data given in advance and during the learning that accompanies the inference process. That is, to obtain good prediction results, teaching data is supplied and the coefficients αs, described later, are calculated. In addition, when the user evaluates a document presented in the inference process, the coefficients αs are calculated further and learning progresses.

[0053] The inference and learning based on spectrum theory performed by the inference unit 5 and the learning unit 8 are described next. Spectrum theory is described in detail in, for example, Nathan Linial et al., "Constant Depth Circuits, Fourier Transform, and Learnability", Journal of the Association for Computing Machinery, Vol. 40, No. 3, July 1993, pp. 607-620.

[0054] In spectrum theory, pairs of an input (problem) and an output (correct answer) are supplied to learn the parameters of an evaluation function. When a new problem that has never been input before is given, the correct answer is inferred using the parameters at that point. Naturally, the correct answer can also be obtained when a previously input problem is given.

[0055] When such spectrum theory is applied to an information filter device, a set of keywords, or a combination of its elements, is taken as the input (problem) and the corresponding evaluation value as the output (correct answer). The combination of keywords extracted from a newly arrived document is then given as input, and the corresponding evaluation value is inferred.

[0056] That is, in the inference and learning of the present invention, multiple cases whose content and evaluation are known in advance are supplied so that the system learns beforehand what output (evaluation value) results from what input. After learning is complete, the system predicts what evaluation value the answer to a given input will take. The present invention uses spectrum theory for this inference and learning.

[0057] Spectrum theory is now described a little more concretely. For example, learning is performed by supplying cases such as the following, each consisting of an input vector (input) and an output (evaluation value).

[0058]
Input (0,1,1,1,1), output "+1"
Input (1,0,1,1,1), output "-1"
Input (1,1,0,1,1), output "+1"
Input (1,1,1,0,1), output "-1"
Input (1,1,1,1,0), output "+1"

That is, the spectrum-theory technique learns what output (evaluation value) is obtained for a given input vector and, once that learning is complete, performs inference that predicts, for example, whether the answer for the input vector (1,1,1,1,1) is "+1" or "-1".
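
One way to picture this kind of learning and inference is the low-degree Fourier approach from the Linial et al. paper cited in [0053]: estimate one coefficient per small subset of input positions as the average of the evaluation value times the subset's parity, then predict from the sign of the weighted sum. The sketch below is an assumption-laden illustration, not the procedure claimed in this patent. The function names, the `max_degree` cutoff, and the tie-breaking rule at a score of zero are all choices made here.

```python
from itertools import combinations

# The five learning cases listed in [0058]: (input vector, evaluation value).
CASES = [
    ((0, 1, 1, 1, 1), +1),
    ((1, 0, 1, 1, 1), -1),
    ((1, 1, 0, 1, 1), +1),
    ((1, 1, 1, 0, 1), -1),
    ((1, 1, 1, 1, 0), +1),
]


def parity(subset, x):
    """+1 if an even number of the selected positions are 1, otherwise -1."""
    ones = sum(x[i] for i in subset)
    return 1 if ones % 2 == 0 else -1


def learn(cases, max_degree):
    """Estimate a coefficient for every subset of at most max_degree positions."""
    n = len(cases[0][0])
    coefficients = {}
    for degree in range(max_degree + 1):
        for subset in combinations(range(n), degree):
            total = sum(y * parity(subset, x) for x, y in cases)
            coefficients[subset] = total / len(cases)
    return coefficients


def infer(coefficients, x):
    """Predict +1 or -1 from the sign of the weighted parity sum (ties go to +1)."""
    score = sum(a * parity(subset, x) for subset, a in coefficients.items())
    return 1 if score >= 0 else -1
```

Calling `infer(learn(CASES, max_degree=1), (1, 1, 1, 1, 1))` then returns the sketch's prediction for the unseen input discussed above. With so few cases the result mainly illustrates the mechanics.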

【００５９】このようなスペクトル理論の手法を情報フ
ィルタに応用するために、本発明システムにおける推論
部５では、推論を行おうとする対象である文書からキー
ワード抽出部３によって抽出されたキーワード（抽出キ
ーワード）について統一的な順番を付けて並べる。この
統一的な順番とは、例えば、キーワードテーブルのキー
ワードの並び順を意味するが、この並び順をユーザが適
宜編集して、変更できるようにしても良い。そして、抽
出したキーワードを順に、興味を引く対象として登録さ
れた語句（キーワード）に対応するキーワードであるか
図２に示すようなキーワードテーブルを参照して調べ、
登録されたものに該当していれば“１”に、該当してい
なければ“０”に置き換えることによって入力ベクトル
を作成する。具体的には、抽出キーワード群を順に１つ
づつ、登録キーワードと比較して登録キーワードに一致
するものがあれば“１”と置き、一致するものがなけれ
ば“０”と置くことで、入力ベクトルを生成する。In order to apply this spectrum theory method to an information filter, the inference unit 5 of the present system arranges the keywords extracted by the keyword extraction unit 3 from the document under inference (the extracted keywords) in a uniform order. This uniform order is, for example, the order of the keywords in the keyword table, although the user may be allowed to edit and change it as appropriate. Each extracted keyword is then checked in turn against a keyword table such as the one shown in FIG. 2 to see whether it corresponds to a word or phrase (keyword) registered as an object of interest. An input vector is created by substituting "1" if it matches a registered keyword and "0" if it does not. Specifically, the extracted keywords are compared with the registered keywords one at a time; "1" is placed where a match exists and "0" where none exists, which yields the input vector.

【００６０】また、学習にあたっては、本具体例におい
ては、利用者が興味を持つ内容の文書の場合は出力（評
価値）を“１”、興味のない内容の文書の場合は出力
（評価値）を“０”とし、これを用いて入力・出力の組
（入力ベクトルと評価値の組）による事例を構成し、幾
つかの事例を学習文書として与えることによって、推論
に用いるパラメータの学習を行う。For learning, in this specific example, the output (evaluation value) is set to "1" for a document whose content interests the user and to "0" for a document whose content does not. Cases are formed from these input/output pairs (pairs of an input vector and an evaluation value), and several such cases are given as learning documents in order to learn the parameters used for inference.

【００６１】例えば、利用者の興味のある分野のキーワ
ードとして、図２のようなキーワードが予めシステムに
登録されており、学習のための事例文書に「“情報”、
“フィルタ”、“学習”、“利用者”、“アルゴリズ
ム”」の５つのキーワードが、「“…”，“…”，
“…”，“情報”，“フィルタ”，“学習”，“…”，
“利用者”，“…”，“…”，“…”，“アルゴリズ
ム”，“…”，“…”，“…”」（但し、“…”はキー
ワードテーブルにキーワードとして登録されていない語
句であるが、キーワード抽出部３が抽出キーワードとし
て抽出したものを示す）のように他の語句に混って出現
していたとすると、この場合、この文書に対して推論部
５にて生成される入力ベクトルは、（０，０，０，１，１，１，０，１，０，０，０，１，
０，０，０）となる。For example, suppose that keywords such as those in FIG. 2 are registered in the system in advance as keywords of the user's field of interest, and that in a case document for learning the five keywords "information", "filter", "learning", "user", and "algorithm" appear mixed with other words in the sequence "...", "...", "...", "information", "filter", "learning", "...", "user", "...", "...", "...", "algorithm", "...", "...", "..." (where "..." denotes a word that is not registered in the keyword table but that the keyword extraction unit 3 extracted as a keyword). In this case, the input vector generated by the inference unit 5 for this document is (0,0,0,1,1,1,0,1,0,0,0,1,0,0,0).
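The marking step described in this paragraph can be sketched in Python as follows. This is only an illustrative sketch: the function name `mark_registered` is hypothetical, the contents of the FIG. 2 keyword table are assumed to be the five keywords named in the paragraph, and the string "other" stands in for each unregistered word shown as "..." in the text.

```python
def mark_registered(extracted, registered):
    """Return 1 for each extracted keyword that is registered, else 0.

    The output has one element per extracted keyword, in document order.
    """
    registered = set(registered)
    return [1 if kw in registered else 0 for kw in extracted]


# Assumed keyword table contents (the five keywords named in the paragraph).
registered = ["information", "filter", "learning", "user", "algorithm"]

# "other" is a placeholder for each unregistered word ("..." in the text).
extracted = [
    "other", "other", "other", "information", "filter", "learning", "other",
    "user", "other", "other", "other", "algorithm", "other", "other", "other",
]

vector = mark_registered(extracted, registered)
print(vector)
```

Under these assumptions the printed list has the same fifteen elements as the input vector given in the paragraph.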

【００６２】そして、学習を行う場合に、利用者が、こ
の提示された文書を読んでみて、読むに値するか否かを
判断して、利用者が評価データ入力部７よりその旨の評
価を入力する。評価データ入力部７から、その評価対応
に評価値“＋１”（読むに値する）、評価値“−１”
（読むに値しない）が、学習部８に与えられることによ
って、学習部８はこの与えられた評価値“＋１”、“−
１”に基づいて係数αｓを計算し、この係数αｓに基づ
いて学習が行われる。When learning is performed, the user reads the presented document, judges whether it was worth reading, and enters that evaluation through the evaluation data input unit 7. The evaluation data input unit 7 passes the corresponding evaluation value, "+1" (worth reading) or "-1" (not worth reading), to the learning unit 8, which calculates the coefficients αs from the given evaluation value; learning proceeds on the basis of these coefficients αs.

【００６３】このようにして利用者が、表示された文書
を読んだ結果、読むに値するか否かの評価をデータ入力
部７で入力することで学習部８は当該データ入力部７か
ら評価対応に出力される評価値（“＋１”又は“−
１”）に基づいて係数αｓの学習を行う。In this way, when the user reads the displayed document and enters through the data input unit 7 an evaluation of whether it was worth reading, the learning unit 8 learns the coefficients αs based on the evaluation value ("+1" or "-1") output by the data input unit 7 for that evaluation.

【００６４】次に、スペクトル理論に基づく学習方法と
推論方法を具体的に示す。Next, the learning method and the inference method based on the spectrum theory will be concretely shown.

【００６５】まず、学習や推論で使用する要素について
定義とその説明を行う。ここで使用する要素には、Ｘ，
Ｘｉ，Ｓ，ｓ，χｓ（Ｘｉ），αｓ，ｆ（Χｉ）といっ
たものがある。First, the elements used in learning and inference are defined and explained. The elements used here are X, Xi, S, s, χs(Xi), αs, and f(Xi).

【００６６】これらのうち、入力ベクトル全体集合
“Ｘ”は入力事例の全体を表す。例えば、以下のような
入力ベクトルがその並び順に入ってきたとする。Of these, the entire set of input vectors "X" represents the entire input case. For example, suppose the following input vectors come in the order of arrangement.

【００６７】（０，１，１，１，１）（１，０，１，１，１）（１，１，０，１，１）（１，１，１，０，１）（１，１，１，１，０）この例の場合、入力事例の全体を示す入力ベクトル全体
集合Ｘは、Ｘ＝（（０，１，１，１，１）、（１，０，
１，１，１）、（１，１，０，１，１）、（１，１，
１，０，１）、（１，１，１，１，０））と書ける。(0,1,1,1,1) (1,0,1,1,1) (1,1,0,1,1) (1,1,1,0,1) (1,1,1,1,0) In this example, the input vector universal set X, which represents all of the input cases, can be written as X = ((0,1,1,1,1), (1,0,1,1,1), (1,1,0,1,1), (1,1,1,0,1), (1,1,1,1,0)).

【００６８】入力ベクトル要素“Ｘｉ”はｉ番目の入力
事例の入力ベクトルを表す。例えば、上記の場合、１番
目の入力事例は、“（０，１，１，１，１）”であり、
２番目の入力事例は“（１，０，１，１，１）”であ
り、３番目の入力事例は“（１，１，０，１，１）”と
いった具合である。The input vector element "Xi" represents the input vector of the i-th input case. For example, in the above case, the first input case is “(0,1,1,1,1)”,
The second input case is "(1,0,1,1,1)", the third input case is "(1,1,0,1,1)", and so on.

【００６９】属性集合“Ｓ”は、属性ｓの組み合わせの
全体を表す。The attribute set "S" represents the entire combination of attributes s.

【００７０】キーワード属性集合ｓとは、本具体例にお
いては、キーワードの番号の組合せからなるものをい
う。例えば、属性ｓが１〜３まで存在する場合（キーワ
ードの番号が１〜３までの３種がある場合）、実質的に
同じ組み合わせとなるものを除くと、キーワード属性集
合ｓの組み合わせの全体Ｓは、Ｓ＝（（１），（２），
（３），（１，２），（１，３），（２，３），（１，
２，３））となる。ただし、キーワード属性集合ｓの組
合わせを所定の次数で打ち切る場合は、Ｓは当該所定の
次数内での属性の組み合わせの全体とする。なお、ここ
で言う次数は、Ｓの中の１つの値を表すキーワード属性
集合ｓに含まれる属性の数を指す。In this specific example, a keyword attribute set s is a combination of keyword numbers. For example, when the attributes run from 1 to 3 (that is, there are three keyword numbers, 1 to 3), the set S of all combinations s, excluding those that are substantially the same, is S = ((1), (2), (3), (1,2), (1,3), (2,3), (1,2,3)). However, when the combinations are cut off at a predetermined degree, S is the set of all attribute combinations within that degree. The degree referred to here is the number of attributes contained in a keyword attribute set s, that is, in one element of S.

【００７１】すなわち、ｓ＝（１，３）の場合、次数は
“２”であり、ｓ＝（２）の場合、次数は“１”であ
り、ｓ＝（１，２，３）の場合、次数は“３”である。
従って、例えば、次数“２”で打ち切る場合のＳは、上
述の例の場合、Ｓ＝（（１），（２），（３），（１，
２），（１，３），（２，３））となる。That is, the degree is "2" for s = (1,3), "1" for s = (2), and "3" for s = (1,2,3). Therefore, when cutting off at degree "2" in the example above, S = ((1), (2), (3), (1,2), (1,3), (2,3)).
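The construction of S with a degree cutoff can be illustrated with a short Python sketch. The function name `attribute_sets` is hypothetical; attributes are represented by their 1-based keyword numbers, as in the text.

```python
from itertools import combinations


def attribute_sets(num_attributes, max_degree):
    """Enumerate all attribute combinations s with 1 <= degree <= max_degree.

    Combinations are listed degree by degree, each degree in ascending order.
    """
    attrs = range(1, num_attributes + 1)
    return [
        s
        for degree in range(1, max_degree + 1)
        for s in combinations(attrs, degree)
    ]


full = attribute_sets(3, 3)       # no cutoff: every non-empty combination
truncated = attribute_sets(3, 2)  # cut off at degree 2
print(full)
print(truncated)
```

With three attributes, `full` contains the seven combinations listed for S above, and `truncated` drops only (1,2,3).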

【００７２】要素“ｓ”は属性の組み合わせの属性集合
Ｓの中の１つの値を表す。例えば、（１，２，３）や
（３）或いは（１，２）の如く、“（）”で括られた属
性を指している。The element "s" represents one value in the attribute set S of the attribute combination. For example, it indicates an attribute surrounded by "()" such as (1, 2, 3) or (3) or (1, 2).

【００７３】要素“χｓ（Ｘｉ）”はｉ番目の事例の入
力ベクトルＸｉにおいて、キーワード属性集合ｓに対応
する要素の値が“１”であるものの数が奇数個ならば
“−１”を、偶数個（偶数個には０個の場合も含む）な
らば“１”を係数として返すような関数を表す。The element "χs(Xi)" denotes a function on the input vector Xi of the i-th case that returns "-1" if the number of elements selected by the keyword attribute set s whose value is "1" is odd, and "1" if that number is even (zero counts as even).

【００７４】例えば、ｉ番目の入力事例であるＸｉの入
力ベクトルがＸｉ＝（１，０，０）で、ｓ＝（１，３）
ならば、入力ベクトルＸｉ中の１番目の値は“１”、３
番目の値は“０”であるので、“１”であるものの数は
１個であってこれは奇数個であり、この場合、“−１”
を返すことになるから、関数χｓ（Ｘｉ）として表すと χｓ（Ｘｉ）＝χ1,3 （１，０，０）＝−１となる。また、Ｘｉ＝（１，０，０）、ｓ＝（２，３）
ならば、Ｘｉ中の２番目の値は“０”、３番目の値は
“０”であるので、“１”であるものの数は０個であ
り、これは偶数個であるから、この場合、“１”を返す
ことになるので、関数χｓ（Ｘｉ）は、 χｓ（Ｘｉ）＝χ2,3 （１，０，０）＝１となる。For example, if the input vector of the i-th input case is Xi = (1,0,0) and s = (1,3), the first value in Xi is "1" and the third value is "0", so the number of "1" values is one, which is odd. The function therefore returns "-1": χs(Xi) = χ1,3(1,0,0) = -1. Likewise, if Xi = (1,0,0) and s = (2,3), the second value in Xi is "0" and the third value is "0", so the number of "1" values is zero, which is even. The function therefore returns "1": χs(Xi) = χ2,3(1,0,0) = 1.
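The parity function χs can be written compactly in Python. This is a minimal sketch; the name `chi` and the tuple representation of s and Xi are choices made here for illustration.

```python
def chi(s, x):
    """Parity function: -1 if an odd number of the positions in s hold 1, else 1.

    s is a tuple of 1-based keyword numbers; x is a 0/1 input vector.
    """
    ones = sum(x[k - 1] for k in s)
    return -1 if ones % 2 == 1 else 1


a = chi((1, 3), (1, 0, 0))  # one "1" among positions 1 and 3 (odd)
b = chi((2, 3), (1, 0, 0))  # no "1" among positions 2 and 3 (even)
print(a, b)
```

The two calls correspond to the two cases worked through in the paragraph above.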

【００７５】この“−１”及び“１”が次に説明する係
数αとなる。These "-1" and "1" are the coefficient α described below.

【００７６】つまり、α1 の内容が“−１”であったと
すると、ｉ番目の入力事例であるＸｉ＝（１，０，
０）、ｓ＝（１）は、入力ベクトルＸｉにおけるキーワ
ード並び順での１番目のキーワードに、登録キーワード
が出現した数が奇数個あったことを示し、“１”であっ
たならばそれが０個であったことを示し、α2 の内容が
“−１”であったとすると、ｉ番目の入力事例であるＸ
ｉ＝（１，０，０）、ｓ＝（２）は、入力ベクトルＸｉ
におけるキーワード並び順での２番目のキーワードに、
登録キーワードが出現した数が奇数個あったことを示
し、“１”であったならばそれが０個であったことを示
し、α1,2 の内容が“−１”であったとすると、ｉ番目
の入力事例であるＸｉ＝（１，０，０）、ｓ＝（１，
２）は、入力ベクトルＸｉにおけるキーワード並び順で
の１番目と２番目のキーワードに、登録キーワードが出
現した数が奇数個あったことを示し、“１”であったな
らばそれが０個であった場合を含めて偶数個あったこと
を示し、α1,2,3 の内容が“−１”であったとすると、
ｉ番目の入力事例であるＸｉ＝（１，０，０）、ｓ＝
（１，２，３）は、入力ベクトルＸｉにおけるキーワー
ド並び順での１番目と２番目と３番目のキーワードに、
登録キーワードが出現した数が奇数個あったことを示
し、“１”であったならばそれが０個であった場合を含
めて偶数個あったことを示しているといった具合であ
る。In other words, for the i-th input case Xi = (1,0,0) and s = (1), a value of "-1" for α1 indicates that an odd number of registered keywords appeared at the first keyword position of the input vector Xi, and a value of "1" indicates that none appeared. Similarly, for s = (2), a value of "-1" for α2 indicates that an odd number of registered keywords appeared at the second keyword position, and "1" indicates that none appeared. For s = (1,2), a value of "-1" for α1,2 indicates that an odd number of registered keywords appeared at the first and second keyword positions, and "1" indicates an even number (including zero). For s = (1,2,3), a value of "-1" for α1,2,3 indicates that an odd number of registered keywords appeared at the first, second, and third keyword positions, and "1" indicates an even number (including zero); and so on.

【００７７】これを学習対象の事例における入力ベクト
ル毎に、属性の組み合わせ別の登録キーワード出現数を
奇数、偶数の表示で並べたものがαｓである。For each input vector in the learning cases, αs is the arrangement, by attribute combination, of whether the number of registered-keyword occurrences is odd or even.

【００７８】要素“ｆ（Χｉ）”は、入力事例Ｘｉに対
する評価を表す。The element "f(Xi)" represents the evaluation of the input case Xi.

【００７９】[0079]

【数１】 [Equation 1]

【００８０】ｓｉｇｎ（ｘ）：ｘ≧０ならばｓｉｇｎ
（ｘ）＝１、ｘ＜０ならばｓｉｇｎ（ｘ）＝−１となる
ような関数を表す。sign(x): denotes the function for which sign(x) = 1 if x ≥ 0 and sign(x) = -1 if x < 0.

【００８１】スペクトル理論に基づく学習は、次の数式
（１）によって行われる。Learning based on the spectrum theory is performed by the following mathematical expression (1).

【００８２】[0082]

【数２】 [Equation 2]

【００８３】全てのｓに関してαを求める。ｍは事例の
総数である。スペクトル理論に基づく推論は、次の数式
（２）によって行われる。Find α for all s. m is the total number of cases. Inference based on the spectrum theory is performed by the following mathematical expression (2).

【００８４】[0084]

【数３】 [Equation 3]

【００８５】与えられた問題事例ｘに対して式（２）を
用いると、ｆ（ｘ）の予測値が得られる。なお、推論
に、式（２）を使う限りは、右辺の値の正負が問題であ
るので、式（１）において分子をｍで割らなくても同じ
結果が得られることから、学習を次の式（３）で行うよ
うにしても構わない。本具体例では、式（３）を使用し
て説明している。Applying equation (2) to a given problem case x yields the predicted value of f(x). As long as equation (2) is used for inference, only the sign of its right-hand side matters, so the same result is obtained even if the numerator in equation (1) is not divided by m. Learning may therefore be performed with the following equation (3) instead. This specific example is described using equation (3).

【００８６】[0086]

【数４】 [Equation 4]
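The equation images for equations (1) to (3) are not reproduced in this text. Based on the surrounding description (α is obtained for every s, m is the total number of cases, equation (3) is equation (1) without the division by m, and inference takes the sign of a sum over s), they can plausibly be written as below. This is a reconstruction under those assumptions, not the original typeset form.

```latex
\begin{align}
\alpha_s &= \frac{1}{m}\sum_{i=1}^{m} f(X_i)\,\chi_s(X_i) \tag{1}\\
f(x) &= \operatorname{sign}\Bigl(\sum_{s \in S} \alpha_s\,\chi_s(x)\Bigr) \tag{2}\\
\alpha_s &= \sum_{i=1}^{m} f(X_i)\,\chi_s(X_i) \tag{3}
\end{align}
```

Because m is positive, dividing every α_s by m does not change the sign of the sum in (2), which is why (3) can replace (1).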

【００８７】ところで、属性ｓの全ての組合わせを用い
て上記の推論・学習を行うようにする方が、より高い予
測精度が得られるようになるが、属性の数（キーワード
の数）の増加にともない、べき乗のオーダで計算量が増
えてしまう。前述の"NathanLinial" 等による文献によ
れば、一定の次数で学習を終了しても、おおまかな学習
は終了しており、予測精度にそれほどの差がないものと
考えられる。Performing the above inference and learning with all combinations of the attributes s gives higher prediction accuracy, but the amount of calculation grows exponentially as the number of attributes (the number of keywords) increases. According to the above-mentioned document by Nathan Linial et al., even if learning is cut off at a fixed degree, the rough learning is already complete, and the prediction accuracy is not expected to differ much.

【００８８】そこで、上記の学習と推論（予測）で用い
るＳの次数を同じ値で制限することにより、全体の次数
を制限し、計算量を削減することができる。Therefore, by limiting the degree of S used in learning and in inference (prediction) to the same value, the overall degree is bounded and the amount of calculation can be reduced.
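As a rough illustration of the saving, the number of attribute combinations that must be processed can be counted with binomial coefficients. The function name `count_sets` is hypothetical; the six-keyword, degree-2 setting is the one used in the worked example later in this description.

```python
from math import comb


def count_sets(num_keywords, max_degree):
    """Number of non-empty keyword combinations of degree <= max_degree."""
    return sum(comb(num_keywords, d) for d in range(1, max_degree + 1))


limited = count_sets(6, 2)    # singles plus pairs
unlimited = count_sets(6, 6)  # every non-empty subset, i.e. 2**6 - 1
print(limited, unlimited)
```

With six keywords the cutoff at degree 2 leaves 21 combinations instead of 63, and the gap widens quickly as the number of keywords grows.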

【００８９】以上がスペクトル理論を用いた推論・学習
の説明である。The above is the explanation of the inference / learning using the spectrum theory.

【００９０】次に、図３を参照しながら推論部５の働き
を説明する。図３は、推論部５における処理の流れを示
すフローチャートである。Next, the function of the inference unit 5 will be described with reference to FIG. FIG. 3 is a flowchart showing the flow of processing in the inference unit 5.

【００９１】推論部５は起動されると、キーワード抽出
部３から文書に現れたキーワードの一覧を読み込む（ス
テップＳ１０１）。すなわち、キーワード抽出部３は文
書が入力されると当該文書に現れたキーワードを抽出し
て一覧を形成し、保持しているので推論部５はこれを読
み込む。When the inference unit 5 is activated, it reads from the keyword extraction unit 3 the list of keywords that appeared in the document (step S101). That is, when a document is input, the keyword extraction unit 3 extracts the keywords appearing in it and holds them as a list, and the inference unit 5 reads this list.

【００９２】キーワードの一覧が読み込まれたならば、
次にこの読み込まれた各キーワードそれぞれについての
そのキーワード番号を、データ記憶部４に保存されてい
る図２の如きキーワードテーブルを参照しながら求め、
入力ベクトルを生成する（ステップＳ１０２）。Once the keyword list has been read, the keyword number of each keyword read is looked up in a keyword table such as that of FIG. 2 stored in the data storage unit 4, and an input vector is generated (step S102).

【００９３】その際、キーワードテーブルを参照しても
見付からないキーワード、つまり、キーワードテーブル
に登録されていないキーワードは、無視される。At this time, a keyword that cannot be found even by referring to the keyword table, that is, a keyword not registered in the keyword table is ignored.
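Steps S101 and S102, including the rule that unregistered keywords are ignored, might look like the following in Python. This sketch assumes the input vector has one position per keyword number in the table; the function name `input_vector` and the keyword names `kw1` to `kw6` are hypothetical.

```python
def input_vector(extracted, keyword_table):
    """Build a 0/1 vector indexed by keyword number (step S102).

    keyword_table maps each registered keyword to its 1-based keyword number.
    Extracted keywords that are not in the table are ignored.
    """
    x = [0] * len(keyword_table)
    for kw in extracted:
        number = keyword_table.get(kw)
        if number is not None:
            x[number - 1] = 1
    return tuple(x)


# Hypothetical six-entry keyword table.
table = {f"kw{i}": i for i in range(1, 7)}

vec = input_vector(["kw1", "kw2", "unregistered", "kw6"], table)
print(vec)
```

Here the unregistered word leaves no trace in the vector, and only positions 1, 2, and 6 are set.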

【００９４】次に、キーワード同士の組み合わせ（前述
のＳ）の存在の有無を調べ（ステップＳ１０３）、その
結果、キーワード同士の組み合わせ（前述のＳ）が、ま
だ存在している場合には、次の組み合わせを生成し（ス
テップＳ１０４）、生成された組み合わせに関して予測
値の計算を行い（ステップＳ１０５）、ステップ１０３
に戻る。Next, it is checked whether any combination of keywords (the S described above) remains (step S103). If a combination still remains, the next combination is generated (step S104), the predicted value is calculated for the generated combination (step S105), and the process returns to step S103.

【００９５】ステップＳ１０３での判定の結果、キーワ
ード同士の組み合わせがこれ以上存在しない場合には、
予測値の計算は終了する。ここで、式（２）のｓｉｇｎ
関数に代入する値、すなわち次の式（４）のｈ（ｘ）が
得られる。If the determination in step S103 finds that no further combinations of keywords remain, the calculation of the predicted value ends. At this point the value to be substituted into the sign function of equation (2), that is, h(x) of the following equation (4), has been obtained.

【００９６】[0096]

【数５】 [Equation 5]
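The image for equation (4) is not reproduced in this text. Since h(x) is described as the value substituted into the sign function of equation (2), it can plausibly be written as follows; this is a reconstruction from that description, not the original typeset form.

```latex
\begin{equation}
h(x) = \sum_{s \in S} \alpha_s\,\chi_s(x) \tag{4}
\end{equation}
```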

【００９７】入力された文書に対する評価の推論出力
（予測値）は、ｓｉｇｎ（ｘ）に代入すると、ｈ（ｘ）
が“０”未満であった場合（ステップＳ１０７）には、
予測値は“−１”となり、この場合はシステムは利用者
に当該評価対象となった入力された文書の提示を、実施
しないで終了する。しかし、ｈ（ｘ）が“０”以上であ
った場合（ステップＳ１０６）には、予測値は“１”
となり、このときは推論部５は表示部６に文書を提示す
ることを指示し（ステップＳ１０７）、処理を終了す
る。そして、この提示の指示を受けた表示部６は、当該
評価対象となった文書を文書記憶部２より読み出して表
示出力或いはプリント出力する。The inferred evaluation (predicted value) for the input document is obtained by substituting h(x) into sign(x). If h(x) is less than "0" (step S107), the predicted value is "-1", and the system ends without presenting the input document under evaluation to the user. If h(x) is "0" or greater (step S106), the predicted value is "1", consistent with the definition of sign(x); the inference unit 5 then instructs the display unit 6 to present the document (step S107) and ends the process. On receiving this instruction, the display unit 6 reads the document under evaluation from the document storage unit 2 and displays or prints it.
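The inference flow of steps S103 to S107 can be sketched as a loop over the attribute combinations. This is an illustrative sketch only: the function names are hypothetical, the coefficients are stored in a dictionary keyed by combination, and the coefficient values in the example are made up for a two-keyword table rather than taken from the figures.

```python
from itertools import combinations


def chi(s, x):
    """Parity function: -1 if an odd number of the positions in s hold 1, else 1."""
    return -1 if sum(x[k - 1] for k in s) % 2 == 1 else 1


def infer(alpha, x, max_degree):
    """Accumulate h(x) over all combinations up to max_degree, then threshold at 0."""
    h = 0
    for degree in range(1, max_degree + 1):
        for s in combinations(range(1, len(x) + 1), degree):  # steps S103, S104
            h += alpha.get(s, 0) * chi(s, x)                  # step S105
    return h, h >= 0                                          # threshold comparison


# Made-up coefficients for a two-keyword table (illustration only).
alpha = {(1,): -2, (2,): 1, (1, 2): 1}

h_a, present_a = infer(alpha, (1, 0), max_degree=2)
h_b, present_b = infer(alpha, (0, 1), max_degree=2)
print(h_a, present_a, h_b, present_b)
```

A document is presented when the accumulated value is zero or greater, matching the definition of sign(x) given earlier.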

【００９８】図４を参照しながら学習部８の働きを説明
する。図４は、学習部８による処理の流れを示すフロー
チャートである。学習部８は、例えば、ユーザによる評
価データ入力部７から入力操作などによって起動され
る。The operation of the learning section 8 will be described with reference to FIG. FIG. 4 is a flowchart showing the flow of processing by the learning unit 8. The learning unit 8 is activated, for example, by an input operation from the evaluation data input unit 7 by the user.

【００９９】学習部８は起動されると、キーワード抽出
部３から文書に現れたキーワードの一覧を読み込む（ス
テップＳ２０１）。読み込まれたキーワードのキーワー
ド番号をデータ記憶部４に保存されている図２のような
キーワードテーブルを参照しながら求め、事例ベクトル
を生成する（ステップＳ２０２）。この時、キーワード
テーブルに登録されていないキーワードは無視する。When the learning unit 8 is activated, it reads the list of keywords appearing in the document from the keyword extracting unit 3 (step S201). The keyword number of the read keyword is obtained with reference to the keyword table stored in the data storage unit 4 as shown in FIG. 2 to generate a case vector (step S202). At this time, the keywords not registered in the keyword table are ignored.

【０１００】次に学習部８は、使用者が評価データ入力
部７の操作によって与えた評価を読み込む（ステップＳ
２０３）。この評価は、推論部５が表示指示した文書を
読んで使用者が自己にとって有用か、或いは興味がある
か否かの率直な判断評価である。Next, the learning unit 8 reads the evaluation given by the user by operating the evaluation data input unit 7 (step S
203). This evaluation is a frank judgment evaluation of whether or not the user reads the document instructed to display by the inference unit 5 and is useful or interesting to himself.

【０１０１】この評価が読み込まれると、次にキーワー
ド同士の組み合わせ（前述のＳ）の有無を調べる（ステ
ップＳ２０４）。その結果、キーワード同士の組み合わ
せ（前述のＳ）が存在している場合には、キーワード同
士の組み合わせを１つ生成し（ステップＳ２０５）、生
成された組み合わせに関して係数（前述のα）の計算を
行い（ステップＳ２０６）、ステップＳ２０４に戻る。
なお、ステップＳ２０６において、ｆ（Ｘ）はステップ
Ｓ２０３において読み込んだ、評価値のことである。When this evaluation is read, it is next checked whether or not there is a combination of keywords (S mentioned above) (step S204). As a result, when there is a combination of keywords (S described above), one combination of keywords is generated (step S205) and a coefficient (α described above) is calculated for the generated combination. (Step S206), the process returns to step S204.
Note that in step S206, f (X) is the evaluation value read in step S203.

【０１０２】次にステップＳ２０４において再びキーワ
ード同士の組み合わせ（前述のＳ）の有無を調べる。そ
の結果、キーワード同士の組み合わせ（前述のＳ）がま
だ存在している場合には、次の組み合わせを生成し（ス
テップＳ２０５）、生成された組み合わせに関して係数
（前述のα）の計算を行い（ステップＳ２０６）、ステ
ップＳ２０４に戻る。Next, in step S204, the presence or absence of a combination of keywords (S described above) is checked again. As a result, if the combination of keywords (S described above) still exists, the next combination is generated (step S205), and the coefficient (α described above) is calculated for the generated combination (step S205). S206) and returns to step S204.

【０１０３】このような処理をキーワード同士の組み合
わせが存在する限り繰り返すが、ステップＳ２０４での
判断の結果、キーワード同士の組み合わせがもう存在し
ない場合には、係数の計算は終了し、求めた係数αをデ
ータ記憶部４に保存し（ステップＳ２０７）、終了す
る。Such processing is repeated as long as there is a combination of keywords, but if the result of determination in step S204 is that there is no more combination of keywords, calculation of the coefficient ends and the calculated coefficient α is obtained. Is stored in the data storage unit 4 (step S207), and the process ends.

【０１０４】上記の動作を、キーワードテーブルに登録
するキーワード数を“６”、扱う係数αの次数を“２”
までとして、フィルタリングを行う例を示して、本具体
例をより具体的に説明する。The above operation is now described more concretely through a filtering example in which the number of keywords registered in the keyword table is "6" and coefficients α are handled up to degree "2".

【０１０５】まず、学習による係数αの初期設定につい
て説明する。登録されているキーワードは図５に示すよ
うなものであるとする。図５の例は、キーワード番号１
番として“keyword*1 ”が、キーワード番号２番として
“keyword*2 ”が、キーワード番号３番として“keywor
d*3 ”が、キーワード番号４番として“keyword*4 ”
が、そして、キーワード番号５番として“keyword*5 ”
が登録されていることを示している。First, the initial setting of the coefficients α by learning is described. The registered keywords are assumed to be as shown in FIG. 5. The example of FIG. 5 shows that "keyword*1" is registered as keyword number 1, "keyword*2" as keyword number 2, "keyword*3" as keyword number 3, "keyword*4" as keyword number 4, and "keyword*5" as keyword number 5.

【０１０６】ここで、（keyword*1 、keyword*3 ）が必
要なキーワードの組、（keyword*3、keyword*4 ）が不
要なキーワードの組であったとすると、次に、これらの
キーワードの組を用いて、仮想的に文書群を生成する。
仮想的な文書群の生成は、１文書あたり、上記６つのキ
ーワード中の３つのキーワードを含む構成として、それ
らの組み合わせ別のものをそれぞれ別の種類の文書とし
て考えた場合、例えば、各文書は（keyword*1 、keywor
d*3 ）又は（keyword*3 、keyword*4 ）のいずれかを主
体としてこれに更に別の１つのキーワードを加えた３つ
のキーワードを持つバリエーションとして生成する。Here, suppose that (keyword*1, keyword*3) is a required keyword pair and (keyword*3, keyword*4) is an unwanted keyword pair. These keyword pairs are then used to generate a virtual document group. Each virtual document contains three of the six keywords above, and each distinct combination is treated as a different kind of document. For example, each document is generated as a three-keyword variation that takes either (keyword*1, keyword*3) or (keyword*3, keyword*4) as its core and adds one further keyword.

【０１０７】この結果、 (keyword*1、keyword*2 、keyword*3) (keyword*1、keyword*3 、keyword*4) (keyword*1、keyword*3 、keyword*5) (keyword*2、keyword*3 、keyword*4) (keyword*3、keyword*4 、keyword*5) (keyword*3、keyword*4 、keyword*6) の６種類の仮想文書が得られることになる。As a result, six kinds of virtual documents are obtained: (keyword*1, keyword*2, keyword*3), (keyword*1, keyword*3, keyword*4), (keyword*1, keyword*3, keyword*5), (keyword*2, keyword*3, keyword*4), (keyword*3, keyword*4, keyword*5), and (keyword*3, keyword*4, keyword*6).

【０１０８】これらの全ての仮想文書は上述のようなそ
れぞれ異なる組み合わせのキーワード３つを含む文書と
いうことになるが、これらのうち、必要なキーワードの
組が出現する仮想文書が必要な文書、不必要なキーワー
ドの組が出現する仮想文書が不必要な文書であるとし
て、それぞれに得点付けを行う。All of these virtual documents are documents that include three different combinations of keywords as described above. Among these, a document requiring a virtual document in which a required set of keywords appears Assuming that the virtual document in which the necessary set of keywords appears is an unnecessary document, each is scored.

【０１０９】更に、ノイズとして、“keyword*5 ”、
“keyword*6 ”を持つ仮想文書が存在する。ここでは、
以下の文書を学習させる。In addition, virtual documents containing "keyword*5" and "keyword*6" exist as noise. Here, the following documents are learned.

【０１１０】「必要な文書」 (keyword*1、keyword*2 、keyword*3) 評価値…＋１ (keyword*1、keyword*3 、keyword*4) 評価値…＋１ (keyword*1、keyword*3 、keyword*5) 評価値…＋１「不必要な文書」 (keyword*2、keyword*3 、keyword*4) 評価値…−１ (keyword*3、keyword*4 、keyword*5) 評価値…−１ (keyword*3、keyword*4 、keyword*6) 評価値…−１「ノイズ」 (keyword*1、keyword*5 、keyword*6) 評価値…＋１ (keyword*3、keyword*5 、keyword*6) 評価値…−１以上のような文書群を学習させた結果、各αの値は図６
（ａ）のようになったとする。ただし、図６（ａ）にお
いて、αｉはkeyword*i に関するαの値を、そして、α
ｉ，ｊはkeyword*i とkeyword*j の組に関するαの値を
意味する。"Required documents"
(keyword*1, keyword*2, keyword*3) evaluation value: +1
(keyword*1, keyword*3, keyword*4) evaluation value: +1
(keyword*1, keyword*3, keyword*5) evaluation value: +1
"Unnecessary documents"
(keyword*2, keyword*3, keyword*4) evaluation value: -1
(keyword*3, keyword*4, keyword*5) evaluation value: -1
(keyword*3, keyword*4, keyword*6) evaluation value: -1
"Noise"
(keyword*1, keyword*5, keyword*6) evaluation value: +1
(keyword*3, keyword*5, keyword*6) evaluation value: -1
Suppose that, as a result of learning the above document group, the values of α become as shown in FIG. 6(a). In FIG. 6(a), αi denotes the value of α for keyword*i, and αi,j denotes the value of α for the pair of keyword*i and keyword*j.

【０１１１】次に、推論するプロセスを説明する。Next, the inference process will be described.

【０１１２】（keyword*1 ，keyword*2 ，keyword*6 ）
をキーワードとして有し、“keyword*1 ”，“keyword*
2 ”，“…”，“…”，“…”，“keyword*6 ”なる配
列をとる文書（この文書はキーワード番号を取り出して
入力ベクトルにすると、（１、１、０、０、０、１）と
なる。）を次数２までに関し、処理したとする。する
と、図３におけるステップＳ１０３〜Ｓ１０５の処理ル
ープにおいて、各回周毎に順次以下の組合わせが生成さ
れる。つまり、１回目・・・keyword*1 のみ、２回目・・・keyword*2 のみ、３回目・・・keyword*3 のみ、４回目・・・keyword*4 のみ、５回目・・・keyword*5 のみ、６回目・・・keyword*6 のみ、７回目・・・keyword*1 とkeyword*2 、８回目・・・keyword*1 とkeyword*3 、９回目・・・keyword*1 とkeyword*4 、１０回目・・・keyword*1 とkeyword*5 、１１回目・・・keyword*1 とkeyword*6 、１２回目・・・keyword*2 とkeyword*3 、１３回目・・・keyword*2 とkeyword*4 、１４回目・・・keyword*2 とkeyword*5 、１５回目・・・keyword*2 とkeyword*6 、１６回目・・・keyword*3 とkeyword*4 、１７回目・・・keyword*3 とkeyword*5 、１８回目・・・keyword*3 とkeyword*6 、１９回目・・・keyword*4 とkeyword*5 、２０回目・・・keyword*4 とkeyword*6 、２１回目・・・keyword*5 とkeyword*6 の各組み合わせである。Suppose that a document having (keyword*1, keyword*2, keyword*6) as keywords, in the sequence "keyword*1", "keyword*2", "...", "...", "...", "keyword*6" (taking out the keyword numbers gives the input vector (1,1,0,0,0,1)), is processed up to degree 2. Then, in the processing loop of steps S103 to S105 in FIG. 3, the following combinations are generated in turn, one per pass. Passes 1 to 6: keyword*1, keyword*2, keyword*3, keyword*4, keyword*5, and keyword*6, each alone. Passes 7 to 11: keyword*1 paired with keyword*2, keyword*3, keyword*4, keyword*5, and keyword*6. Passes 12 to 15: keyword*2 paired with keyword*3, keyword*4, keyword*5, and keyword*6. Passes 16 to 18: keyword*3 paired with keyword*4, keyword*5, and keyword*6. Passes 19 and 20: keyword*4 paired with keyword*5 and keyword*6. Pass 21: keyword*5 paired with keyword*6.

【０１１３】この組合わせを用いて、始めにステップＳ
１０５を通過するときは、ｓ＝１であるため、図６
（ａ）からα１＝−８であり、また、関数χｓ（Ｘｉ）
として表すと、 χ1 （１，１，０，０，０，１）＝−１であるため、予測値＝予測値（＝０）＋（−８×（−１））＝８となる。With these combinations, on the first pass through step S105, s = (1), so α1 = -8 from FIG. 6(a), and the function χs(Xi) gives χ1(1,1,0,0,0,1) = -1. Therefore predicted value = predicted value (= 0) + (-8 × (-1)) = 8.

【０１１４】また、７回目にステップＳ１０５を通過す
るときは、ｓ＝１，２であるため、図６（ａ）からα
１，２＝０であり、また、 χ1,2 （１、１、０、０，０、１）＝１であるため、予測値＝予測値（＝０）＋（０×１）＝０となる。On the seventh pass through step S105, s = (1,2), so α1,2 = 0 from FIG. 6(a), and χ1,2(1,1,0,0,0,1) = 1. Therefore predicted value = predicted value (= 0) + (0 × 1) = 0.

【０１１５】最後にステップＳ１０５を通過するとき
は、ｓ＝５，６であるため、図６（ａ）からα５，６＝
２であり、また、 χ5,6 （１、１、０、０、０、１）＝−１であるため、予測値＝予測値（＝０）＋（２×（−１））＝−２となる。Finally, on the last pass through step S105, s = (5,6), so α5,6 = 2 from FIG. 6(a), and χ5,6(1,1,0,0,0,1) = -1. Therefore predicted value = predicted value (= 0) + (2 × (-1)) = -2.

【０１１６】ここで、これら予測値を合計すると最終的
には、予測値は“１２”となる。この例では、組合わせ
が存在しないので、図３のステップＳ１０３からステッ
プＳ１０６に移り、ここで、最終的な上記予測値“１
２”をしきい値と比較してその大小に応じ、提示の判断
をする。しきい値は“０”とすると、上記予測値“１
２”はしきい値よりも大きいので、利用者に提示すると
判定する。そして、この判定に従い、出力部６ｂでは当
該評価対象となった入力文書を表示することになる。な
お、この場合、しきい値との差がどのくらいであったか
否かの情報を利用者に提示するようにしても良い。Summing these predicted values gives a final predicted value of "12". At this point no further combinations remain, so the process moves from step S103 of FIG. 3 to step S106, where the final predicted value "12" is compared with a threshold and the presentation decision is made according to which is larger. With a threshold of "0", the predicted value "12" exceeds the threshold, so it is decided to present the document to the user. Following this decision, the output unit 6b displays the input document under evaluation. In this case, information on how far the predicted value was from the threshold may also be presented to the user.

【０１１７】次に、学習するプロセスを説明する。図４
のステップＳ２０１において読み込まれたキーワードは
上記と同様に、（keyword*1 、keyword*2 、keyword*6
）である。上述の“keyword*1 ”，“keyword*2 ”，
“…”，“…”，“…”，“keyword*6 ”なる配列をと
る文書について、ステップＳ２０２においてキーワード
番号を取り出すと、（１、１、０、０、０、１）なる入
力ベクトルが得られる。Next, the learning process is described. The keywords read in step S201 of FIG. 4 are, as above, (keyword*1, keyword*2, keyword*6). For the document with the sequence "keyword*1", "keyword*2", "...", "...", "...", "keyword*6" described above, taking out the keyword numbers in step S202 gives the input vector (1,1,0,0,0,1).

【０１１８】この文書は利用者にとって必要な文書であ
ったとすると利用者が与える評価は“Ｏ（マル）”或い
は“ｇｏｏｄ”或いは“１”などであるから、ステップ
Ｓ２０３において読み込まれる評価値は“１”となる。
ステップＳ２０５において生成される組合わせは、予測
において生成されたものと同様である。最初にステップ
Ｓ２０６を通過するときの組合わせにおける属性ｓはｓ
＝１であり、 χ1 （１，１，０，０，０，１）＝−１であるため、予測値は α１＝−８＋（１×（−１））＝−９となる。また、７回目にステップ２０６を通過するとき
の組合わせにおけるｓはｓ＝１，２であり、 χ1,2 （１，１，０，０，０，１）＝１であるため、 α１，２＝０＋（１×１）＝１となる。If this document was one the user needed, the evaluation given by the user is "O" (circle), "good", "1", or the like, so the evaluation value read in step S203 is "1". The combinations generated in step S205 are the same as those generated during prediction. On the first pass through step S206 the attribute combination is s = (1) and χ1(1,1,0,0,0,1) = -1, so the coefficient becomes α1 = -8 + (1 × (-1)) = -9. On the seventh pass through step S206, s = (1,2) and χ1,2(1,1,0,0,0,1) = 1, so α1,2 = 0 + (1 × 1) = 1.

【０１１９】そして、最後にステップ２０６を通過する
ときの組合わせにおけるｓはｓ＝“５，６”であり、 χ5,6 （１，１，０，０，０，１）＝−１であるため、 α５，６＝２＋（１×（−１））＝１となる。Finally, s in the combination when passing through step 206 is s = “5,6”, and χ5,6 (1,1,0,0,0,1) = − 1. Therefore, α5,6 = 2 + (1 × (−1)) = 1.
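The three coefficient updates worked through above follow the pattern "new α = old α + evaluation value × χs(x)". The Python sketch below checks that arithmetic. The function names are hypothetical, and only the three coefficients whose values are stated in the text (α1 = -8, α1,2 = 0, α5,6 = 2) are included; the remaining entries of FIG. 6(a) are not reproduced here.

```python
def chi(s, x):
    """Parity function: -1 if an odd number of the positions in s hold 1, else 1."""
    return -1 if sum(x[k - 1] for k in s) % 2 == 1 else 1


def update(alpha, x, evaluation):
    """One learning step: add evaluation * chi_s(x) to every coefficient (step S206)."""
    return {s: a + evaluation * chi(s, x) for s, a in alpha.items()}


# Coefficients before learning, as stated in the text for s = (1), (1,2), (5,6).
before = {(1,): -8, (1, 2): 0, (5, 6): 2}

# The document with input vector (1,1,0,0,0,1) was rated "1" by the user.
after = update(before, (1, 1, 0, 0, 0, 1), 1)
print(after)
```

The resulting values for the three combinations are -9, 1, and 1, as in the text.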

【０１２０】ｓ＝“５，６”までの組合わせに対して処
理が終わると次のステップＳ２０４での組合わせ存在判
断においては、もう組合わせが存在しなくなる。そのた
めに、処理はステップＳ２０７に移り、各αを保存し、
終了する。Once the combinations up to s = "5,6" have been processed, the combination-existence check in the next step S204 finds that no combinations remain. The process therefore moves to step S207, saves each α, and ends.

【０１２１】このような学習の結果、各αの値は図６
（ｂ）のようになる。As a result of this learning, the values of α become as shown in FIG. 6(b).

【０１２２】各要素の機能は以上の説明の通りである。
従って、本システムは、文書入力部１から、例えば、新
しい文書が入力されたとすると、キーワード抽出部３に
てこの文書からキーワードが抽出され、推論部５はこの
抽出されたキーワードをデータ記憶部４に記憶されてい
る利用者本人の興味ある分野のワード群であるキーワー
ドテーブルのキーワードと照らし合わせて、スペクトル
理論に基づき、読むに値する文書であるか否かを評価
し、読むに値すると評価した文書に対しては提示の指示
を表示部６に与えることにより、表示部６はその文書を
出力して利用者に提示するといった処理を行うことがで
きる。そのため、本システムにより、多数の文書から、
利用者の興味のそそる文書を自動的に選定することがで
きる。The function of each element is as described above. Accordingly, when a new document is input from the document input unit 1, for example, the keyword extraction unit 3 extracts keywords from it, and the inference unit 5 checks the extracted keywords against the keyword table stored in the data storage unit 4, which holds the words of the fields that interest the user, and evaluates on the basis of spectrum theory whether the document is worth reading. For a document evaluated as worth reading, it gives a presentation instruction to the display unit 6, which outputs the document and presents it to the user. The system can thus automatically select, from a large number of documents, those likely to interest the user.

【０１２３】以上のように、本具体例は、入力された文
書からキーワードを抽出し、この抽出キーワードを登録
キーワード（キーワードテーブルに登録してあるキーワ
ードで、利用者の興味のある分野のキーワード）と照合
して該当の有無を反映した入力ベクトルに変換し、この
入力ベクトルからスペクトル理論による推論を行い、読
むに値するか否かを判定し、読むに値すると判定した場
合にその文書を提示するようにし、また、提示された文
書を利用者が評価した結果を学習させて推論に反映させ
るようにしたので、推定精度が良好であり、また、推論
はスペクトル理論に基づき行うので、ニューラルネット
ワークを使用する場合に比べて計算量が少なく、高速に
判定／学習を行うことが可能となるなどの特徴を有する
情報フィルタを得ることができる。As described above, this specific example extracts keywords from an input document, checks them against the registered keywords (keywords in the keyword table, belonging to the user's fields of interest), and converts the result into an input vector that reflects which ones matched. Inference based on spectrum theory is performed on this input vector to judge whether the document is worth reading, and the document is presented when it is judged to be so. The user's evaluation of each presented document is also learned and reflected in subsequent inference, so the estimation accuracy is good. Moreover, because inference is based on spectrum theory, the amount of calculation is smaller than with a neural network, and an information filter capable of fast judgment and learning is obtained.

【０１２４】上述の具体例は、登録キーワードの有無を
中心として評価するものであったが、この場合、未登録
のキーワードは無視するようになっている。そこで、こ
れに対処する例を、第２の具体例として説明する。In the above-described specific example, the presence or absence of registered keywords is mainly evaluated, but in this case, unregistered keywords are ignored. Therefore, an example of dealing with this will be described as a second specific example.

【０１２５】（第２の具体例）本具体例は、判定対象と
なる文書において、第１の具体例のキーワードテーブル
に登録していない新しいキーワードが出現した場合にも
対応できるようにしている。(Second Specific Example) This specific example can deal with the case where a new keyword not registered in the keyword table of the first specific example appears in the document to be judged.

【０１２６】本具体例の情報フィルタは図７に示すよう
に構成されており、基本的には第１の具体例（図１）と
同様であり、また、推論部５の処理の流れも第１の具体
例（図３）と同様であるので、ここでの重複した説明は
省略し、第１の具体例と相違する点を主として説明す
る。The information filter of this specific example is configured as shown in FIG. 7 and is basically the same as that of the first specific example (FIG. 1). The processing flow of the inference unit 5 is also the same as in the first specific example (FIG. 3), so duplicate description is omitted here and mainly the points that differ from the first specific example are described.

【０１２７】この具体例では、第１の具体例での学習部
８の機能に加えて、更に新規のキーワードを学習できる
機能を付加した学習部８ａを先の学習部８の代わりに用
いるようにした点が異なる。This specific example differs in that a learning unit 8a, which adds to the functions of the learning unit 8 of the first specific example a function for learning new keywords, is used in place of the learning unit 8.

【０１２８】以下、図７及び図８を参照しながら、新規
のキーワードを学習可能とした学習部８ａの働きを説明
する。Hereinafter, the function of the learning unit 8a that enables learning of a new keyword will be described with reference to FIGS. 7 and 8.

【０１２９】図８は、学習部８ａによる処理の流れを示
すフローチャートである。FIG. 8 is a flow chart showing the flow of processing by the learning section 8a.

【０１３０】学習部８ａが起動されると、キーワード抽
出部３から文書に現れたキーワードの一覧を読み込む
（ステップＳ３０１）。次に学習部８ａは、読み込まれ
たキーワード群中に、キーワードテーブル未登録の新規
キーワードが存在するか否かをチェックする（ステップ
Ｓ３０２）。すなわち、データ記憶部４において保存さ
れている図２のようなキーワードテーブルを参照し、当
該キーワードテーブルにない新規キーワードが存在する
か否かをチェックする。その結果、新規キーワードが存
在する場合には、前記キーワードテーブルの最後に当該
新規キーワードを追加登録する（ステップＳ３０３）。
そして、新しくキーワードが加わったことにより、属性
ｓの組み合わせの全体であるＳの要素が増加しているの
で、必要なαｓを追加する（ステップＳ３０４）。When the learning unit 8a is activated, it reads the list of keywords that appeared in the document from the keyword extraction unit 3 (step S301). Next, the learning unit 8a checks whether the keywords read in include any new keyword not registered in the keyword table (step S302). That is, it refers to the keyword table of the kind shown in FIG. 2, stored in the data storage unit 4, and checks whether there is a new keyword that is not in the table. If a new keyword exists, it is appended to the end of the keyword table (step S303). Because the added keyword increases the number of elements of S, the set of all attribute combinations s, the necessary αs are then added (step S304).

【０１３１】以上の処理をキーワード抽出部３の抽出し
たキーワード群中の各新規のキーワードについて繰返し
行う。そして、これ以上新規のキーワードが存在しない
場合（ステップＳ３０２）には、読み込まれたキーワー
ドのキーワード番号をキーワードテーブルから求め、事
例ベクトルを生成する（ステップＳ３０５）。The above processing is repeated for each new keyword in the keyword group extracted by the keyword extracting unit 3. Then, when there are no more new keywords (step S302), the keyword number of the read keyword is obtained from the keyword table and a case vector is generated (step S305).

【０１３２】次に利用者の操作による評価データ入力部
７からの評価値を待ち、読み込む（ステップＳ３０
６）。提示した文書への評価値を読み込むと、次に、現
在までの全ての評価値の合計をα０とし、これをαの一
要素として保存する（ステップＳ３０７）。Next, the unit waits for and reads the evaluation value entered by the user through the evaluation data input unit 7 (step S306). Once the evaluation value for the presented document has been read, the sum of all evaluation values up to the present is set as α0 and stored as one element of α (step S307).

【０１３３】次に、キーワード同士の組み合わせ（前述
のＳ）が存在しているか否かを調べ（ステップＳ３０
８）、その結果、まだ存在している場合には、次の組み
合わせを生成し（ステップＳ３０９）、生成された組み
合わせに関して係数（前述のα）の計算を行い（ステッ
プＳ３１０）、ステップＳ３０８に戻る。なお、ステッ
プＳ３１０においてｆ（ｘ）はステップＳ３０６におい
て読み込んだ、評価値のことである。Next, it is checked whether any keyword combination (the S described above) remains (step S308). If one remains, the next combination is generated (step S309), the coefficient (the α described above) is calculated for that combination (step S310), and the process returns to step S308. Note that f(x) in step S310 is the evaluation value read in step S306.

【０１３４】ステップＳ３０８での判定の結果、キーワ
ード同士の組み合わせがもう存在しない場合、係数αを
データ記憶部４に保存し（ステップＳ３１１）、処理を
終了する。If the result of determination in step S308 is that there are no more combinations of keywords, the coefficient α is saved in the data storage unit 4 (step S311), and the processing ends.
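The learning loop of steps S306 to S311 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the update rule α(s) = α(s) + f(x)χs(x) quoted later in the text, with χs(x) taken as +1 when an even number of the keywords in s appear in the document and −1 otherwise. The names `chi`, `learn`, and `max_degree`, and the choice of a dictionary keyed by keyword-number sets, are hypothetical.

```python
from itertools import combinations

def chi(s, present):
    """Parity basis: +1 if an even number of keywords in s occur in the document, else -1."""
    return -1 if len(s & present) % 2 else 1

def learn(alpha, registered, present, rating, max_degree=2):
    """Update coefficients for one rated document (sketch of steps S306 to S311).

    alpha      : dict mapping frozenset of keyword numbers -> coefficient;
                 frozenset() holds alpha_0, the running sum of all ratings.
    registered : keyword numbers in the keyword table.
    present    : keyword numbers that appeared in the rated document.
    rating     : the user's evaluation value f(x).
    Missing coefficients default to 0 here, which assumes learning starts
    from an empty table with a fixed keyword set.
    """
    alpha[frozenset()] = alpha.get(frozenset(), 0) + rating      # S307
    for degree in range(1, max_degree + 1):                      # S308/S309
        for combo in combinations(sorted(registered), degree):
            s = frozenset(combo)
            alpha[s] = alpha.get(s, 0) + rating * chi(s, present)  # S310
    return alpha                                                 # S311: caller persists it

alpha = learn({}, {1, 2, 3}, {1}, rating=+1)
```

Enumerating every combination makes the cost grow quickly with the number of registered keywords, which is the motivation for the faster variant described in the third specific example.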

【０１３５】ステップＳ３０４においてαｓを追加する
方法を、具体的に説明する。例えば、図２のような１６
個のキーワードが既に登録されたキーワードテーブルが
あり、ここに新たに“データベース”というキーワード
が登録される場合を考える。この場合、キーワードテー
ブルには新しく“データベース”というキーワードが１
６番目のキーワードとして追加登録されることになる。The method of adding αs in step S304 is now described concretely. For example, consider a keyword table of the kind shown in FIG. 2 in which 16 keywords have already been registered, and suppose the keyword "database" is to be newly registered. In this case, "database" is additionally registered in the keyword table as the 16th keyword.

【０１３６】この時点では、ｓ＝（１６）を要素として
持つαｓはデータとして存在しない。ところが、ここで
“データベース”なる語句は新規に登録されるキーワー
ドであるため、現在までにフィルタリングした文章の中
には存在していなかったことが分かる。At this point, αs having s = (16) as an element does not exist as data. However, since the word “database” is a newly registered keyword, it can be seen that it has not been present in the filtered text up to now.

【０１３７】すなわち、仮にキーワードテーブルに登録
されていたとしても、出現した回数は“０”である。従
って、学習式は α（１６）＝α（１６）＋ｆ（ｘ）χ(16)（ｘ）であるため、学習開始から現在までのχ(16)は常に
“１”（出現数偶数回；偶数には０も含まれる）を返し
ていた筈であり、α（１６）はΣｆ（ｘ）である。That is, even if the keyword had been registered in the keyword table, its number of appearances would have been 0. Since the learning rule is α(16) = α(16) + f(x)χ(16)(x), χ(16) would always have returned 1 from the start of learning to the present (an even number of appearances, where zero counts as even), and α(16) would therefore equal Σf(x).

【０１３８】従って、α（１６）＝α０により与えられ
る。つまり、現在までの全ての評価値の合計をα０と
し、これをαの一要素として保存しておくと共に、新規
登録の“keyword*16 ”については、α（１６）として
α０を用いれば良いことになる。Therefore α(16) = α0. In other words, the sum of all evaluation values up to the present is kept as α0 and stored as one element of α, and for the newly registered "keyword*16", α0 can simply be used as α(16).

【０１３９】また、α（ａ，１６）は、キーワード番号
１６のキーワード（つまり、“データベース”という語
句）が出現していないので、α（ａ）と同じである。こ
こで、ａは１〜１５までのキーワード番号である。Further, α (a, 16) is the same as α (a) because the keyword of keyword number 16 (that is, the phrase “database”) does not appear. Here, a is a keyword number from 1 to 15.

【０１４０】同様に、次数Ｎのαは次数（Ｎ−１）のα
から求めることが可能である。Similarly, α of degree N can be obtained from α of degree (N−1).

【０１４１】以上により、キーワードテーブルに登録し
ていない新しいキーワードが出現した場合にも、現在ま
での全ての評価値の合計をα０とし、これをαの一要素
として保存しておくと共に、新規登録のキーワードにつ
いては、そのキーワードのαとしてα０を用い、他の次
数のαとしては次数１での他のキーワードのものから流
用することで、新規登録に対応できるようになる情報フ
ィルタが得られる。As described above, even when a new keyword not registered in the keyword table appears, new registration can be handled as follows: the sum of all evaluation values up to the present is kept as α0 and stored as one element of α; for the newly registered keyword, α0 is used as its α; and the α values of the other degrees are carried over from the existing coefficients of the other keywords at the degree one lower. An information filter that can accommodate new registrations is thus obtained.
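The initialisation rule for a newly registered keyword (step S304) can be sketched as follows. The sketch assumes coefficients are held in a dictionary keyed by sets of keyword numbers, with the empty set holding α0; the function name `add_keyword` and the `max_degree` cap are hypothetical. Each new combination containing the new keyword simply copies the coefficient of the same combination without it, which yields α(16) = α0 and α(a, 16) = α(a).

```python
def add_keyword(alpha, new_id, max_degree=2):
    """Return coefficients extended for a newly registered keyword.

    alpha  : dict mapping frozenset of keyword numbers -> coefficient,
             where frozenset() holds alpha_0 (the sum of all ratings so far).
    new_id : number assigned to the new keyword in the keyword table.
    The new keyword never appeared in earlier documents, so adding it to a
    combination would not have changed that combination's parity.
    """
    extended = dict(alpha)
    for s, value in alpha.items():
        if len(s) < max_degree:
            extended[s | {new_id}] = value   # alpha(s + new) = alpha(s)
    return extended

alpha = {frozenset(): 5, frozenset({1}): 2, frozenset({1, 2}): -1}
alpha = add_keyword(alpha, 16)
```

In this example the entry for keyword 16 alone receives the value of α0, and the pair (1, 16) receives the value of α(1); the existing degree-2 pair is not extended because of the degree cap.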

【０１４２】第２の具体例の変形例について説明する。A modification of the second specific example will be described.

【０１４３】第２の具体例においては、その推論部５部
分は、第１の具体例（図３）と同様の処理内容で実現で
きる。しかし、図３の処理の流れを若干修正して、図９
のようにしても良い。すなわち、文書から抽出されたキ
ーワードのうち、キーワードテーブルに登録されている
キーワード（登録キーワード）に対応するキーワードの
種類数をＲ、抽出された全キーワードの種類数をＡ、値
“０”〜“１”の間における所望の値に設定した閾値を
ＣＯとした場合、ステップＳ１０８で、Ｒ／Ａ＜ＣＯ
のときは、ステップＳ１０３〜ステップＳ１０５の推
論をせずに、文書を表示するように指示を出す。In the second specific example, the inference unit 5 can be realized with the same processing as in the first specific example (FIG. 3). However, the processing flow of FIG. 3 may be modified slightly as shown in FIG. 9. That is, let R be the number of distinct keywords extracted from the document that correspond to keywords registered in the keyword table (registered keywords), let A be the number of distinct keywords of all kinds extracted from the document, and let CO be a threshold set to a desired value between 0 and 1. In step S108, when R/A < CO, an instruction to display the document is issued without performing the inference of steps S103 to S105.

【０１４４】このようにすると、評価対象の文書から抽
出されたキーワードに含まれる登録キーワード対応のキ
ーワード種類数と文書から抽出されたキーワードの種類
数の比に応じて無条件に文書を提示するといった処理が
でき、抽出されたキーワードの種類数に占める登録キー
ワード数が設定した値に満たない時には、その文書を提
示することで、新分野の文書や、新技術の文書の見落と
しといった弊害発生の阻止を図ることが可能になる。In this way, a document can be presented unconditionally depending on the ratio between the number of distinct extracted keywords that are registered keywords and the number of distinct keywords extracted from the document under evaluation. When the share of registered keywords among the distinct extracted keywords falls short of the set value, the document is presented, which helps prevent documents in new fields or on new technologies from being overlooked.
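The check of step S108 can be sketched as a small predicate. The function name and argument names are hypothetical, and the behaviour for a document with no extracted keywords is not specified in the text, so returning False in that case is an assumption of this sketch.

```python
def should_bypass_inference(extracted, registered, threshold):
    """Return True when the document should be shown without inference (R/A < CO).

    extracted  : keywords extracted from the document (duplicates allowed).
    registered : keywords currently in the keyword table.
    threshold  : the value CO, between 0 and 1.
    """
    kinds = set(extracted)                 # A counts distinct keywords
    if not kinds:
        return False                       # assumption: nothing to judge, fall through
    r = len(kinds & set(registered))       # R counts distinct registered keywords
    return r / len(kinds) < threshold

show_directly = should_bypass_inference(
    ["database", "index", "database", "query"], {"database"}, threshold=0.5
)
```

Here only one of the three distinct keywords is registered, so the ratio falls below the threshold and the document would be displayed directly.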

【０１４５】もちろん、抽出したキーワードのうちキー
ワードテーブルに存在しなかったものについて、第２の
具体例（図８）で説明したように、キーワードテーブル
に登録し、必要なαを追加し、αを学習して以後の推論
に反映させるようにすることもできる。Of course, extracted keywords that were not found in the keyword table can also be registered in the keyword table as described in the second specific example (FIG. 8), with the necessary α added and then learned so that they are reflected in subsequent inference.

【０１４６】（第３の具体例）第３の具体例について説
明する。本具体例は、前述したスペクトル理論に基づい
た推論と学習をそれぞれ高速化するよう工夫した。以下
に説明するスペクトル理論をこれ以降「高速スペクトル
理論」と称する。(Third Concrete Example) A third concrete example will be described. In this example, the inference and learning based on the above-mentioned spectrum theory are speeded up respectively. The spectrum theory described below is hereinafter referred to as "fast spectrum theory".

【０１４７】本具体例に係る情報フィルタの構成は図１
０に示すとおりであり、推論部５を推論部５Ａに学習部
８を学習部８Ａに置き換えてあるが、これらを含めその
他の要素の機能は基本的には第１の具体例（図１）と同
様であるので、同一部分についてのここでの説明は省略
し、第１の具体例と相違する点を主として説明する。The configuration of the information filter according to this specific example is as shown in FIG. 10. The inference unit 5 is replaced with an inference unit 5A and the learning unit 8 with a learning unit 8A, but the functions of these and the other elements are basically the same as in the first specific example (FIG. 1). Description of the common parts is therefore omitted, and mainly the points that differ from the first specific example are described.

【０１４８】さて、情報フィルタに入力される問題べク
トル（つまり、評価しようとする文書から生成した評価
対象の入力ベクトル）の特性として、ほとんどの属性値
が“０”であることが挙げられる。すなわち、１つの文
書の中に出現するキーワードは通常、２０〜３０個であ
り、文書を構成している語句の多くは登録キーワード以
外のことが実験等によりわかっている。As a characteristic of the problem vector (that is, the input vector to be evaluated generated from the document to be evaluated) input to the information filter, most of the attribute values are "0". That is, the number of keywords that appear in one document is usually 20 to 30, and it has been known by experiments that many of the words and phrases that make up the document are not registered keywords.

【０１４９】これに対して、全体の属性の数、すなわ
ち、情報フィルタに登録されているキーワードの数は数
千〜１万個にも及ぶ。ここに着目すると、予測値の計算
式である式（２）において、殆どのχｓ（ｘ）は“１”
を値とすることが分かる。In contrast, the total number of attributes, that is, the number of keywords registered in the information filter, runs from several thousand to ten thousand. With this in mind, it can be seen that in equation (2), the formula for the predicted value, most χs(x) take the value 1.

【０１５０】そこで、予め全ての属性値が“０”であっ
た場合の予測値ｆ（０）を求めておき、ここから、
“０”でなかった属性値が影響を及ぼす部分だけを修正
する方が計算量が少なくて済む。It therefore takes less computation to obtain in advance the predicted value f(0) for the case where all attribute values are 0, and then to correct only the parts affected by attribute values that are not 0.

【０１５１】例えば、１０００個のキーワードが登録さ
れているシステムで、１０個のキーワードを含む文書を
次数２までフィルタリングする場合、第１の具体例の方
式では、１０００＋１０００×１０００回もαｓχｓ
（ｘ）を求めなければならないところを、本具体例のよ
うにすると、１０＋１０×９９０回の計算で済むことに
なり、計算量が大幅に減ることが分かる。ゆえに、その
計算量が減った分、高速処理となる。For example, in a system in which 1000 keywords are registered, when a document containing 10 keywords is filtered up to degree 2, the method of the first specific example has to evaluate αsχs(x) as many as 1000 + 1000 × 1000 times, whereas the method of this specific example needs only 10 + 10 × 990 evaluations. The amount of computation is greatly reduced, and processing is faster by that amount.
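The two tallies quoted above can be reproduced with a few lines of arithmetic. The function below simply follows the counts as stated in the text for degree 2 (it does not try to count unordered pairs more tightly); the name `term_counts` is hypothetical.

```python
def term_counts(registered: int, present: int) -> tuple[int, int]:
    """Return (first-example count, fast count) using the text's degree-2 tallies."""
    full = registered + registered * registered          # 1000 + 1000 x 1000
    fast = present + present * (registered - present)    # 10 + 10 x 990
    return full, fast

full, fast = term_counts(1000, 10)
print(full, fast)  # 1001000 9910
```

With these figures the fast method evaluates roughly one hundredth as many terms.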

【０１５２】これを実現するためには推論部５をこの具
体例では次のようにした推論部５Ａに置き換えてある。In order to realize this, the inference unit 5 is replaced with the inference unit 5A as follows in this specific example.

【０１５３】以下、図１１を参照しながら推論部５Ａの
働きを説明する。The operation of the inference unit 5A is described below with reference to FIG. 11.

【０１５４】図１１は、推論部５Ａによる処理の流れを
示すフローチャートである。推論部５Ａは起動される
と、キーワード抽出部３から文書に現れたキーワードの
一覧を読み込む（ステップＳ４０１）。読み込まれたキ
ーワードのキーワード番号をデータ記憶部４に保存され
ている図２のようなキーワードテーブルを参照すること
により求める（ステップＳ４０２）。その際、キーワー
ド番号を求めようとしている対象のキーワードが、キー
ワードテーブルに登録されていない場合には無視する。FIG. 11 is a flowchart showing the flow of processing by the inference unit 5A. When the inference unit 5A is activated, it reads the list of keywords that appeared in the document from the keyword extraction unit 3 (step S401). The keyword number of each keyword read in is obtained by referring to the keyword table of the kind shown in FIG. 2, stored in the data storage unit 4 (step S402). A keyword that is not registered in the keyword table is ignored at this point.

【０１５５】次に、予測値に取り敢えずｆ（０）を代入
する（ステップＳ４０３）。そして、キーワード同士の
組み合わせが他に存在するか否かを調べ（ステップＳ４
０４）その結果、キーワード同士の組み合わせ（前述の
ｓ）がまだ存在している場合には、文書内に出現したキ
ーワードから奇数個、残りを出現しなかったキーワード
から選び、組合せを生成する（ステップＳ４０５）。こ
れにより生成された組み合わせは、ｆ（０）とｆ（ｘ）
が異なる部分であるため、次にこれを予測値＝予測値−２α０−２αｓなる演算を施すことにより、修正する（ステップＳ４０
６）。Next, f(0) is provisionally assigned to the predicted value (step S403). It is then checked whether any further keyword combination remains (step S404). If a combination (the s described above) remains, a combination is generated by choosing an odd number of keywords from those that appeared in the document and the rest from those that did not appear (step S405). The combinations generated in this way are exactly the terms in which f(0) and f(x) differ, so the predicted value is corrected by the operation predicted value = predicted value − 2α0 − 2αs (step S406).

【０１５６】なお、ここでのα０とは、現在までの評価
値の総和であり、αｓは第１の具体例と異なり、総和と
の差分を記憶していることになる。Here, α0 is the total sum of the evaluation values up to the present, and αs stores the difference from the total sum unlike the first specific example.

【０１５７】ステップＳ４０６におけるこの修正処理を
終えると、ステップＳ４０４に戻る。そして、ステップ
Ｓ４０４での判定の結果、キーワード同士の組み合わせ
がまだ存在すれば上述の処理を繰り返すが、もう存在し
ない場合には、予測値の計算は終了し、予測の結果によ
り文書を利用者に提示するか否かを判断する（ステップ
Ｓ４０７）。この判定は、予測の結果が“０未満”であ
るか否かにより決める。When the correction in step S406 is finished, the process returns to step S404. If the check in step S404 finds that a keyword combination still remains, the above processing is repeated. If none remains, the calculation of the predicted value ends, and whether to present the document to the user is decided from the result of the prediction (step S407). The decision depends on whether the result of the prediction is less than 0.

【０１５８】すなわち、予測の結果が“０未満”であっ
た場合、推論部５Ａは提示の指示を出さず、従って、シ
ステムは利用者に文書を提示しないで終了することにな
る。しかし、予測の結果が“０以上”であった場合には
（ステップＳ４０７）、推論部５Ａは提示の指示を表示
部６に出し（ステップＳ４０８）、処理を終了する。That is, when the result of the prediction is "less than 0", the inference unit 5A does not issue a presentation instruction, so the system ends without presenting the document to the user. However, when the result of the prediction is “0 or more” (step S407), the inference unit 5A issues a presentation instruction to the display unit 6 (step S408) and ends the process.
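The inference of steps S403 to S407 can be sketched as follows, under stated assumptions. Coefficients are stored as differences from α0, as described in the text, so the effective coefficient of a combination s is α0 plus the stored difference; a missing difference is treated as 0. The names `chi`, `direct_score`, `fast_score`, and `delta` are hypothetical, and the empty combination is left out of the sum because it contributes equally to f(0) and f(x). `direct_score` is included only as a reference for checking that the correction gives the same result as summing every term.

```python
from itertools import combinations

def chi(s, present):
    """Parity basis: +1 if an even number of keywords in s occur, else -1."""
    return -1 if len(s & present) % 2 else 1

def direct_score(alpha0, delta, registered, present, max_degree=2):
    """Reference: sum (alpha0 + delta_s) * chi_s(x) over every combination s."""
    total = 0.0
    for degree in range(1, max_degree + 1):
        for combo in combinations(sorted(registered), degree):
            s = frozenset(combo)
            total += (alpha0 + delta.get(s, 0.0)) * chi(s, present)
    return total

def fast_score(f0, alpha0, delta, registered, present, max_degree=2):
    """Start from f(0) and correct only combinations whose parity flips (S403 to S406)."""
    present = present & registered           # unregistered keywords are ignored (S402)
    absent = registered - present
    score = f0                               # S403
    for degree in range(1, max_degree + 1):
        for odd in range(1, min(degree, len(present)) + 1, 2):
            for p in combinations(sorted(present), odd):            # odd number present
                for a in combinations(sorted(absent), degree - odd):  # rest absent
                    s = frozenset(p + a)                            # S405
                    score -= 2 * alpha0 + 2 * delta.get(s, 0.0)     # S406
    return score

registered = frozenset(range(6))
delta = {frozenset({0}): 1.5, frozenset({1, 2}): -2.0, frozenset({3}): 0.5}
alpha0 = 1.0
f0 = direct_score(alpha0, delta, registered, frozenset())   # all attributes 0
score = fast_score(f0, alpha0, delta, registered, frozenset({1, 3}))
present_document = score >= 0                                # S407
```

Only combinations with an odd number of present keywords are visited, so the work scales with the number of keywords in the document rather than with the size of the keyword table.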

【０１５９】この指示を受けて表示部６は文書記憶部２
から当該評価の対象とした文書を読み出して出力部６ｂ
に出力し、当該文書を提示することになる。In response to this instruction, the display unit 6 reads the evaluated document from the document storage unit 2, outputs it to the output unit 6b, and thereby presents the document.

【０１６０】図１２を参照しながら学習部８Ａの働きを
説明する。図１２は、学習部８Ａによる処理の流れを示
すフローチャートである。学習部８Ａが起動されると、
キーワード抽出部３から文書に現れたキーワードの一覧
を読み込む（ステップＳ５０１）。読み込まれたキーワ
ードのキーワード番号をデータ記憶部４に保存されてい
る図２のようなキーワードテーブルを参照しながら求め
る（ステップＳ５０２）。この時に、キーワードがキー
ワードテーブルに登録されていない場合には無視する。The operation of the learning unit 8A is described with reference to FIG. 12. FIG. 12 is a flowchart showing the flow of processing by the learning unit 8A. When the learning unit 8A is activated, it reads the list of keywords that appeared in the document from the keyword extraction unit 3 (step S501). The keyword number of each keyword read in is obtained by referring to the keyword table of the kind shown in FIG. 2, stored in the data storage unit 4 (step S502). A keyword that is not registered in the keyword table is ignored at this point.

【０１６１】次に、α０に評価値を加え（ステップＳ５
０５）、そして、キーワード同士の組み合わせの有無を
調べる（ステップＳ５０６）。このステップＳ５０６で
調べた結果、キーワード同士の組み合わせ（前述のＳ）
がまだ存在している場合、文書内に出現したキーワード
から奇数個、そして、残りを文書内に出現しなかった登
録キーワードから選び、組合せを生成する（ステップＳ
５０７）。これにより、生成された組合わせは、χｓ
（ｘ）が“−１”である部分であるため、αｓを修正す
る（ステップＳ５０８）。Next, the evaluation value is added to α0 (step S505), and it is checked whether any keyword combination remains (step S506). If a combination (the S described above) remains, a combination is generated by choosing an odd number of keywords from those that appeared in the document and the rest from the registered keywords that did not appear in the document (step S507). The combinations generated in this way are the ones for which χs(x) is −1, so αs is corrected (step S508).

【０１６２】更に、ｆ（０）もこれに合わせて修正する
（ｆ（０）＝ｆ（０）−２ｆ（Ｘ））（ステップＳ５０
９）。そして、ステップＳ５０６に戻ってキーワード同
士の組み合わせの有無を調べる。その結果、更に、組合
せが存在すればステップＳ５０７以降の処理を繰り返す
が、ステップＳ５０６での結果、存在しない場合には係
数の計算は終了し、αをデータ記憶部４に保存し（ステ
ップＳ５１０）、ｆ（０）もデータ記憶部４に保存し
（ステップＳ５１１）、終了する。Further, f(0) is corrected accordingly (f(0) = f(0) − 2f(X)) (step S509). The process then returns to step S506 to check whether any keyword combination remains. If a combination remains, the processing from step S507 onward is repeated. If step S506 finds none, the calculation of the coefficients ends, α is stored in the data storage unit 4 (step S510), f(0) is also stored in the data storage unit 4 (step S511), and the process ends.

【０１６３】以上、本具体例は、文書から抽出したキー
ワードについて、登録キーワードと照合し、登録キーワ
ード該当のキーワードであれば、そのキーワード単体及
びキーワード同士の次数別組み合わせを調べて、それぞ
れの予測値を求め、予測値の値からその文書の提示、非
提示を決めるようにした。As described above, in this specific example the keywords extracted from the document are matched against the registered keywords. For those that are registered keywords, the individual keywords and the combinations of keywords at each degree are examined to obtain the predicted value, and whether to present the document is decided from that value.

【０１６４】これにより、処理内容が単純化されること
から、本具体例により、スペクトル理論に基づいた推論
と学習をそれぞれ高速化することが可能になる。As a result, since the processing contents are simplified, it is possible to speed up the inference and learning based on the spectrum theory in this specific example.

【０１６５】（第４の具体例）第４の具体例について説
明する。上述した第１〜第３の具体例は、いずれも１つ
の文書について、推論・表示を行うような逐次処理のシ
ステム構成例であった。第４の具体例では、複数の文書
について、一括して、推論・表示を行うようにした例を
説明する。(Fourth Concrete Example) A fourth concrete example will be described. The above-described first to third specific examples are all examples of a system configuration of sequential processing in which inference / display is performed for one document. In the fourth specific example, an example will be described in which a plurality of documents are collectively inferred and displayed.

【０１６６】本具体例に係る情報フィルタの構成は図１
３に示す如きであり、複数の文書について、一括して、
推論できるようにした推論部５Ｂを用いるようにしたも
のであって、この推論部５Ｂをはじめ、各機能要素は基
本的には前述の具体例（図１）と同様であるから、ここ
での説明は省略し、異なる部分について説明する。The configuration of the information filter according to this specific example is as shown in FIG. 13. It uses an inference unit 5B that can perform inference on a plurality of documents together. Apart from the inference unit 5B, each functional element is basically the same as in the specific example described earlier (FIG. 1), so its description is omitted here and only the differences are described.

【０１６７】本具体例では、推論部５Ｂは推論処理に関
して、図３や図１１で説明したものに若干の修正を加え
た内容とした。ここでは、推論部５Ｂの機能として図３
で説明したものに、若干の修正を加えて実現するように
した例を図１４に示す。In this specific example, the inference processing of the inference unit 5B is a slightly modified version of what was described with reference to FIG. 3 and FIG. 11. FIG. 14 shows an example in which the function of the inference unit 5B is realized by slightly modifying the processing described with reference to FIG. 3.

【０１６８】この具体例では、複数の文書それぞれにつ
いて推論処理のみを先に実施し、その後に、評価の高い
文書から順に表示する。つまり、本具体例は複数の文書
をそれぞれ評価した後に、必要性の高いものを選んで提
示させるようにする例である。In this specific example, only the inference process is first executed for each of the plurality of documents, and then the documents with the highest evaluation are displayed in order. In other words, this specific example is an example in which a plurality of documents are evaluated and then the one with the highest need is selected and presented.

【０１６９】従って、評価対象となる文書は複数文書
分、入力されており、それぞれの文書単位でキーワード
抽出部３はそれぞれその文書に現れたキーワードを抽出
する。Therefore, the documents to be evaluated are input for a plurality of documents, and the keyword extracting unit 3 extracts the keywords appearing in each document for each document.

【０１７０】推論部Ｂ５が起動されると、キーワード抽
出部３から第１の文書に現れたキーワードの一覧を読み
込む（ステップＳ１０１）。キーワードの一覧が読み込
まれたならば、次にこの読み込まれた各キーワードそれ
ぞれについてのそのキーワード番号を、データ記憶部４
に保存されている図２のようなキーワードテーブルを参
照しながら求め、入力ベクトルを生成する（ステップＳ
１０２）。この時に、参照しても見付からないキーワー
ド、つまり、キーワードテーブルに登録されていないキ
ーワードであったならばそれは無視する。When the inference unit 5B is activated, it reads the list of keywords that appeared in the first document from the keyword extraction unit 3 (step S101). Once the list has been read, the keyword number of each keyword is obtained by referring to the keyword table of the kind shown in FIG. 2, stored in the data storage unit 4, and an input vector is generated (step S102). A keyword that cannot be found, that is, one not registered in the keyword table, is ignored.

【０１７１】次にキーワード同士の組み合わせ（前述の
Ｓ）の存在の有無を調べ（ステップＳ１０３）、その結
果、キーワード同士の組み合わせ（前述のＳ）が、まだ
存在している場合には、次の組み合わせを生成し（ステ
ップＳ１０４）、生成された組み合わせに関して予測値
の計算を行い（ステップＳ１０５）、ステップ１０３に
戻る。Next, it is checked whether any keyword combination (the S described above) remains (step S103). If a combination remains, the next combination is generated (step S104), the predicted value is calculated for that combination (step S105), and the process returns to step S103.

【０１７２】ステップＳ１０３での判定の結果、キーワ
ード同士の組み合わせがもう存在しない場合には、予測
値の計算は終了する。ここで、式（２）のｓｉｇｎ関数
に代入する値、式（４）のｈ（ｘ）が得られる。If the check in step S103 finds that no keyword combination remains, the calculation of the predicted value ends. At this point the value to be substituted into the sign function of equation (2), namely h(x) of equation (4), has been obtained.

【０１７３】そして、ステップＳ１１１移り、このステ
ップＳ１１１以下の処理ループにより、各文書につい
て、推論を一括して行う。ここでは、式（２）により表
示すべきと判断されたものについて、式（４）のｈ
（ｘ）を当該文書の必要性の度合いを示す指標として保
存しておく。The process then proceeds to step S111, and the processing loop from step S111 onward performs inference on all the documents together. Here, for each document that equation (2) indicates should be displayed, h(x) of equation (4) is stored as an index of the degree of necessity of that document.

【０１７４】このような処理を、第２の文書、第３の文
書…それぞれに行い、式（２）により表示すべきと判断
されたものについては、式（４）のｈ（ｘ）を当該文書
の必要性の度合いを示す指標として保存しておく。This processing is performed on the second document, the third document, and so on, and for each document that equation (2) indicates should be displayed, h(x) of equation (4) is stored as an index of the degree of necessity of that document.

【０１７５】このような処理が終了後、ステップＳ１１
２に移り、表示すべきと判断された文書を、必要性の度
合いを示す指標ｈ（ｘ）の大きい順にソートする。そし
て、ステップＳ１１３に移り、ソートされた順に対象の
文書を表示するように、表示部６に指示し、当該ソート
順に文書を表示させる。After this processing is finished, the process proceeds to step S112, where the documents judged to be displayed are sorted in descending order of the index h(x) indicating the degree of necessity. The process then proceeds to step S113, where the display unit 6 is instructed to display the documents in the sorted order, and the documents are displayed in that order.

【０１７６】この結果、複数の文書をそれぞれについて
まず評価して、文書の必要性の度合いを示す指標で保存
し、複数の文書をそれぞれについての当該評価を全て終
えた後に指標の高いものを順に表示指示して表示させる
ことができる。As a result, each of the plurality of documents is first evaluated and an index indicating its degree of necessity is stored. After all the documents have been evaluated, they can be displayed in descending order of the index.
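Steps S111 to S113 amount to scoring every document, keeping those whose h(x) passes the sign test, and sorting by h(x). A minimal sketch follows; it assumes the scores have already been computed by the inference step, and the function name `rank_documents` and the use of document identifiers as dictionary keys are hypothetical.

```python
def rank_documents(scores):
    """Return the ids of documents to display, highest h(x) first.

    scores : mapping of document id -> h(x), the value fed to the sign function.
    Documents with h(x) < 0 are judged not worth displaying and are dropped.
    """
    kept = [(doc, h) for doc, h in scores.items() if h >= 0]   # sign(h) = 1
    kept.sort(key=lambda pair: pair[1], reverse=True)          # S112
    return [doc for doc, _ in kept]                            # order used in S113

order = rank_documents({"a": 0.5, "b": -1.0, "c": 2.0, "d": 0.0})
print(order)  # ['c', 'a', 'd']
```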

【０１７７】従って、複数の文書が高頻度で入力される
ような場合に、事前に纏めて評価の後、必要性の高いも
のを選んで提示させることができるので、例えば、毎
日、要不要にかかわりなく、多数のメールが飛び込むネ
ットワークの各端末ユーザのように、取捨選択を必要と
する場合などに、緊急度の高いものや、重要度の高いも
の、或いは興味の高いと評価されるものを、指標の高い
ものから順に読むことができるようになり、便利とな
る。Therefore, when a plurality of documents arrive at high frequency, they can be evaluated together in advance and the most necessary ones selected and presented. For example, a terminal user on a network who receives a large volume of mail every day, wanted or not, and has to sort through it can read the items evaluated as highly urgent, highly important, or of high interest in descending order of the index, which is convenient.

【０１７８】（第１〜第４の具体例の変形例１）第１〜
第４の具体例では、推論結果を得るためのｓｉｇｎ関数
は、ｓｉｇｎ（ｘ）：ｘ≧０ならばｓｉｇｎ（ｘ）＝
１、ｘ＜０ならばｓｉｇｎ（ｘ）＝−１となるような関
数であった。ここで、次のような関数を考える。(Modification 1 of the First to Fourth Specific Examples) In the first to fourth specific examples, the sign function used to obtain the inference result was the function sign(x) such that sign(x) = 1 if x ≧ 0 and sign(x) = −1 if x < 0. Now consider the following function.

【０１７９】ｓｉｇｎ′（ｘ）：ｘ≧ｃならばｓｉｇｎ（ｘ）＝１、ｘ＜ｃならばｓｉｇｎ（ｘ）＝−１この関数において、推論のしきい値ｃ＝０とした場合
が、上記のｓｉｇｎ（ｘ）である。ここで、上記推論の
しきい値ｃは、任意に設定しても構わない。ｃの値を正
側に大きくする程、表示条件が厳しくなり、ｃの値を負
側に大きくする程、表示条件が緩くなる。sign′(x): sign′(x) = 1 if x ≧ c, and sign′(x) = −1 if x < c. With the inference threshold c = 0, this function is the sign(x) above. The inference threshold c may be set arbitrarily. The larger c is made on the positive side, the stricter the display condition becomes, and the larger it is made on the negative side, the looser the display condition becomes.
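The thresholded decision can be written directly. The function name `sign_with_threshold` is hypothetical; `h` stands for the value that would otherwise be passed to sign(x).

```python
def sign_with_threshold(h, c=0.0):
    """Return 1 (display) when h >= c, otherwise -1; c = 0 gives the ordinary sign."""
    return 1 if h >= c else -1

default = sign_with_threshold(0.0)           # 1: same as sign(x) at x = 0
strict = sign_with_threshold(0.5, c=1.0)     # -1: a positive c hides borderline documents
lenient = sign_with_threshold(-0.5, c=-1.0)  # 1: a negative c shows more documents
```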

【０１８０】（第１〜第４の具体例の変形例２）重要と
判定された文書を表示する場合の他の例として、ここで
は推論部５が、文書を利用者に提示すべきであるとの判
断をしたときに、表示部６は文書記憶部２の文書のう
ち、まず、当該文書の目次や要約の部分を読み出してこ
れを出力部６ｂに表示するように制御し、更にこれを見
たユーザが本文を読みたいと判断してその指示を図示し
ない入力操作部から与えたときに、その表示要求に応じ
て、表示部６は文書記憶部２の文書から当該文書の本文
を読み出して出力部６ｂに表示するように制御する機能
を持たせるようにする。(Modification 2 of the First to Fourth Specific Examples) As another way of displaying a document judged to be important, the display unit 6 is given the following function. When the inference unit 5 judges that a document should be presented to the user, the display unit 6 first reads the table of contents or summary of that document from the document storage unit 2 and displays it on the output unit 6b. When the user, having seen this, decides to read the body and gives an instruction from an input operation unit (not shown), the display unit 6 responds to the display request by reading the body of the document from the document storage unit 2 and displaying it on the output unit 6b.

【０１８１】このような構成のシステムでは、推論部５
が文書を利用者に提示すべきであるとの判断をしたと
き、表示部６は文書記憶部２の文書のうち、まず、当該
文書の目次や要約の部分を読み出してこれを出力（表
示）するように制御する。In a system configured in this way, when the inference unit 5 judges that a document should be presented to the user, the display unit 6 first reads the table of contents or summary of that document from the document storage unit 2 and outputs (displays) it.

【０１８２】そして、この出力内容を見たユーザが本文
を読みたいと判断してその指示を、図示しない入力操作
部から与えたとすると、その表示要求に応じて、表示部
６は文書記憶部２の文書から当該文書の本文を読み出し
て出力（表示）する。If the user who sees this output decides to read the body and gives that instruction from an input operation unit (not shown), the display unit 6 responds to the display request by reading the body of the document from the document storage unit 2 and outputting (displaying) it.

【０１８３】このように重要と判定された文書を表示す
る場合、まず、当該文書の目次や要約を表示し、更にユ
ーザが本文を読みたいと判断したときに、ユーザからの
表示要求に応じて、文書の本文を表示するようにする
と、情報フィルタがユーザのために選択した文書の中か
ら、ユーザはより興味のある文書のみを選択して読むこ
とができるようになる。When a document judged to be important is displayed in this way, with its table of contents or summary shown first and its body shown only in response to a display request once the user has decided to read it, the user can select and read only the more interesting documents from among those the information filter has selected.

【０１８４】また、推論にあたり、第１段階として、図
０１の文書入力部１から入力された文書を特定する情報
（文書名や文書コードなど）とキーワード群の組を利用
し、該キーワード群に対して推論を行い、第２段階とし
て、必要と判定されたものについてのみ、文書の本文を
入力してキーワードを抽出し、推論をするようにしても
良い。The inference may also be carried out in two stages. In the first stage, pairs consisting of information identifying a document input from the document input unit 1 of FIG. 1 (such as a document name or document code) and a keyword group are used, and inference is performed on the keyword group. In the second stage, only for documents judged necessary, the body of the document is input, keywords are extracted from it, and inference is performed.

【０１８５】なお、以上の具体例において、キーワード
自体については言及しなかったが、通常の文書を良く表
す単語以外にも、その文書に予め分類がなされている場
合（例えば、特許関係書類におけるＩＰＣ分類（国際特
許分類）など）には、この分野や、著者名、著者所属な
どもキーワードの一部として利用することも可能であ
る。また、本発明は上述した各具体例に限定されるもの
ではなく、その要旨を逸脱しない範囲で、種々変形して
実施することができる。In the specific examples above, nothing was said about the keywords themselves. Besides words that characterize an ordinary document well, when the document has been classified in advance (for example, by the IPC (International Patent Classification) in patent documents), that classified field, the author's name, the author's affiliation, and the like can also be used as keywords. The present invention is not limited to the specific examples described above and can be carried out with various modifications without departing from its gist.

【０１８６】以上により、複数の文書について、一括し
て、推論・表示を行うようにした情報フィルタ装置が得
られる。As described above, it is possible to obtain an information filter device that collectively infers and displays a plurality of documents.

【０１８７】（第５の具体例）上記の具体例ではキーワ
ードは増加して行く一方であったが、時間の経過ととも
に、利用者の興味が変化してゆくことも多く、その場
合、必要でないキーワードが発生する。そして、必要で
ないキーワードを残したままにしておくと、処理にその
分、無駄が生じるばかりでなく、使用者の必要とする文
書の評価に誤りが発生するようになってしまう。そこ
で、必要でないキーワードの除去が重要となるので、当
該必要でないキーワードの除去方法について第５の具体
例として説明する。(Fifth Specific Example) In the specific examples above, the number of keywords only increases. However, the user's interests often change over time, and keywords that are no longer needed then arise. Leaving unneeded keywords in place not only wastes processing but also causes errors in evaluating the documents the user needs. Removing unneeded keywords is therefore important, and a method for doing so is described as a fifth specific example.

【０１８８】ここでは、不要キーワード検出を行うと共
に、不要キーワードが検出された場合に、過去の前記関
係の学習結果から該不要キーワードを削除するキーワー
ド削除機能を有する不要キーワード検出部９を設け、こ
の不要キーワード検出部９が不要キーワードを検出した
場合に、データ記憶部４における過去の前記関係の学習
結果から該データ記憶部４における該不要キーワードを
削除する構成とする。Here, unnecessary keyword detection is performed, and an unnecessary keyword detecting section 9 having a keyword deleting function for deleting the unnecessary keyword from the past learning result of the relation when the unnecessary keyword is detected is provided. When the unnecessary keyword detection unit 9 detects an unnecessary keyword, the unnecessary keyword in the data storage unit 4 is deleted from the past learning result of the relationship in the data storage unit 4.

[0189] That is, in the fifth specific example, as shown in FIG. 15, an unnecessary keyword detection processing unit 9, which has an unnecessary keyword detection function and a function for updating the contents of the keyword table, is added to any of the configurations of FIGS. 1, 7, 10, and 13. Unit 9 detects unnecessary keywords and, based on the result, updates the contents of the keyword table in the data storage unit 4.

[0190] To make this update possible, the keyword table takes the form shown in FIG. 16, which adds columns for the registration date and time and the number of uses to the keyword table of FIG. 2. In addition, the learning units 8, 8a, and 8A record the registration time when a keyword is registered in the keyword table and, each time a keyword is used, update its cumulative use count. These are the only respects in which the configuration differs from the preceding specific examples. Operation otherwise follows the preceding specific examples, so the description of the common parts is omitted and only the differences are described below.

[0191] The feature of this specific example lies in the functions of the unnecessary keyword detection processing unit 9. Of these, the unnecessary keyword detection function is activated by an instruction from the user, at fixed time intervals, or when the filtering speed falls below a certain level.

[0192] When activated, the unnecessary keyword detection processing unit 9 refers to the keyword table of FIG. 16 and searches for keywords that have been registered for at least a certain period and yet have seldom been used. It is for this search that the table of FIG. 16 adds the registration time and use count columns to the keyword table of FIG. 2.

[0193] In this example, keyword number 1, "word processor", was registered on November 10, 1994 and has been used 53 times; keyword number 2, "dictionary", was registered on November 11, 1994 and has been used 21 times; keyword number 3, "induction", was registered on December 10, 1994 and has been used 9 times; keyword number 4, "learning", was registered on December 10, 1994 and has been used 6 times; and so on.

[0194] An outline of the operation of the device of this specific example, configured as described above, is given with reference to FIG. 17. FIG. 17 is a flowchart showing the operation of the fifth specific example.

[0195] First, it is checked whether any keyword remains to be examined for deletion; if so, the next keyword is obtained (step 601). If fewer than a certain number of days have passed since that keyword was registered (step 602), processing returns to step 601 and the next keyword is examined. If at least that number of days has passed (step 602) and the keyword's use count exceeds a constant B, the keyword is not deleted and processing returns to step 601 (step 603). If the use count is B or less (step 603) and is also no more than a constant C that is smaller than B (step 604), the keyword is deleted (step 606). If the use count exceeds C (step 604), the keyword is deleted (step 606) when the absolute value of the first-order value α of its learning coefficient is no more than a constant D (step 605). Otherwise (step 605), the keyword is not deleted and the next keyword is examined (step 601).
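The decision sequence of steps 602 to 605 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `KeywordEntry` type, the function name, and the default thresholds (90 days, B = 10, C = 5, D = 10) are hypothetical and chosen to match the worked example in this section, and the boundary handling (a use count of B or less is a candidate, C or less is deleted outright) is read from that same example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class KeywordEntry:
    """One row of a keyword table with the two extra columns of FIG. 16 (hypothetical layout)."""
    word: str
    registered: date   # registration date
    use_count: int     # cumulative number of uses
    alpha1: float      # first-order coefficient alpha for this keyword


def should_delete(entry, today, min_age_days=90, B=10, C=5, D=10.0):
    """Return True if the keyword would be deleted by steps 602-605 (assumed thresholds)."""
    # Step 602: keywords registered too recently are never deleted.
    if (today - entry.registered).days < min_age_days:
        return False
    # Step 603: keywords used more than B times are kept.
    if entry.use_count > B:
        return False
    # Step 604: keywords used C times or fewer are deleted outright.
    if entry.use_count <= C:
        return True
    # Step 605: in between, delete only if the first-order alpha is close to zero.
    return abs(entry.alpha1) <= D
```

With today's date set to June 1, 1995, an entry registered on December 10, 1994 with 9 uses and a first-order α of 2 is deleted, while one with 6 uses and a first-order α of 100 is kept. The thresholds are parameters because the source states that the specific numbers carry no meaning in themselves.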

[0196] The keyword deletion process is described below using a concrete example.

[0197] When the data learning unit 8 registers a keyword, it records the registration date and time, and each time the keyword is used it increments that keyword's use count in the keyword table. The unnecessary keyword detection unit 9 then refers to the registration time and use count in the keyword table, treats as deletion targets those keywords for which a certain period has passed since initial registration and whose use count is at or below a certain number, and deletes them.

[0198] For example, keywords that were registered three or more months ago and have been used ten times or fewer so far are treated as deletion targets. If today's date is June 1, 1995, then in the keyword table of FIG. 16 keyword numbers 1 through 7 ("word processor" through "optics") are deletion candidates in terms of registration period.

[0199] Looking next at the use counts, keyword number 3 ("induction"), keyword number 4 ("learning"), and keyword number 7 ("optics") are deletion candidates. If keywords used five times or fewer are to be deleted, keyword number 7 ("optics") is deleted in the example of FIG. 16. For keywords used between six and ten times, whether to delete is decided by reference to the first-order value of the coefficient α.

[0200] FIG. 18 shows an example of the coefficients α. The first-order value of α indicates how much each keyword directly contributes to usefulness. A first-order value close to 0 therefore means that the keyword contributes little and may be deleted. With a threshold of 10, keywords whose first-order α satisfies |α| ≤ 10 are deleted.

[0201] If α(3), for keyword number 3, is 2 and α(4), for keyword number 4, is 100, then keyword number 3 ("induction") is selected for deletion and keyword number 4 is not.

[0202] Once the keywords to be deleted have been decided, the entries related to them are removed from the α values. For example, when keyword numbers 3 and 7 are deleted, "α3", "α1,3", "α2,3", "α3,4", "α3,5", ... and "α7", "α1,7", "α2,7", "α3,7", "α4,7", ... are deleted, because each of these elements involves either keyword number 3 or keyword number 7.

[0203] Next, keyword numbers 3 and 7 are deleted from the keyword table. This leaves gaps in the table, so the table is compacted and the keywords are renumbered in order from 1.

[0204] Next, the use counts of all keywords currently registered in the keyword table are reduced by a fixed ratio. Here, for example, they are halved.
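The clean-up steps above (removing the α entries that involve a deleted keyword, renumbering the table from 1, and halving the use counts) can be sketched as follows. The data layout is an assumption made for illustration: the table is a dictionary from keyword number to a `(word, use_count)` pair, and the α values are a dictionary keyed by tuples of keyword numbers. Remapping the surviving α keys to the new numbers and halving with integer division are also assumptions; the source only states that the table is renumbered and the counts are halved.

```python
def prune_keywords(table, alpha, deleted):
    """Remove deleted keywords, renumber the survivors from 1, and halve use counts.

    table:   dict {keyword_number: (word, use_count)}           (hypothetical layout)
    alpha:   dict {(k1,) or (k1, k2) or ...: coefficient value} (hypothetical layout)
    deleted: iterable of keyword numbers to remove
    """
    deleted = set(deleted)

    # Drop every coefficient whose index tuple involves a deleted keyword.
    kept_alpha = {key: value for key, value in alpha.items()
                  if not deleted.intersection(key)}

    # Close the gaps: renumber surviving keywords in order, starting from 1.
    survivors = [number for number in sorted(table) if number not in deleted]
    renumber = {old: new for new, old in enumerate(survivors, start=1)}

    # Halve the use count of every remaining keyword (integer division assumed).
    new_table = {renumber[old]: (table[old][0], table[old][1] // 2)
                 for old in survivors}

    # Re-key the remaining coefficients so they follow the new numbering.
    new_alpha = {tuple(renumber[k] for k in key): value
                 for key, value in kept_alpha.items()}
    return new_table, new_alpha
```

Because the renumbering preserves the original order, index tuples that were sorted before the clean-up remain sorted afterwards.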

[0205] As described above, in this specific example the time of initial registration and the frequency of use are managed for each keyword in the keyword table, and an unnecessary keyword detection processing unit is provided so that keywords for which a certain period has passed since initial registration and whose frequency of use is low can be deleted. Unnecessary keywords can thus be deleted in a way that reflects the learning results, keywords can always be managed to track the user's latest interests, and documents of interest can be selected accurately. The specific thresholds, coefficients, and other numerical values used here should be changed to suit the characteristics required of the system; the values themselves have no particular significance.

[0206] (Sixth Specific Example) When the range of user interests handled by a single information filter device is broad, there is a risk that the number of required keywords will grow explosively. Because the amount of computation increases exponentially with the number of keywords, it is advantageous to divide the keywords and filter by theme with a plurality of information filters. A method of dividing the keywords for such cases is described below.

[0207] In this specific example, a keyword division processing unit 10 that divides the keywords is added to the configuration of the fifth specific example, giving the configuration shown in FIG. 19. The keyword division processing unit 10 has a function of dividing the past learning results of the relationship when the number of keywords reaches or exceeds a certain size.

[0208] The keyword division processing unit 10 is activated by an instruction from the user or when the filtering speed falls below a certain level. When activated, it begins classifying the keywords by reference to the second-order α values.

[0209] A second-order α value (for example, α(1,2)) indicates how much its two keywords (for α(1,2), keyword numbers 1 and 2) together contribute to the usefulness of a document as a whole. Unless the two keywords appear in the same document, the value is 0.

[0210] Accordingly, a value of α(1,2) close to 0 means either that keyword numbers 1 and 2 did not appear together or that the pair contributes little to judging the usefulness of documents. Given a threshold, for example 5, two keywords i and j (keyword numbers i and j) are considered to have a link (connection) if α(i,j) ≥ 5.

[0211] If a line is drawn between each pair of linked keywords, the keywords of FIG. 18, for example, are represented as in FIG. 20.

[0212] In this case the keywords are divided by, for example, placing keyword numbers 1, 2, and 5 in one group and keyword numbers 3, 4, and 6 in another.

[0213] This operation is described concretely with reference to the flowchart of FIG. 21.

[0214] When the keyword division processing unit 10 is activated, it determines whether further division is needed (step 701). The determination is based on whether the ratio of the sizes of the divided keyword groups is at or below a certain value. Immediately after activation no division has been performed, so the size is Mi0 and division is still needed. A starting point is chosen at random from the keywords (step 702). It is then checked whether any keyword can be reached by following links from the chosen starting point. A link between two keywords (a, b) is determined from the absolute value of the second-order α(a, b): if the absolute value is larger than a predetermined value, a link is judged to exist (step 703). While a reachable keyword remains, that keyword is marked (step 704). When no more links can be followed (step 703), the marked keywords are taken out of the old keyword list. Processing then returns to step 701 to check whether further division is needed.

[0215] The division procedure is now shown more concretely. Starting from keyword number 1, linked keywords are searched for in turn. Keyword numbers 1 and 2 are linked, so keyword number 2 joins the same group. Keyword numbers 3, 4, 5, and 6 have no direct link to keyword number 1.

[0216] Next, keywords linked to keyword number 2, which has just joined the group of keyword number 1, are searched for, and keyword number 5 is found. Keyword number 5 is therefore added to the same group.

[0217] Next, keywords linked to keyword number 5 are searched for, but there is no further keyword to add to the group, so the search ends here for the time being. In this example two sets with about the same number of keywords happen to result, which is fine. If, however, the number of keywords in the extracted group is at or below a certain ratio, a keyword not in the group is chosen at random, a new group is found starting from it, and that group is added to the group already extracted.
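The grouping procedure described above amounts to collecting connected components of a graph whose edges are the links between keywords. The following Python sketch illustrates it under stated assumptions: the function names, the `min_ratio` parameter, and the data layout (second-order α values in a dictionary keyed by pairs of keyword numbers) are hypothetical. A link is taken to exist when the absolute value of α(i, j) is at or above the threshold, and the lowest-numbered remaining keyword is used as the starting point instead of a random one so that the result is reproducible.

```python
from collections import deque


def build_links(alpha2, threshold):
    """Build a symmetric adjacency map from second-order alpha values (assumed layout)."""
    links = {}
    for (i, j), value in alpha2.items():
        if abs(value) >= threshold:
            links.setdefault(i, set()).add(j)
            links.setdefault(j, set()).add(i)
    return links


def connected_group(start, links):
    """Collect every keyword reachable from `start` by following links."""
    group = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in links.get(current, ()):
            if neighbour not in group:
                group.add(neighbour)   # mark the keyword as reached
                queue.append(neighbour)
    return group


def split_keywords(keywords, alpha2, threshold=5.0, min_ratio=0.4):
    """Split keywords into two groups; keep merging components until the
    first group holds at least `min_ratio` of all keywords."""
    links = build_links(alpha2, threshold)
    remaining = sorted(keywords)
    first = set()
    while remaining and len(first) < min_ratio * len(keywords):
        seed = remaining[0]  # deterministic stand-in for the random choice
        first |= connected_group(seed, links)
        remaining = [k for k in remaining if k not in first]
    return first, set(keywords) - first
```

With six keywords and links 1-2, 2-5, 3-4, and 4-6, the search from keyword 1 reaches keywords 2 and 5 and then stops, giving the groups {1, 2, 5} and {3, 4, 6} as in the example above.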

[0218] This is repeated until the ratio is reached. Once the keywords have been divided into two groups, the keyword table and the α values held in the data storage unit 4 are copied, and in each copy the keywords that do not belong to the corresponding group are deleted by the deletion method described in the fifth specific example. The learning results can thus be divided in two. This keyword division suppresses the growth in the number of keywords and reduces the amount of computation.

[0219] (Seventh Specific Example) This specific example describes a method in which the inference unit 5 and the data learning unit 8 are improved so that higher-order α values can be obtained with a small storage capacity.

[0220] An n-th order α value is meaningful only when its n keywords appear together. Of the several thousand keywords an information filter handles, however, the probability that a given n of them appear together is extremely low. In particular, the higher n is, the more the number of combinations swells, yet few of the corresponding α values are actually meaningful.

[0221] Therefore only the meaningful α values are stored, and the rest are derived from lower-order α values.

[0222] For example, when obtaining α(a1, a2, a3, ..., an-1, an), if the keywords a1, a2, a3, ..., an-1, an have never all appeared together, it is known that α(a1, a2, a3, ..., an-1, an) can be expressed by α values of order n−1 and lower.

[0223] Take the third-order case α(a1, a2, a3), and suppose that the keywords a1, a2, and a3 have never all appeared together. By the calculation of χ in the first specific example, α(a1, a2, a3) is equivalent to counting the number of times an odd number of the keywords a1, a2, a3 appeared together.

[0224] Accordingly, if the presence of a1, a2, and a3 is written as a vector of 1/0 values (so that (1,1,1) means all three keywords appeared), α(a1, a2, a3) counts the number of times (1,0,0), (0,1,0), (0,0,1), or (1,1,1) occurred.

[0225] Likewise, α(a1, a2) counts the occurrences of (1,0,0), (0,1,0), (1,0,1), and (0,1,1); α(a1, a3) counts (1,0,0), (1,1,0), (0,0,1), and (0,1,1); α(a2, a3) counts (1,1,0), (0,1,0), (1,0,1), and (0,0,1); α(a1) counts (1,0,0), (1,0,1), (1,1,0), and (1,1,1); α(a2) counts (0,1,0), (0,1,1), (1,1,0), and (1,1,1); and α(a3) counts (0,0,1), (1,0,1), (0,1,1), and (1,1,1).

[0226] Hence,

α(a1, a2) + α(a2, a3) + α(a1, a3) − α(a1) − α(a2) − α(a3)
= g(0,0,1) + g(0,1,0) + g(1,0,0) + g(1,1,1) − 4g(1,1,1)
= α(a1, a2, a3) − 4g(1,1,1),

which equals α(a1, a2, a3) when g(1,1,1) = 0. Here g(a, b, c), with a, b, and c each 1 or 0, is a function giving the number of times the pattern (a, b, c) occurred.

[0227] That is, when g(1,1,1) = 0, α(a1, a2, a3) can be expressed by first- and second-order α values. Using this property, the keyword combination and its α value are stored only when the three keywords have appeared together; otherwise α is derived by the method above.
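The relation between the third-order count and the lower-order counts can be checked numerically. The sketch below takes the counting interpretation given above at face value: α for a set of keywords is the number of documents in which an odd number of those keywords appear, and g maps each presence pattern to its number of occurrences. Under that reading, the sum of the three second-order counts minus the sum of the three first-order counts equals α(a1, a2, a3) − 4g(1,1,1). The function names and the sample counts used in the usage note are hypothetical.

```python
def alpha(subset, g):
    """Number of occurrences in which an odd number of the keywords in `subset`
    are present. `subset` holds 0-based positions into each presence pattern."""
    return sum(count for pattern, count in g.items()
               if sum(pattern[i] for i in subset) % 2 == 1)


def alpha3_from_lower(g):
    """Combine first- and second-order counts; the result equals
    alpha((0, 1, 2), g) - 4 * g[(1, 1, 1)]."""
    pairs = alpha((0, 1), g) + alpha((1, 2), g) + alpha((0, 2), g)
    singles = alpha((0,), g) + alpha((1,), g) + alpha((2,), g)
    return pairs - singles
```

For instance, with g(1,0,0) = 3, g(0,1,0) = 2, g(0,0,1) = 5, g(1,1,0) = 1, g(1,0,1) = 2, g(0,1,1) = 6, and g(1,1,1) = 0, both the direct count and the lower-order combination give 10. When the three keywords have appeared together, the two values differ by 4g(1,1,1), which is why that case has to be stored explicitly.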

[0228] In this way there is no need to store an enormous number of α values, and highly accurate prediction is possible with a small storage capacity.

[0229] The present invention can be applied not only as a system that performs information filtering or as a method for information filtering, but also as a program package supplied as an application package to computer systems such as personal computers and workstations so that they perform information filtering. Embodiments distributed as a computer program package are thus also feasible.

【０２３０】[0230]

[Effects of the Invention] As described in detail above, according to the present invention, inference and learning are performed based solely on the relationship between the combinations of keywords extracted from a document and the evaluation value of the document's importance. An information filter is therefore obtained that keeps good estimation accuracy while reducing the amount of computation and performing judgment and learning at high speed.

[0231] Therefore, according to the present invention, the user no longer needs to read documents that are clearly of no interest.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of an information filter according to the first to third specific examples of the present invention.

FIG. 2 is a diagram showing an example of the keyword table used in the present invention.

FIG. 3 is a flowchart showing the flow of inference in the first and second specific examples of the present invention.

FIG. 4 is a flowchart showing the flow of learning in the first specific example of the present invention.

FIG. 5 is a diagram showing another example of the keyword table used in the present invention.

FIG. 6 is a diagram showing the values of each coefficient α used in the present invention before and after learning.

FIG. 7 is a block diagram showing a configuration example of an information filter in the second specific example of the present invention.

FIG. 8 is a flowchart showing the flow of learning in the second specific example of the present invention.

FIG. 9 is a flowchart showing the flow of inference in a modification of the second specific example of the present invention.

FIG. 10 is a block diagram showing the configuration of an information filter in the third specific example of the present invention.

FIG. 11 is a flowchart showing the flow of inference in the third specific example of the present invention.

FIG. 12 is a flowchart showing the flow of learning in the third specific example of the present invention.

FIG. 13 is a block diagram showing the configuration of an information filter in the fourth embodiment of the present invention.

FIG. 14 is a flowchart showing the flow of inference in the fourth specific example of the present invention.

FIG. 15 is a block diagram showing the configuration of an information filter according to the fifth specific example of the present invention.

FIG. 16 is a diagram showing an example of the keyword table used in the fifth specific example of the present invention.

FIG. 17 is a flowchart showing the flow of the deletion process in the fifth specific example of the present invention.

FIG. 18 is a diagram showing an example of the coefficients α used in the fifth specific example of the present invention.

FIG. 19 is a diagram for explaining the sixth specific example of the present invention.

FIG. 20 is a block diagram for explaining the sixth specific example of the present invention.

FIG. 21 is a flowchart showing the flow of the division process according to the sixth specific example of the present invention.

[Explanation of symbols]

1 ... document input unit; 2 ... document storage unit; 3 ... keyword extraction unit; 4 ... data storage unit; 5, 5A, 5B ... inference unit; 6 ... display unit; 6a ... control function unit; 6b ... output unit; 7 ... evaluation data input unit; 8, 8a, 8A ... learning unit; 9 ... unnecessary keyword detection processing unit.

Continuation of the front page

(56) References
JP-A-5-204975 (JP, A)
Nobuhiro Shimogori et al., "Concept of a user model: collection of user information by an information filter," 49th National Convention of the Information Processing Society of Japan (second half of 1994) (5), September 28, 1994, pp. 5-103 to 5-104
Chie Morita et al., "Development of the inference tool KINO," Information Processing Society of Japan Research Report (95-AI-98), January 18, 1995, Vol. 95, No. 4, pp. 29-38
Peter W. Foltz, Susan T. Dumais, Personalized an Analysis, Communications of the ACM, December 1992, Vol. 35, No. 12, pp. 51-60
Masahiro Morita, "Current status and prospects of information filtering technology," IEICE Technical Report (AI93-24), July 22, 1993, Vol. 93, No. 153, pp. 49-56
Sanjiv K. Bhatia, User Profiles for Information Retrieval, Lecture Notes in Artificial Intelligence, USA, October 1991
Paul E. Baclace, Personal Information Intake Filtering, Bellcore Information Filtering Workshop, October 1991, URL, http://www.baclace.net/Resources/ifilterl.html

(58) Fields searched (Int.Cl.⁷, DB name)
G06F 17/30
JICST file (JOIS)
WPI (DIALOG)

Claims

(57) [Claims]

1. An information filter device comprising: keyword extraction means for extracting a plurality of keywords; learning means for generating combinations of keywords each containing one or more of the plurality of keywords, determining, for a document to be learned, whether each keyword contained in a combination appears in the document, assigning a first identifier if the keyword appears and a second identifier if it does not, counting the first identifiers assigned to the combination of keywords, generating a spectrum by assigning to the combination of keywords a third identifier if the count is an even number and a fourth identifier if it is an odd number, calculating a coefficient for the combination of keywords based on the spectrum and an evaluation value associated with the document, and learning by storing the coefficient in association with the combination of keywords; and inference means for inferring the importance of a document to be inferred, based on the coefficient and the spectrum for the combinations of keywords in that document.
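The parity rule recited in claim 1 can be illustrated with a short sketch. The claim fixes only how the spectrum is formed (even count of present keywords gives the third identifier, odd count the fourth) and that a coefficient is derived from the spectrum and the evaluation value. The encoding of the identifiers as +1 and -1, the averaging rule in `learn`, the summation rule in `infer`, the `max_size` cap, and all function names are assumptions made for illustration, not details taken from the claims.

```python
from itertools import combinations


def keyword_combinations(keywords, max_size=2):
    """Combinations containing one or more keywords (size cap is an assumption)."""
    return [frozenset(c)
            for r in range(1, max_size + 1)
            for c in combinations(sorted(keywords), r)]


def spectrum(combo, doc_keywords):
    """Parity identifier of one keyword combination for one document.

    Keywords of the combination that appear in the document count as
    'first identifiers'. An even count maps to +1 (third identifier) and
    an odd count to -1 (fourth identifier); the +1/-1 encoding is assumed.
    """
    present = sum(1 for k in combo if k in doc_keywords)
    return 1 if present % 2 == 0 else -1


def learn(training, keywords, max_size=2):
    """Assumed coefficient rule: mean of spectrum * evaluation value.

    `training` is a list of (set_of_keywords_in_document, evaluation_value).
    """
    n = len(training)
    return {c: sum(spectrum(c, doc) * score for doc, score in training) / n
            for c in keyword_combinations(keywords, max_size)}


def infer(coeffs, doc_keywords):
    """Assumed inference rule: sum of coefficient * spectrum over stored combinations."""
    return sum(w * spectrum(c, doc_keywords) for c, w in coeffs.items())


# One keyword, two training documents: one contains it and was rated +1,
# the other does not and was rated -1.
coeffs = learn([({"a"}, 1.0), (set(), -1.0)], {"a"})
print(infer(coeffs, {"a"}), infer(coeffs, set()))
```

With this toy data the inferred values reproduce the two training evaluations, which is the behaviour one would expect from a parity-based (Walsh-style) expansion under the stated assumptions.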

2. The information filter device according to claim 1, wherein the learning means further comprises keyword adding means for adding, when a new keyword other than the keywords extracted by the keyword extraction means is extracted, the new keyword to the past learning result of the learning means.

3. The information filter device according to claim 1, wherein the learning means learns, at the time of the learning, only the range influenced by the keywords extracted from the document, and the inference means holds in advance the importance of a document for the case where the keywords extracted from the document to be inferred include none of the keywords extracted by the extraction means, and, when inferring the importance of an input document, if the keywords extracted from the document include a stored keyword extracted by the extraction means, obtains a value by which that keyword varies the held importance and corrects the held importance based on this value, thereby determining the importance of the document.
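The inference described in claim 3 (a held default importance that is corrected by each known keyword found in the document) might be sketched as follows. The claim does not say how the correction is applied; the additive update, the dictionary of per-keyword correction values, and the names `infer_with_default` and `known_corrections` are assumptions for illustration only.

```python
def infer_with_default(doc_keywords, known_corrections, default_importance=0.0):
    """Start from the importance held for 'no known keyword present' and
    correct it once for each stored keyword that appears in the document.

    known_corrections maps a stored keyword to the value by which it varies
    the held importance (assumed here to be applied additively).
    """
    importance = default_importance
    for keyword in doc_keywords:
        if keyword in known_corrections:       # keyword known from learning
            importance += known_corrections[keyword]
    return importance


corrections = {"a": 0.5, "b": -0.125}          # hypothetical learned values
print(infer_with_default({"z"}, corrections, 0.25))        # no known keyword
print(infer_with_default({"a", "z"}, corrections, 0.25))   # one known keyword
```

A document containing no stored keyword simply receives the held default, so only the keywords actually present need to be visited at inference time.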

4. The information filter device according to claim 1 or 2, configured so that, when the ratio of keywords already extracted by the extraction means to all the keywords extracted from the document is less than a predetermined value, the document is displayed to the user without the inference by the inference means being performed.
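The gating condition of claim 4 reduces to a ratio test, sketched below. The threshold value of 0.5, the treatment of a document with no keywords, and the function name `should_skip_inference` are assumptions; the claim only states that the document is shown without inference when the ratio falls below a predetermined value.

```python
def should_skip_inference(doc_keywords, known_keywords, min_ratio=0.5):
    """Return True when too few of the document's keywords are already known.

    The ratio is (known keywords found in the document) / (all keywords
    extracted from the document). Below min_ratio the document would be
    shown to the user directly instead of being scored.
    """
    doc_keywords = set(doc_keywords)
    if not doc_keywords:
        return True   # assumption: nothing to base an inference on
    ratio = len(doc_keywords & set(known_keywords)) / len(doc_keywords)
    return ratio < min_ratio


print(should_skip_inference({"a", "b", "c", "d"}, {"a"}))   # ratio 0.25
print(should_skip_inference({"a", "b"}, {"a", "b"}))        # ratio 1.0
```

Showing poorly covered documents directly also gives the user a chance to evaluate them, which in turn supplies training data for the unfamiliar keywords.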

5. The information filter device according to claim 1, wherein the learning means further comprises unnecessary keyword detection means that detects an unnecessary keyword in the learning result and, when an unnecessary keyword is detected, deletes the unnecessary keyword from the past learning result.
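Claim 5 does not state how a keyword is judged unnecessary. As one possible reading, the sketch below treats a keyword as unnecessary when every stored coefficient of a combination containing it is close to zero, and then removes those combinations from the learning result. The criterion, the `threshold` value, and the name `prune_unnecessary_keywords` are all assumptions for illustration.

```python
def prune_unnecessary_keywords(coeffs, threshold=0.05):
    """Remove keywords whose combinations all carry near-zero coefficients.

    coeffs maps a frozenset of keywords (a combination) to its coefficient.
    Returns the pruned mapping and the set of keywords judged unnecessary
    under the assumed criterion.
    """
    keywords = set().union(*coeffs)            # every keyword in any combination
    unnecessary = {
        k for k in keywords
        if all(abs(w) < threshold for c, w in coeffs.items() if k in c)
    }
    pruned = {c: w for c, w in coeffs.items() if not (c & unnecessary)}
    return pruned, unnecessary


learned = {                                    # hypothetical learning result
    frozenset({"a"}): 0.8,
    frozenset({"b"}): 0.01,
    frozenset({"a", "b"}): 0.02,
}
pruned, removed = prune_unnecessary_keywords(learned)
print(pruned, removed)
```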

6. The information filter device according to claim 1, further comprising keyword dividing means for dividing the past learning result when the number of keywords extracted by the extraction means is equal to or greater than a predetermined number.

7. An information filtering method comprising: extracting a plurality of keywords; generating combinations of keywords each containing one or more of the plurality of keywords; determining whether each keyword contained in a combination of keywords appears in a document to be learned, and assigning a first identifier if the keyword appears and a second identifier if it does not; counting the first identifiers assigned to the combination of keywords, and generating a spectrum by assigning to the combination of keywords a third identifier when the count is an even number and a fourth identifier when it is an odd number; calculating a coefficient for the combination of keywords based on the spectrum and an evaluation value associated with the document; learning by storing the coefficient in association with the combination of keywords; and inferring the importance of a document to be inferred, based on the coefficient and the spectrum for the combinations of keywords in that document.

8. The information filtering method according to claim 7, wherein the learning further comprises a step of adding a new keyword when a new keyword other than the keywords extracted by the extracting means is extracted.

9. The information filtering method according to claim 7 or 8, wherein the learning comprises learning only the range influenced by the keywords extracted from the document, and the inferring comprises holding in advance the importance of a document for the case where the keywords extracted from the document to be inferred include none of the keywords extracted by the extracting means, and, when inferring the importance of an input document, if the keywords extracted from the document include a stored keyword extracted by the extracting means, obtaining a value by which that keyword varies the held importance and correcting the held importance based on this value, thereby obtaining the importance of the document.