JPH11296523A

JPH11296523A - Method and device for filtering information

Info

Publication number: JPH11296523A
Application number: JP10112757A
Authority: JP
Inventors: Takeshi Sugai; 猛菅井; Hiromi Haniyuda; 博美羽生田
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1998-04-08
Filing date: 1998-04-08
Publication date: 1999-10-29

Abstract

PROBLEM TO BE SOLVED: To enable a user to easily grasp the relations between the base words of a profile and to correct the profile efficiently. SOLUTION: When the user inputs a profile (step S2), a filtering part performs filtering according to the profile (step S6). An interface management part displays the filtering result and the user evaluates it (step S7). The filtering part changes the profile according to the evaluation result (step S8). A visualizing interface management part visually displays the base words of the profile changed by the filtering part in two dimensions so that the more closely related two base words are, the shorter the distance between them (step S11).

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an information filtering method and device for filtering information resources, and more particularly to the display of the base words of a profile.

[0002]

2. Description of the Related Art

The following documents describe conventional information filtering techniques.

[1] Nicholas J. Belkin, W. Bruce Croft, Information Filtering and Information Retrieval: Two Sides of the Same Coin?, Communications of the ACM, 35(12), pp. 29-38, 1992

[0003] [2] Chris Buckley, Gerard Salton, James Allan, The Effect of Adding Relevance Information in a Relevance Feedback Environment, SIGIR '94 Proceedings, 1994, pp. 292-300

[0004] [3] Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman, Indexing by Latent Semantic Analysis, Journal of the American Society for Information Science, 41(6), 1990, pp. 391-407

[0005] With the spread of computer networks, information filtering techniques have been developed that deliver to a user only the information the user wants from sources that change from moment to moment, such as e-mail and net news. To follow a user's interests, which shift little by little from day to day, information filtering systems generally adapt the user's profile through relevance feedback or a learning algorithm and filter according to the updated profile.

[0006] Here, a profile means almost the same thing as a search statement (or query) in information retrieval, except that it expresses the user's search interest over a certain length of time. Filtering is performed according to the similarity between the content of a text and the profile, which expresses the information the user wants.

[0007] Information filtering generally proceeds as follows.
1. The user inputs an initial profile.
2. The information filtering device compares the user's profile with the delivered text, performs filtering, and displays the filtering result to the user.
3. The user inputs an evaluation of the filtering result.
4. Based on the information from step 3, the information filtering device modifies the profile.
5. Steps 2 to 4 are repeated.
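The loop in steps 2 to 4 can be sketched as follows. This is only an illustrative outline: the function name `filtering_loop` and the three callables it takes are hypothetical and are not named in the publication, which leaves the concrete filtering, evaluation, and update procedures to later paragraphs.

```python
from typing import Callable, Iterable, TypeVar

Profile = TypeVar("Profile")
Text = TypeVar("Text")


def filtering_loop(
    profile: Profile,
    batches: Iterable[list[Text]],
    filter_fn: Callable[[Profile, list[Text]], list[Text]],
    evaluate_fn: Callable[[list[Text]], list[Text]],
    update_fn: Callable[[Profile, list[Text]], Profile],
) -> Profile:
    """Run steps 2 to 4 once for every batch of delivered texts."""
    for texts in batches:
        results = filter_fn(profile, texts)      # step 2: filter and show results
        feedback = evaluate_fn(results)          # step 3: user evaluates results
        profile = update_fn(profile, feedback)   # step 4: device modifies profile
    return profile
```

Passing the three stages in as functions keeps the loop independent of how similarity and feedback are actually computed.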

[0008] Further, when the user wants to modify the profile deliberately, the following procedure is used.
1. The user inputs an initial profile or a modified profile.
2. The information filtering device compares the user's profile with the delivered text, performs filtering, and displays the filtering result to the user.
3. The information filtering device presents the profile to the user as it is.
4. The user modifies the profile.
5. Based on the information from step 4, the information filtering device modifies the profile.
6. Steps 1 to 5 are repeated.

[0009]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, when the user evaluates filtering results and the profile is refined through relevance feedback, the user may sometimes want to modify the profile directly, for example when the filtering results do not match the user's interests.

[0010] In such a case, however, even if the system presents the base words of the profile to the user as they are, the user cannot tell how the base words relate to one another or how they relate to the filtering results (documents). As a result, even when the user tries to modify the profile directly, the modification cannot be carried out efficiently.

[0011] For these reasons, there has been a demand for an information filtering method and device that allow the user to easily grasp the relations between the base words of a profile and to modify the profile efficiently.

[0012]

MEANS FOR SOLVING THE PROBLEMS

To solve the above problems, the present invention adopts the following configurations.

<Configuration of Claim 1> An information filtering method in which, first, an information source is filtered based on the base words of a profile input by a user and the filtering result is displayed; and next, when the base words of the profile are changed according to the scores the user assigned on the basis of the displayed filtering result and are displayed again, the base words are displayed visually so that the more strongly the base words of the profile are related in the filtering, the shorter the distance between them.

[0013] <Explanation of Claim 1> A profile is a search statement expressing the user's search interest over a certain length of time. A base word is, in the case of English for example, a word obtained by excluding stop words such as "the", "a", and "to", which carry no meaning as base words, and by reducing inflected forms to their basic form. Displaying a filtering result means extracting information related to the user's profile from the information source and displaying it.

[0014] In the invention of claim 1, the information source is first filtered with the profile input by the user, and the result is displayed. The user then looks at the displayed content and inputs, for example, whether or not each result is of interest. As a result, base words are extracted for a profile that differs from the one the user first input. When these base words are displayed, they are laid out visually, for example in two dimensions, so that the more strongly two base words are related in the filtering, the shorter the distance between them.

[0015] Because the invention of claim 1 operates in this way, the user can efficiently delete unnecessary base words, and consequently only information resources closer to the user's interests are filtered.

[0016] <Configuration of Claim 2> The information filtering method according to claim 1, in which, in addition to displaying each base word, base words whose mutual relation falls within a predetermined range of values are clustered, and for each cluster, an importance indicating the weight of that cluster in the filtering result and a representative base word of the cluster are displayed.
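A minimal sketch of such clustering is given below. The claim does not fix a procedure, so several choices here are assumptions made only for illustration: the relation between two base words is taken to be the Euclidean distance between their display coordinates, words closer than `max_dist` are merged transitively, a cluster's importance is the sum of its members' weights, and the representative is the member with the largest weight. The function name `cluster_base_words` is hypothetical.

```python
import math


def cluster_base_words(coords, weights, max_dist):
    """Group base words that lie within max_dist of each other (transitively)."""
    words = list(coords)
    parent = {w: w for w in words}

    def find(w):
        # Follow parent links to the root of the cluster, compressing the path.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            if math.dist(coords[a], coords[b]) <= max_dist:
                parent[find(a)] = find(b)  # merge the two clusters

    groups = {}
    for w in words:
        groups.setdefault(find(w), []).append(w)

    return [
        {
            "representative": max(members, key=lambda w: weights[w]),  # assumed rule
            "importance": sum(weights[w] for w in members),            # assumed rule
            "members": sorted(members),
        }
        for members in groups.values()
    ]
```

With many base words on screen, showing only each cluster's representative and importance gives the reduced view that the claim describes.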

[0017] <Explanation of Claim 2> The invention of claim 2 extends the invention of claim 1 by displaying a plurality of similar base words as one group, together with the representative base word of the group and the importance of the group. In addition to the effect of the invention of claim 1, this makes the relations between base words easy for the user to understand even when many base words must be displayed, so that the user can change the profile even more efficiently.

[0018] <Configuration of Claim 3> The information filtering method according to claim 1 or 2, in which, in addition to displaying each base word of the profile, the items of information obtained as the result of filtering with the profile are displayed visually so that the more strongly two items are related, the shorter the distance between them, and the more strongly an item and a base word are related, the shorter the distance between them.

[0019] <Explanation of Claim 3> The invention of claim 3 extends the invention of claim 1 or 2 by also displaying visually the information that represents the filtering result. Here, the information that represents the filtering result is, for example, a document related to the profile. Such documents are displayed together with the base words. Because the user can then easily grasp the relations between the base words and the information, such as documents, that the user has evaluated, the profile can be changed efficiently.

[0020] <Configuration of Claim 4> An information filtering device comprising: a filtering unit that filters an information source based on the base words of a profile input by a user, outputs the result, and, when evaluation information is input by the user, changes the profile based on that evaluation information; an interface management unit that receives instructions from the user and presents the results from the filtering unit to the user; and a visualization interface management unit that, when the filtering unit has changed the profile, visually displays the base words so that the more strongly the base words of the profile are related in the filtering, the shorter the distance between them.

[0021] <Explanation of Claim 4> The filtering unit filters the information source based on the base words of the profile input by the user and outputs the result to the interface management unit. The interface management unit displays this result to the user and, when the user inputs evaluation information for the result, passes the evaluation information to the filtering unit. The filtering unit then changes the profile. When the filtering unit changes the profile, the visualization interface management unit visually displays the base words so that the more strongly the base words of the profile are related in the filtering, the shorter the distance between them.

[0022] Because the invention of claim 4 operates in this way, the user can efficiently delete unnecessary base words, and consequently only information resources closer to the user's interests are filtered.

[0023] <Configuration of Claim 5> The information filtering device according to claim 4, comprising: a clustering unit that clusters base words whose mutual relation falls within a predetermined range of values and outputs, for each cluster, an importance indicating the weight of that cluster in the filtering result and a representative base word of the cluster; and a visualization interface management unit that visually displays the importance and the representative base word of each cluster output from the clustering unit.

[0024] <Explanation of Claim 5> The invention of claim 5 adds a clustering unit to the invention of claim 4, and the visualization interface management unit visually displays the display data from the clustering unit. In addition to the effect of the invention of claim 4, this makes the relations between base words easy for the user to understand even when many base words must be displayed, so that the user can change the profile even more efficiently.

[0025] <Configuration of Claim 6> The information filtering device according to claim 4 or 5, comprising: a document visualization information generation unit that outputs visual positional data for the items of information obtained as the result of filtering with the profile of the filtering unit, such that the more strongly two items are related, the shorter the distance between them, and the more strongly an item and a base word are related, the shorter the distance between them; and a visualization interface management unit that visually displays the output of the document visualization information generation unit.

[0026] <Explanation of Claim 6> The invention of claim 6 adds a document visualization information generation unit to the invention of claim 4 or 5, and the visualization interface management unit visually displays the display data from the document visualization information generation unit. Because the user can then easily grasp the relations between the base words and the information, such as documents, that the user has evaluated, the profile can be changed efficiently.

[0027]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments of the present invention are described below in detail with reference to the drawings.

<<Embodiment 1>>

<Configuration> FIG. 1 is a flowchart showing an information filtering method according to Embodiment 1 of the present invention. Before describing it, the information filtering device that carries out the method is described.

[0028] FIG. 2 is a block diagram showing Embodiment 1 of the information filtering device of the present invention. The illustrated device consists of a filtering unit 1, a profile storage unit 2, an information providing unit 3, an interface management unit 4, and a visualization interface management unit 5. These components are implemented on a general computer system consisting of a CPU (central processing unit), a main storage device, an external storage device, and input/output devices.

[0029] The filtering unit 1 is a functional unit that filters the information source 10, provided through the information providing unit 3, based on the base words of the profile input by the user 20 (or of a profile stored in the profile storage unit 2), and outputs the result to the interface management unit 4. When evaluation information from the user 20 is input through the interface management unit 4, the filtering unit 1 changes the profile based on that evaluation information and stores it in the profile storage unit 2.

[0030] The vector space model and relevance feedback, the information filtering techniques used in this embodiment, are described next.

[0031] The vector space model is a retrieval technique that treats the words in a text as a vector (see, for example, reference [3]). A text is split into words, each word is assigned an importance, and the result is used as a feature vector. On a distributed network, information resources generally include image data, video data, compressed files, and so on, but in this embodiment information resources are limited to text. A natural-language sentence is used as the query and, like the words of a text (or document), is converted into a feature vector. The search result is a ranking of texts by their similarity to the query. Hereinafter, words are referred to as base words. As described in the specification of Japanese Patent Application No. Hei 9-157909, filed earlier by the inventors, base words constitute the axes of the vectors in the vector space model. The importance of a base word is a value representing the weight of that base word with respect to a document or other item obtained as a filtering result.

[0032] Taking English information resources as an example, base words are obtained as follows. Stop words are removed first, and then stemming is performed. Stemming is an operation that reduces an inflected English word to its basic form, and the word that results from stemming is the base word of the original word. For example, words stemmed as in ties → tie and hopping → hop are base words. Words that carry no meaning as base words, such as "the", "a", and "to", are called stop words.
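The two steps, stop-word removal followed by stemming, can be sketched as below. The stop-word list contains only the three words quoted in the text, and `toy_stem` is a deliberately tiny, hypothetical rule set written for this illustration (it handles cases such as hopping → hop); it is not the stemmer used by the inventors, and a real system would use an established stemming algorithm.

```python
import re

# Only the stop words quoted in the text; a real list would be much longer.
STOP_WORDS = {"the", "a", "to"}


def toy_stem(word: str) -> str:
    """Very rough stemming rules, for illustration only."""
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]
        # Undo consonant doubling, e.g. "hopp" -> "hop".
        if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
            word = word[:-1]
    elif word.endswith("s") and not word.endswith("ss") and len(word) > 3:
        word = word[:-1]
    return word


def base_words(text: str) -> list[str]:
    """Remove stop words first, then stem what remains."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [toy_stem(t) for t in tokens if t not in STOP_WORDS]
```

For instance, `base_words("The ties to a hopping")` yields `["tie", "hop"]` under these toy rules.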

[0033] The vector of a text is expressed as follows. FIG. 3 illustrates the equations of the vector space model. In FIG. 3, equation (1) expresses the vector DW of a text. In equation (1), dw_1, dw_2, ..., dw_t are the importances of the base words of that text.

[0034] Similarly, the vector Q of a query is expressed by equation (2), where q_1, q_2, ..., q_t are the importances of the base words of the query.

[0035] The importance of a base word (the weight of the vector component for word T_k in document D_i) is given by equation (3), where the symbols have the following meanings.
- W_ik: weight of the vector component for base word T_k in document D_i
- tf_ik: number of occurrences of base word T_k in document D_i
- N: total number of collected documents
- n_k: number of collected documents that contain base word T_k
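Equation (3) itself appears only in FIG. 3, which is not reproduced in this text. Given the symbols listed, a plausible reading is a tf-idf weight; the sketch below assumes the common unnormalized form W_ik = tf_ik · log(N / n_k). That form, and the function name `tfidf_weights`, are assumptions for illustration, and the actual equation may include a normalization term.

```python
import math
from collections import Counter


def tfidf_weights(docs: list[list[str]]) -> list[dict[str, float]]:
    """Return, per document, a map from base word to tf * log(N / n_k)."""
    n_docs = len(docs)                                        # N
    doc_freq = Counter(t for doc in docs for t in set(doc))   # n_k per base word
    weights = []
    for doc in docs:
        tf = Counter(doc)                                     # tf_ik per base word
        weights.append(
            {t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf}
        )
    return weights
```

Under this form, a base word that occurs in every collected document receives weight 0, since log(N / N) = 0.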

[0036] The similarity Sim is given by equation (4). A threshold, such as θ in equation (5), is also applied. Among the results that satisfy equation (3), the search results are displayed in descending order of similarity.
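Equations (4) and (5) are likewise shown only in FIG. 3. The sketch below assumes that Sim is the cosine of the angle between the profile vector and the document vector, which is the usual choice in the vector space model, and that the threshold test is Sim ≥ θ. Both are assumptions, and the function names are hypothetical.

```python
import math


def cosine_similarity(q: dict[str, float], d: dict[str, float]) -> float:
    """Cosine similarity between two sparse base-word vectors."""
    dot = sum(w * d.get(t, 0.0) for t, w in q.items())
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    norm_d = math.sqrt(sum(w * w for w in d.values()))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0
    return dot / (norm_q * norm_d)


def rank_documents(profile, docs, theta):
    """Keep documents with similarity >= theta, most similar first."""
    scored = [(doc_id, cosine_similarity(profile, vec)) for doc_id, vec in docs.items()]
    kept = [(doc_id, sim) for doc_id, sim in scored if sim >= theta]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Raising `theta` trades recall for precision: fewer documents pass the filter, but those that do are closer to the profile.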

[0037] Relevance feedback is a method of refining the search expression in which the user evaluates the retrieved documents and the vectors of the documents the user judged relevant are fed back into the query (relevance feedback is described, for example, in reference [2]).

[0038] Various methods have been proposed for relevance feedback, but equation (6) in the figure is generally used. Here, rel_docs denotes the feature vectors of the retrieved documents that the user is interested in, and nonrel_docs denotes the feature vectors of the retrieved documents that the user is not interested in. In general, the values 8, 16, and 4 are used for α, β, and γ, respectively.
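Equation (6) is shown only in the figure. The sketch below assumes the standard Rocchio form, Q_new = α·Q_old + β·mean(rel_docs) − γ·mean(nonrel_docs), with the default coefficients 8, 16, and 4 taken from the text. Averaging over the document sets and the function name `rocchio_update` are assumptions; the figure may, for example, sum rather than average.

```python
def rocchio_update(profile, rel_docs, nonrel_docs, alpha=8.0, beta=16.0, gamma=4.0):
    """Return an updated profile vector from relevant / non-relevant documents."""
    terms = set(profile)
    for doc in rel_docs + nonrel_docs:
        terms |= set(doc)

    def mean_weight(docs, term):
        # Average weight of `term` over a list of sparse document vectors.
        if not docs:
            return 0.0
        return sum(doc.get(term, 0.0) for doc in docs) / len(docs)

    return {
        t: alpha * profile.get(t, 0.0)
        + beta * mean_weight(rel_docs, t)
        - gamma * mean_weight(nonrel_docs, t)
        for t in terms
    }
```

Base words that occur only in documents the user rejected end up with negative weights here; whether such weights are kept or clipped to zero is a design choice that the text does not specify.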

[0039] Returning to FIG. 2, the profile storage unit 2 is a functional unit that stores the profile used when the filtering unit 1 performs filtering. The profile storage unit 2 is provided in a storage device such as a magnetic disk device.

[0040] The information providing unit 3 has a function of sending the identifiers (ids) of the information resources in the information source 10 to the filtering unit 1. Here, an information resource is one unit of information contained in the information source 10, and an identifier is information for identifying each information resource. In this embodiment, this identifier is called a document identifier.

[0041] The interface management unit 4 has a function of accepting input from the user 20, sending the information input by the user 20 to the filtering unit 1, and displaying the filtering result of the filtering unit 1 to the user 20.

[0042] When the filtering unit 1 has changed the profile and stored it in the profile storage unit 2, the visualization interface management unit 5 has a function of visually displaying the base words, based on the stored profile, so that the more strongly the base words of the profile are related in the filtering, the shorter the distance between them. After the visual display, the visualization interface management unit 5 also has a function of receiving instructions from the user 20 to change base words and sending the changed profile to the profile storage unit 2.

[0043] That is, when displaying the user's profile visually in two dimensions, the visualization interface management unit 5 uses LSI (Latent Semantic Indexing) to display the base words of the profile, so that semantically similar base words are shown close together in the two-dimensional plane. LSI is used here as one method of visualization and is described next.

[0044] LSI performs singular value decomposition to orthogonalize the feature vectors (see, for example, reference [3]).

[0045] FIG. 4 illustrates the computation of the singular value decomposition. The singular value decomposition of an m × n matrix A (m ≤ n, rank(A) = r) is defined as in equation (7). In the figure, rank(A) denotes the rank of the matrix. Here, U^T U = V^T V = I_n and Σ = diag(σ_1, ..., σ_n), where σ_i > 0 for 1 ≤ i ≤ r and σ_j = 0 for j ≥ r + 1.

[0046] Here, U is called the matrix of left singular vectors and is a column-orthogonal matrix. V is called the matrix of right singular vectors and is also a column-orthogonal matrix. To reduce the amount of computation, equation (8) in the figure, in which the number of dimensions of the matrix is reduced, is often used; for example, the dimensionality is reduced from m to k. In the figure, A with a tilde (~) above it is the matrix A with its rank lowered.
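The rank reduction can be sketched with NumPy as below. Equations (7) and (8) appear only in FIG. 4, so the sketch assumes the usual forms A = U Σ V^T and A~ = U_k Σ_k V_k^T, and it assumes that the rows of A correspond to base words and the columns to documents, as in reference [3]. Using the rows of U_k Σ_k as display coordinates for the base words (with k = 2 for a two-dimensional view) is one common LSI convention, not necessarily the layout used by the device; `lsi_truncate` is a hypothetical name.

```python
import numpy as np


def lsi_truncate(A: np.ndarray, k: int):
    """Return the rank-k approximation of A and k-dimensional term coordinates."""
    # Thin SVD: A = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    A_k = U_k @ np.diag(s_k) @ Vt_k   # rank-k approximation of A
    term_coords = U_k * s_k           # row i: coordinates of base word i
    return A_k, term_coords
```

Base words that occur in the same documents with the same weights have identical rows in A and therefore land on the same coordinates, which is the behavior that lets related base words appear close together.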

[0047] The query is also mapped into the k-dimensional space and compared with the texts. The query can be defined as in equation (9) in the figure, where q with a tilde (~) above it denotes the query. A search is carried out by computing the similarity between the feature vector of the query and the feature vector of each document. The present invention uses only the equations up to equation (8), but equation (9) is also shown in order to explain LSI.

[0048] Returning to FIG. 2, the information source 10 consists of a plurality of information resources as described above and contains document and image data. The filtering device is connected to the information source 10 through a computer network.

[0049] The filtering unit 1, the information providing unit 3, the interface management unit 4, and the visualization interface management unit 5 described above are each realized by a program that implements the corresponding function on a computer, together with the CPU, memory, and other hardware that execute these programs.

[0050] <Operation> In Embodiment 1, the information filtering device is assumed to be connected to an information source 10 that delivers documents. In this embodiment, suppose the user 20 wants to filter, from among computer-related books, magazine articles related to "multimedia, intranet, Internet". Once the user 20 registers a profile with the information filtering device, the device delivers to the user 20 each magazine article related to "multimedia, intranet, Internet" whenever one arrives from the information source 10.

[0051] Suppose now that the user 20 feels that the filtered data contains articles slightly different from his or her interests and wants to change the profile personally. The operation in this case is described below with reference to FIG. 1.

[0052] [Step S1] The information filtering device is started by a start command from the user 20.

[0053] [Step S2] The information filtering apparatus waits for input from the user 20. The user 20 enters the words "multimedia, intranet, Internet" as the profile.

[0054] [Step S3] The interface management unit 4 sends the data that the user 20 entered in step S2 to the filtering unit 1.

[0055] [Step S4] The filtering unit 1 decomposes the profile of the user 20 into base words, calculates the importance of each base word, and stores the result in the profile storage unit 2.

[0056] [Step S5] Based on the profile that the filtering unit 1 stored in the profile storage unit 2 in step S4, the information providing unit 3 collects documents matching this profile from the information source 10 and sends the collected documents to the filtering unit 1.

[0057] [Step S6] Each time a new book or magazine arrives at the digital library, the filtering unit 1 filters the documents collected by the information providing unit 3. It then sends the filtering result to the interface management unit 4, which displays the result to the user 20.

[0058] [Step S7] The user 20 evaluates the filtering result. For example, the user indicates whether or not each document is of interest.

[0059] [Step S8] The filtering unit 1 modifies the profile of the user 20 using the relevance feedback equation (6) described above and stores the modified profile in the profile storage unit 2.

[0060] FIG. 5 is an explanatory diagram of the profile in this case. As shown, the profile consists of pairs of a base word and its importance.

[0061] [Step S9] The user 20 indicates whether to modify the profile.

[0062] [Step S10] The user 20 indicates whether to end the filtering.

[0063] [Step S11] If the user 20 chose to modify the profile in step S9, the visualization interface management unit 5 visually displays the profile held in the profile storage unit 2 to the user 20. Here, the base words are displayed in two dimensions. That is, to display the base words in the profile, the dimension of the feature vector is set to 2 and U₁ of equation (8) is used.
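As a rough illustration of how two-dimensional coordinates for base words might be derived, the sketch below assumes the usual singular value decomposition A = UΣVᵀ of a term-by-document matrix and takes the first two columns of U as the x and y coordinates. Equations (7) and (8) are not reproduced in this section, so this reading of U₁ is an assumption; the matrix values, the base words, and the function name `leading_left_singular_vectors` are all hypothetical. Power iteration with deflation is used only to keep the example free of external libraries.

```python
import math

def leading_left_singular_vectors(matrix, k=2, iterations=200):
    """Return the k leading left singular vectors of `matrix` (a list of rows).

    Works on the Gram matrix A * A^T with power iteration and deflation.
    The fixed start vector keeps the sketch deterministic; it would fail if
    that vector happened to be orthogonal to the vector being sought.
    """
    m = len(matrix)
    gram = [[sum(x * y for x, y in zip(matrix[i], matrix[j])) for j in range(m)]
            for i in range(m)]
    vectors = []
    for _ in range(k):
        v = [1.0 / math.sqrt(m)] * m
        for _step in range(iterations):
            w = [sum(gram[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        gv = [sum(gram[i][j] * v[j] for j in range(m)) for i in range(m)]
        eigenvalue = sum(a * b for a, b in zip(v, gv))
        vectors.append(v)
        # Deflate: subtract the component just found so the next pass finds the next vector.
        gram = [[gram[i][j] - eigenvalue * v[i] * v[j] for j in range(m)]
                for i in range(m)]
    return vectors

# Hypothetical term-by-document counts: rows are base words, columns are documents.
base_words = ["internet", "intranet", "music", "song"]
term_document = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
u1, u2 = leading_left_singular_vectors(term_document)
coordinates = {word: (u1[i], u2[i]) for i, word in enumerate(base_words)}
```

In this toy data, "internet" and "intranet" occur in the same documents and therefore land at nearly the same point, while "music" lands far from both, which is the property the display relies on.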

[0064] FIG. 6 is an explanatory diagram of the display produced by the visualization interface management unit 5. As shown, each base word in the profile is displayed in two dimensions. The horizontal axis in the figure shows the value of the first dimension of the vector, and the vertical axis shows the value of the second dimension.

[0065] [Step S12] The user 20 deletes, from the base words displayed in two dimensions as in FIG. 6, those that do not match the user's interests. Here, the user 20 is assumed to have deleted the base words "music" and "server". This is done through the visualization interface management unit 5.

[0066] FIG. 7 is an explanatory diagram of the profile after deletion. As shown, "music" and "server", which the user 20 chose to delete, have been removed from the base words.

[0067] [Step S13] The visualization interface management unit 5 sends the profile modified in step S12 to the profile storage unit 2, which overwrites the stored profile with the new profile of the user 20.
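Steps S12 and S13 can be pictured with a minimal sketch in which the profile is a mapping from base word to importance, matching the pairs shown in FIG. 5. The function name `delete_base_words` and all importance values other than 0.543 for "intranet" are hypothetical.

```python
def delete_base_words(profile, unwanted):
    """Return a new profile without the base words the user chose to delete."""
    return {word: weight for word, weight in profile.items() if word not in unwanted}

# Profile as (base word, importance) pairs; most values here are made up for illustration.
stored_profile = {"intranet": 0.543, "internet": 0.4, "music": 0.05, "server": 0.04}

# Step S12: the user deletes "music" and "server" on the two-dimensional display.
edited_profile = delete_base_words(stored_profile, {"music", "server"})

# Step S13: the profile storage overwrites the old profile with the edited one.
stored_profile = edited_profile
```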

[0068] [Step S14] The filtering unit 1 filters newly arriving information resources using the new profile stored in the profile storage unit 2 and sends the result to the interface management unit 4. The interface management unit 4 displays the filtering result to the user 20.

[0069] [Step S15] If the user 20 chose in step S10 to end the filtering, the filtering apparatus ends the filtering.

[0070] <Effects> As described above, in Specific Example 1, the base words in the profile are displayed visually in two dimensions when the user modifies the profile. The user can therefore delete unnecessary base words efficiently, with the effect that only information resources closer to the user's interests are filtered.

[0071] <<Specific Example 2>> In Specific Example 2, when the profile contains too many base words to display fully in two dimensions, neighboring base words are clustered, and a representative is chosen for each cluster and displayed.

[0072] <Configuration> FIG. 8 is a configuration diagram of Specific Example 2. The illustrated apparatus comprises a filtering unit 1, a profile storage unit 2, an information providing unit 3, an interface management unit 4, a visualization interface management unit 5, and a clustering unit 6. The filtering unit 1 through the interface management unit 4 have the same configuration as in Specific Example 1, so their description is omitted here.

[0073] The clustering unit 6 has a function of outputting display data that groups the visually displayed base words within a given two-dimensional range and that shows, for each cluster, its representative word and its importance.
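One way to picture the grouping performed by the clustering unit is the greedy, radius-based sketch below. The text does not specify a clustering algorithm or how the representative is chosen, so both are assumptions here: words are visited in descending order of importance, and the first word of each cluster serves as its representative. The coordinates, the radius, and the importance of "music" are hypothetical.

```python
import math

def cluster_base_words(points, importance, radius):
    """Greedily group base words lying within `radius` of a cluster representative.

    Words are visited from most to least important, so each cluster's
    representative is its most important member (an assumption of this sketch).
    """
    clusters = []
    for word in sorted(points, key=lambda w: importance[w], reverse=True):
        for cluster in clusters:
            if math.dist(points[word], points[cluster["representative"]]) <= radius:
                cluster["members"].append(word)
                break
        else:
            # No existing representative is close enough: start a new cluster.
            clusters.append({"representative": word, "members": [word]})
    return clusters

# Hypothetical two-dimensional positions of base words.
points = {
    "intranet": (0.70, 0.10),
    "installed": (0.72, 0.12),
    "construction": (0.68, 0.08),
    "software": (0.71, 0.05),
    "music": (0.05, 0.80),
}
importance = {"intranet": 0.543, "installed": 0.037, "construction": 0.033,
              "software": 0.029, "music": 0.05}
clusters = cluster_base_words(points, importance, radius=0.1)
```

With these values the four words near "intranet" fall into one cluster and "music" forms its own, so only two labels would need to be drawn instead of five.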

[0074] The visualization interface management unit 5 has a function of visually displaying the display data output from the clustering unit 6 and of modifying the profile in the profile storage unit 2 based on base word change instructions that the user 20 enters in response to this display.

[0075] <Operation> In Specific Example 2 as well, the information filtering apparatus is assumed to be connected to the information source 10, which delivers documents. In this example, the user 20 wants to filter, from among computer-related books, magazine articles related to "multimedia, intranet, Internet". Once the user 20 registers a profile with the information filtering apparatus, the apparatus delivers a magazine article to the user 20 each time an article related to "multimedia, intranet, Internet" comes in from the information source 10.

[0076] Suppose that the user 20 finds that the filtered data includes articles that differ slightly from the user's interests, and that the user 20 wants to modify the profile personally. The subsequent processing is described on the assumption that, at step S9 of FIG. 1, the user wants to modify the profile.

[0077] FIG. 9 is a flowchart showing the operation of Specific Example 2. [Step S21] If the user 20 wants to modify the profile, the visualization interface management unit 5 clusters the profile held in the profile storage unit 2 and displays it visually to the user 20.

[0078] FIG. 10 is an explanatory diagram of the visualized display. Here, the base words are displayed in two dimensions and an importance is shown for each cluster. To display the base words in the profile, the dimension of the feature vector is set to 2 and U of equation (7) is used. "Music", "server", "Internet", "intranet", "multi", and so on are the representatives of their respective clusters. The importance of each cluster is displayed next to its representative.

[0079] The importance shown for a cluster representative is the sum of the importances of the base words in that cluster. For example, for the cluster whose representative is "intranet", the displayed importance is 0.642, the sum of the importances of (intranet, 0.543), (installed, 0.037), (construction, 0.033), and (software, 0.029).
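The arithmetic in this example can be checked with a few lines of Python. The dictionary layout and variable names are illustrative; the four importance values are those given in the text.

```python
# Base words in the cluster represented by "intranet", with the importances from the text.
intranet_cluster = {
    "intranet": 0.543,
    "installed": 0.037,
    "construction": 0.033,
    "software": 0.029,
}

# Cluster importance is the sum of its members' importances; rounding hides float noise.
cluster_importance = round(sum(intranet_cluster.values()), 3)
print(cluster_importance)  # 0.642
```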

[0080] [Step S22] The user 20 deletes, from the base words displayed in two dimensions in FIG. 10, those that do not match the user's interests. Here, the user 20 is assumed to have deleted the base words "music" and "server". This is done through the visualization interface management unit 5.

[0081] [Step S23] The visualization interface management unit 5 sends the profile modified in step S22 to the profile storage unit 2. The profile storage unit 2 then overwrites the stored profile with the new profile of the user 20.

[0082] [Step S24] The filtering unit 1 filters newly arriving information resources using the new profile and sends the result to the interface management unit 4. The interface management unit 4 displays the filtering result to the user 20.

[0083] <Effects> As described above, in Specific Example 2, when the user modifies the profile, the base words in the profile are displayed visually in two dimensions and similar base words are clustered for display. In addition to the effects of Specific Example 1, this yields the following effect. When relevance feedback increases the number of base words in the profile, simply displaying them in two dimensions makes the relations between base words hard to see because there are so many of them; clustering eliminates this problem. The display becomes easier for the user to read, and the user can modify the profile efficiently.

[0084] <<Specific Example 3>> In Specific Example 3, the base words of the profile and the filtered documents are displayed together at visualization time.

[0085] <Configuration> FIG. 11 is a configuration diagram of Specific Example 3. The illustrated apparatus comprises a filtering unit 1, a profile storage unit 2, an information providing unit 3, an interface management unit 4, a visualization interface management unit 5, and a document visualization information generation unit 7. The filtering unit 1 through the interface management unit 4 have the same configuration as in Specific Examples 1 and 2, so their description is omitted here.

[0086] The document visualization information generation unit 7 has a function of passing information for visualizing the filtered documents in two dimensions to the visualization interface management unit 5. Specifically, for the items of information in the result of filtering with the profile by the filtering unit 1, it outputs to the visualization interface management unit 5 visual positional data in which the stronger the relation between items of information, the shorter the distance between them, and the stronger the relation between an item of information and a base word, the shorter the distance between them.

[0087] The visualization interface management unit 5 has a function of visually displaying the display data output from the document visualization information generation unit 7 and of modifying the profile in the profile storage unit 2 based on base word change instructions that the user 20 enters in response to this display.

[0088] <Operation> In Specific Example 3 as well, the information filtering apparatus is assumed to be connected to the information source 10, which delivers documents. In this example, the user 20 wants to filter, from among computer-related books, magazine articles related to "multimedia, intranet, Internet". Once the user 20 registers a profile with the information filtering apparatus, the apparatus delivers a magazine article to the user 20 each time an article related to "multimedia, intranet, Internet" comes in from the information source 10.

[0089] Suppose that the user 20 finds that the filtered data includes articles that differ slightly from the user's interests, and that the user 20 wants to modify the profile personally. The subsequent processing is described on the assumption that, at step S9 of FIG. 1, the user wants to modify the profile.

[0090] FIG. 12 is a flowchart showing the operation of Specific Example 3. [Step S31] If the user 20 wants to modify the profile, the visualization interface management unit 5 visualizes the profile held in the profile storage unit 2 together with the document information generated by the document visualization information generation unit 7, and displays them to the user 20.

[0091] FIG. 13 is an explanatory diagram of the visualized display. Here, the base words in the profile are displayed using U of equation (7) with the dimension of the feature vector set to 2. The filtered documents are displayed using V^T of equation (7), likewise with the dimension of the feature vector set to 2.

[0092] In FIG. 13, D1, D2, D3, D4, and D39 indicate the two-dimensional positions of the filtered documents. In the figure, the shorter the distance between a document and a base word or another document, the more similar the two are. A circle (○) marks a document that the user 20 indicated as interesting in step S7 of FIG. 1 in Specific Example 1, and a cross (×) marks a document that the user 20 indicated as not interesting in step S7. That is, D1, D2, D4, and D39 are documents that the user 20 marked as interesting, and D3 is a document that the user 20 marked as not interesting.

[0093] [Step S32] The user 20 deletes, from the base words displayed in two dimensions in FIG. 13, those that do not match the user's interests. Here, because document D3, which the user 20 marked as not interesting, lies close to the base word "music" in FIG. 13, the user 20 is assumed to have deleted the base word "music". This is done through the visualization interface management unit 5.
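The reasoning in step S32 amounts to asking which base word lies closest to a document the user disliked. In the text the user makes this judgment by eye; the sketch below only illustrates the geometry. The coordinates and the helper name `nearest_base_word` are hypothetical.

```python
import math

def nearest_base_word(document_xy, word_positions):
    """Return the base word whose displayed position is closest to the document."""
    return min(word_positions, key=lambda w: math.dist(document_xy, word_positions[w]))

# Hypothetical positions on the two-dimensional display.
word_positions = {
    "internet": (0.60, 0.20),
    "intranet": (0.70, 0.10),
    "music": (0.10, 0.80),
}
d3_position = (0.15, 0.75)  # document D3, marked as not interesting

candidate = nearest_base_word(d3_position, word_positions)
```

Here `candidate` is "music", the base word the user would consider deleting.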

[0094] [Step S33] The visualization interface management unit 5 sends the profile modified in step S32 to the profile storage unit 2. The profile storage unit 2 then overwrites the stored profile with the new profile of the user 20.

[0095] [Step S34] The filtering unit 1 filters newly arriving information resources using the new profile stored in the profile storage unit 2 and sends the result to the interface management unit 4. The interface management unit 4 displays the filtering result to the user 20.

[0096] <Effects> As described above, in Specific Example 3, when the user modifies the profile, the base words in the profile are displayed visually in two dimensions and the filtered documents are displayed in the same two dimensions. In addition to the effects of Specific Example 1, this yields the following effect. The user can easily grasp the relation between the documents the user has evaluated and the base words, and can therefore modify the profile efficiently.

[0097] <<Modes of Use>> The present invention is not limited to the specific examples above and is also applicable to the following variations.

[0098] ・The invention is also applicable when an ordinary information retrieval system is connected in place of the information providing unit 3.

[0099] ・The invention is also applicable when, in place of the information providing unit 3, the apparatus is connected to a system that receives articles such as text and images from multiple information sources (for example, newspaper companies, publishers, and news agencies).

[0100] ・Each specific example uses the vector space model as the information filtering technique, but a probabilistic model used in information retrieval may be used instead.

[0101] ・Each specific example uses relevance feedback as the technique for modifying the user's profile, but a machine learning technique may be used instead.

[0102] ・Each specific example has a single information source 10, but the invention is also applicable with multiple information sources.

[0103] ・Specific Example 2 calculates the importance of a cluster as the sum of the importances of the base words in the cluster, but the average of the base word importances, or the highest base word importance within each cluster, may be displayed instead.
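The three ways of deriving a cluster's importance mentioned in the last variation can be compared side by side. This is a minimal sketch; the `AGGREGATORS` table and the `mode` parameter are hypothetical names, and the weights are the four importances from the "intranet" cluster in Specific Example 2.

```python
from statistics import mean

# Selectable aggregation rules: sum (as in Specific Example 2), average, or maximum.
AGGREGATORS = {"sum": sum, "mean": mean, "max": max}

def cluster_importance(weights, mode="sum"):
    """Combine the importances of a cluster's base words with the chosen rule."""
    return AGGREGATORS[mode](weights)

weights = [0.543, 0.037, 0.033, 0.029]
total = cluster_importance(weights)            # about 0.642
average = cluster_importance(weights, "mean")  # about 0.1605
highest = cluster_importance(weights, "max")   # 0.543
```

The sum grows with the number of base words in a cluster, whereas the average and the maximum do not, which may matter when clusters differ widely in size.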

[Brief description of the drawings]

FIG. 1 is a flowchart showing the information filtering method in Specific Example 1 of the present invention.

FIG. 2 is a configuration diagram of the information filtering apparatus in Specific Example 1 of the present invention.

FIG. 3 is an explanatory diagram of the expressions of the vector space model.

FIG. 4 is an explanatory diagram of the singular value analysis computation.

FIG. 5 is an explanatory diagram of a profile displayed by the information filtering apparatus of Specific Example 1 of the present invention.

FIG. 6 is an explanatory diagram showing a display result of the information filtering apparatus of Specific Example 1 of the present invention.

FIG. 7 is an explanatory diagram of the profile after base words are deleted in the information filtering apparatus of Specific Example 1 of the present invention.

FIG. 8 is a configuration diagram of the information filtering apparatus of Specific Example 2 of the present invention.

FIG. 9 is a flowchart showing the operation of the information filtering apparatus in Specific Example 2 of the present invention.

FIG. 10 is an explanatory diagram showing a display result of the information filtering apparatus in Specific Example 2 of the present invention.

FIG. 11 is an explanatory diagram of the information filtering apparatus of Specific Example 3 of the present invention.

FIG. 12 is a flowchart showing the operation of the information filtering apparatus of Specific Example 3 of the present invention.

FIG. 13 is an explanatory diagram showing a display result of the information filtering apparatus in Specific Example 3 of the present invention.

[Explanation of symbols]

1 Filtering unit
2 Profile storage unit
4 Interface management unit
5 Visualization interface management unit
6 Clustering unit
7 Document visualization information generation unit

Claims

1. An information filtering method comprising: first filtering an information source based on base words of a profile input by a user and displaying a result of the filtering; and then, when the base words of the profile are changed according to a score that the user assigns based on the displayed filtering result and display is performed again, visually displaying each base word such that the stronger the relation between base words of the profile at the time of filtering, the shorter the distance between those base words.

2. The information filtering method according to claim 1, wherein, when each base word is displayed, base words whose mutual relation falls within a predetermined range are clustered, and, for each cluster, an importance indicating the weight of the cluster in the filtering result and a base word representing the cluster are displayed.

3. The information filtering method according to claim 1, wherein, when each base word of the profile is displayed, the items of information resulting from filtering with the profile are also visually displayed such that the stronger the relation between items of information, the shorter the distance between them, and such that the stronger the relation between an item of information and a base word, the shorter the distance between them.

4. An information filtering apparatus comprising: a filtering unit that filters an information source based on base words of a profile input by a user, outputs a result, and, when evaluation information is input by the user, changes the profile based on the evaluation information; an interface management unit that accepts instructions from the user and presents the result from the filtering unit to the user; and a visualization interface management unit that, when the filtering unit has changed the profile, visually displays each base word such that the stronger the relation between base words of the profile at the time of filtering, the shorter the distance between those base words.

5. The information filtering apparatus according to claim 4, comprising: a clustering unit that clusters base words whose mutual relation falls within a predetermined range and outputs, for each cluster, an importance indicating the weight of the cluster in the filtering result and a base word representing the cluster; and a visualization interface management unit that visually displays the cluster importance and the representative base word output from the clustering unit.

6. The information filtering apparatus according to claim 4, comprising: a document visualization information generation unit that, for the items of information in the result of filtering with the profile by the filtering unit, outputs visual positional data in which the stronger the relation between items of information, the shorter the distance between them, and the stronger the relation between an item of information and a base word, the shorter the distance between them; and a visualization interface management unit that visually displays the output of the document visualization information generation unit.