JPH10326281A

JPH10326281A - Method and device for information retrieval and management

Info

Publication number: JPH10326281A
Application number: JP9133850A
Authority: JP
Inventors: Yoshitaka Kuwata; 喜隆桑田
Original assignee: N T T DATA KK; NTT Data Corp
Current assignee: N T T DATA KK; NTT Data Group Corp
Priority date: 1997-05-23
Filing date: 1997-05-23
Publication date: 1998-12-08

Abstract

PROBLEM TO BE SOLVED: To improve retrieval efficiency and enable high-quality retrieval by retrieving, when an example is given, similar categories from among the categories that directory services provide. SOLUTION: An information storage unit 13 stores collected information in an index information database 13a. When an index example is supplied by a user, an information search unit 15 performs a search process: it calculates the similarity between the index example and the category information stored in the index information database 13a, and presents the provider 17 of a similar directory service together with the matching category. New information can therefore be found by category, which improves efficiency, and because the collected information is already categorized, it can easily be organized.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an information search/management method, and an information search/management apparatus, suited to directory services on the Internet and in database systems.

[0002]

[Prior Art] As shown in FIG. 13, an information search device 101 used in a conventional search service (retrieval service) records and accumulates the target information collected by its information collection unit 111 in a content information database 113a via an information storage unit 113. In parallel, an index assigning unit 117 assigns indexes using natural language analysis techniques and records and accumulates them in an index information database 117a.

[0003] At search time, an information search unit 115 searches the index information recorded and accumulated in the index information database 117a using a keyword or a combination of keywords, and finds the corresponding information among the information recorded and accumulated in the content information database 113a. Because such a search service performs pattern matching on the surface form of the text, it has the following problems.
(1) Information that has the same meaning but is expressed differently is not retrieved.
(2) Entirely unrelated information is retrieved whenever its surface form happens to match.

[0004] Meanwhile, volunteers, particular companies, organizations, and the like have begun to organize and classify information in specific fields and to provide it as index services (directory services). Because the classification work in these services is done by hand, the service content is in many cases very fine-grained and of high quality.

[0005] However, these services too have the following problems.
(1) Because the work is manual, the fields covered are limited, so it is difficult for a user to find an organization that provides a directory service useful to that user.

[0006] (2) Because each directory service organizes information from a different viewpoint, the differences between the categories of different directory services are hard to grasp, and proficiency has been required to make full use of different directory services.

[0007]

[Problems to Be Solved by the Invention] As described above, finding a directory service of interest has conventionally required trial and error. It is possible to find related directory services by specifying related keywords in a conventional search service. However, because of the drawbacks of search services described above, the quality of the output is low, and finding the desired information often takes time and effort.

[0008] The present invention has been made in view of the above problems. Its object is to search efficiently for information on the Internet, which has been spreading explosively in recent years, and on intranets (in-house networks), and to manage the retrieved information. In particular, the object is to provide an information search/management method and an information search/management apparatus that, by offering a way to find index services in addition to conventional search services, raise search efficiency and enable high-quality search.

[0009]

[Means for Solving the Problems] To achieve the above object, the invention according to claim 1 is an information search/management method for presenting a predetermined directory service provider and category, the gist of which is that, given an example, a similar category is retrieved from among the categories provided by directory services.

[0010] Thus, in the invention according to claim 1, performing an index service search in addition to the conventional search service allows information to be searched more efficiently.

[0011] The invention according to claim 2 is an information search/management method for presenting the optimal directory service provider and category when directory services are provided by each of a plurality of providers. Its gist is that it comprises: a step of collecting category information from each of the plurality of providers in advance and recording and accumulating it; and a step of, when an example is given, calculating the similarity between the example and the accumulated category information and presenting the optimal directory service provider and category based on this similarity.

[0012] The invention according to claim 3 is an information search/management apparatus for presenting a predetermined directory service provider and category when directory services are provided by each of a plurality of providers. Its gist is that it comprises: information collecting means for collecting category information from the plurality of providers; information storing means for recording and accumulating the category information collected by the information collecting means; and information search means for, when an example is given, calculating the similarity between the example and the category information accumulated in the information storing means and presenting a predetermined directory service provider and category based on this similarity.

[0013] This makes it possible to find a directory service close to the information the user has personally collected and organized, and to find similar information with high quality.

[0014] Preferably, the information storing means according to claim 3 comprises a category record table and an information entry record table.

[0015] Preferably, the similarity according to claim 3 is an index expressed by an inclusion rate and a content rate. That is, showing the similarity (inclusion rate, content rate) between a given index example and each category makes the differences between categories clear.

[0016]

[Embodiments of the Invention] Embodiments of the present invention are described below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of one embodiment of an information search device 1 to which the information search/management method according to the present invention is applied. As shown in FIG. 1, the information search device 1 comprises an information collection unit 11, an information storage unit 13, an index information database 13a, and an information search unit 15.

[0017] A provider 17 that offers a directory service provides the information search device 1 with categories, which represent concepts, and information entries, which represent the locations of actual information. A category can contain categories covering its subordinate concepts as well as information entries. A directory service as a whole therefore forms a tree structure representing a hierarchy of concepts. A user of a directory service can find the location of the desired information by following the hierarchy of the categories of interest.

[0018] The information collection unit 11 collects category information from a plurality of providers that respectively provide directory services A, B, ..., N. By tracing downward in order from the top-level concept (root category) of each directory service, the categories and information entries provided by the directory service, together with their hierarchical relationships, can be obtained efficiently.

[0019] The information storage unit 13 applies the collection process described later to the collected information and then stores it in the category record table (see FIG. 5) and the information entry record table (see FIG. 6) of the index information database 13a. These collection and accumulation operations are executed in advance, before any information search request occurs.

[0020] When an index example is given, the information search unit 15 performs the search process described later and presents similar directory service providers and categories. That is, when the user actually supplies an index example, the information search unit 15 calculates the similarity (inclusion rate, content rate) between the index example, taken as the example, and the category information accumulated in the index information database 13a of the information storage unit 13, and presents the providers and categories of similar directory services. Based on this information, the user can then go on to check the directory services themselves.

[0021] Next, the calculation of the similarity between an example and a category in the information search/management method of this embodiment is described. The similarity between the example presented by the user and the category information collected from the directory service providers is calculated using an index expressed by an inclusion rate and a content rate. Let K be the set of information entries contained in the category given as the example, and let N be the set of information entries contained in a given collected category. The inclusion rate and the content rate are then defined as follows.
Inclusion rate: what percentage of K is contained in N.
Content rate: what percentage of N is accounted for by the part in common with K.
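The two definitions above can be read as set ratios: inclusion rate = |K ∩ N| / |K| and content rate = |K ∩ N| / |N|, each expressed as a percentage. The following is a minimal sketch under that reading; the function name `similarity` and the choice to return 0 for an empty set are assumptions made for illustration and do not appear in the specification.

```python
def similarity(example, category):
    """Return (inclusion_rate, content_rate) as percentages.

    example  -- K, the information entries in the user's example
    category -- N, the information entries in one collected category
    Returning 0.0 for an empty set is an assumption of this sketch.
    """
    k, n = set(example), set(category)
    common = k & n
    inclusion = 100.0 * len(common) / len(k) if k else 0.0
    content = 100.0 * len(common) / len(n) if n else 0.0
    return inclusion, content


# Example: every entry of K is in N, but N holds one extra entry.
print(similarity({"C1", "C2", "C4"}, {"C1", "C2", "C4", "C7"}))  # (100.0, 75.0)
```

A category identical to the example yields 100% for both rates, while a category that is a strict superset of the example yields an inclusion rate of 100% and a content rate below 100%.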

[0022] FIG. 2 shows the relationship between the inclusion rate and content rate and the categories, that is, the example set (K) and the target set (N). As FIG. 2 makes clear, the ideal case is the one in which a category exactly identical to the example is found (case C2). In case C2, both the inclusion rate and the content rate are naturally 100%. In practice, however, the purpose is to learn of information entries that are not in the example. A category with an inclusion rate of 100% and a content rate below 100% (case C1), or one whose inclusion rate falls short of 100% and whose content rate is below 100% (case C4), is therefore also acceptable, and indeed desirable.

[0023] Note that a category whose content rate is too small is often a very large category. For the same inclusion rate, therefore, a content rate that is as large as possible is preferable.

[0024] Furthermore, under this definition, even when a category has subcategories and is nested, the calculation ignores the subcategories and uses only the number of information entries. Besides the above calculation, a method that includes the number of subcategories contained in a category is also conceivable. In that case, two categories holding the same information entries yield different index values depending on whether or not they have subcategories.

[0025] Indexes other than the above can also be defined and used. For example, an index can be created from information such as the percentage of the provider's information that the category accounts for, or the depth of the target category, and used in the search.

[0026] Next, the information collection algorithm is described with reference to FIGS. 3 and 4. FIG. 3 shows the information collection algorithm (collection process), and FIG. 4 shows the information collection algorithm (category collection process).

[0027] Referring to FIG. 3, when information is collected, the category record table (see FIG. 5) and the information entry record table (see FIG. 6) are first initialized in step S11. Next, a provider list of the target providers is created, and category collection (step S17) is performed for each category of each provider, over all providers (steps S13, S15, S17).

[0028] Next, the category collection process is described with reference to FIG. 4. First, in step S21, the given category is recorded in the category record table, and it is then checked whether the given category is an information entry. If it is not an information entry, the process proceeds to step S25, where the category collection process is performed on all subcategories of the given category. If it is an information entry, the process proceeds to step S27, where the given information is recorded in the information entry record table. In this way, based on the provider list given in advance, the categories are traversed recursively down to where the information entries are stored, and everything is recorded together with the provider name and the relationships to the upper and lower levels.
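The collection flow of FIGS. 3 and 4 can be sketched as a recursive traversal. The sketch below is illustrative only: the `Node` class, the dictionary row layout, and the function names are hypothetical stand-ins for whatever a real provider interface and storage format would be, and the provider's root is simply recorded as a row of kind "provider".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory view of one provider's directory tree."""
    name: str
    is_entry: bool = False
    children: list = field(default_factory=list)


def collect_category(node, parent_no, category_table, entry_table):
    # S21: record the given category (or entry) in the category record table.
    number = len(category_table) + 1
    if node.is_entry:
        kind = "entry"
    else:
        kind = "provider" if parent_no is None else "category"
    category_table.append({"no": number, "kind": kind, "name": node.name,
                           "parent": parent_no, "children": []})
    if parent_no is not None:
        category_table[parent_no - 1]["children"].append(number)
    if node.is_entry:
        # S27: record the information entry, keyed by its name.
        entry_table.setdefault(node.name, []).append(number)
    else:
        # S25: apply the same process to every subcategory.
        for child in node.children:
            collect_category(child, number, category_table, entry_table)
    return number


def collect_all(providers):
    category_table, entry_table = [], {}  # S11: initialize both tables
    for root in providers:                # S13 to S17: visit every provider
        collect_category(root, None, category_table, entry_table)
    return category_table, entry_table


ds1 = Node("DS1", children=[Node("DS1-C1", children=[Node("u1", True),
                                                     Node("u2", True)])])
categories, entries = collect_all([ds1])
print(entries)  # {'u1': [3], 'u2': [4]}
```

Storing row numbers rather than object references keeps the upper and lower relationships in the same form in which the category record table describes them.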

[0029] Next, the structure of the information storage tables that hold the collected information is described with reference to FIGS. 5 and 6. FIG. 5 shows an example of the category record table, and FIG. 6 shows an example of the information entry record table. The category record table is provided for storing category information. The information entry record table is provided so that an entry identical to an input information entry can be found quickly; for this purpose the information entries are stored hashed.

[0030] Each entry of the category record table in FIG. 5 consists of a "number field" giving the serial number of the entry, a "type field" giving the kind of category, a "category/provider/information entry name field" giving the name of the category, provider, or information entry, an "upper category field" giving the category above it, and a "lower category field" giving the categories below it. The first entry in FIG. 5 holds the information for a provider (DS1). Its type is provider and its name is DS1; because it is the top-level category it has no upper category, and it has the second and fourth entries as lower categories. The second entry is a subcategory (DS1-C1) of the provider shown in the first entry; its upper category is the first entry, and it has the third, fourth, and fifth entries as further lower categories. The remaining entries store similar information.

FIG. 6 shows an example of the information entry record table. The table holds information names and category record table entry numbers in the form of a hash table. Because the hash table is keyed by information name, it allows the entry number in the category record table to be looked up from an information name. In an actual lookup, a hash function is applied to the information name to compute the entry number in the information entry record table, and the storage location in the category record table is then obtained from that entry.
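The two tables and the name-based lookup might be modelled as follows. This is only a sketch: the rows are hypothetical sample data rather than the contents of FIG. 5, the URLs are placeholders, and Python's built-in `dict` (itself a hash table) stands in for the hashed information entry record table.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One row of a category record table (field names are illustrative)."""
    number: int
    kind: str                # "provider", "category" or "entry"
    name: str
    parent: Optional[int]    # number of the upper category, if any
    children: tuple          # numbers of the lower categories


# Hypothetical sample rows, not the data shown in FIG. 5.
category_table = [
    Row(1, "provider", "DS1", None, (2,)),
    Row(2, "category", "DS1-C1", 1, (3, 4)),
    Row(3, "entry", "http://example.org/a", 2, ()),
    Row(4, "entry", "http://example.org/b", 2, ()),
]

# Information entry record table: information name -> row numbers.
entry_table = {}
for row in category_table:
    if row.kind == "entry":
        entry_table.setdefault(row.name, []).append(row.number)


def containing_categories(info_name):
    """Names of the categories that directly hold the named entry."""
    rows = [category_table[no - 1] for no in entry_table.get(info_name, [])]
    return [category_table[r.parent - 1].name for r in rows]


print(containing_categories("http://example.org/a"))  # ['DS1-C1']
```

Mapping a name to a list of row numbers allows the same information entry to appear under several categories or providers, which is the situation the similarity search relies on.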

[0031] Next, the information search algorithm is described with reference to FIGS. 7 and 8. FIG. 7 shows the similar category search algorithm, and FIG. 8 shows the optimal similar category search algorithm.

[0032] Referring to FIG. 7, in the similar category search process, the given information examples are first checked (marked) in the information entry record table in step S31. Next, in step S33, the next already-checked information entry after the current one is taken out. In step S35, if the information entries are exhausted, the search process ends; otherwise the process proceeds to step S37 and takes out the next upper category. If, in step S37, there is no such upper category or it has already been calculated, the process returns to step S33. Otherwise the process proceeds to step S41, performs the optimal similar category search process on that upper category, and then returns to step S33.

[0033] Referring to FIG. 8, in the optimal similar category search process, the inclusion rate and content rate of the given category are first calculated and recorded in a table in step S51. In step S53, if the category is the root category, the search process ends; otherwise the process proceeds to step S55 and checks whether the inclusion rate is 100%. If it is 100%, the search process ends; otherwise the process proceeds to step S57 and compares the inclusion rate of the upper category with that of the given category. If the inclusion rate of the upper category is greater than that of the given category, the search process ends; otherwise the process proceeds to step S59 and performs the optimal similar category search on the upper category.
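The search of FIGS. 7 and 8 can be sketched as an upward walk from every category that directly holds an example entry. The sketch below makes several assumptions that should be read as such: the tree is given as plain dictionaries (`parent`, `children`, `entries`), a category's entry set is taken to be all entries in its subtree, and the walk stops on reaching 100% inclusion, on passing the root, on reaching a category already calculated, or when the inclusion rate no longer increases, which is one interpretation of the termination conditions.

```python
def rates(example, entries):
    """Inclusion and content rate (percent) of example K against entries N."""
    common = example & entries
    return (100.0 * len(common) / len(example),
            100.0 * len(common) / len(entries))


def subtree_entries(cat, children, entries):
    """All information entries at or below a category (assumed definition)."""
    found = set(entries.get(cat, ()))
    for child in children.get(cat, ()):
        found |= subtree_entries(child, children, entries)
    return found


def search_similar(example, parent, children, entries):
    example = set(example)
    results = {}
    if not example:
        return results
    # Start from each category that directly holds a checked example entry.
    starts = [c for c, es in entries.items() if example & set(es)]
    for cat in starts:
        previous = -1.0
        # Skip categories already calculated; stop after the root.
        while cat is not None and cat not in results:
            inc, con = rates(example, subtree_entries(cat, children, entries))
            if inc <= previous:
                break                      # inclusion no longer increases
            results[cat] = (inc, con)      # record both rates
            if inc == 100.0:
                break                      # full inclusion reached
            previous = inc
            cat = parent.get(cat)          # move to the upper category
    return results


parent = {"A1": "A", "A2": "A", "A": "DS1", "B": "DS1"}
children = {"DS1": ["A", "B"], "A": ["A1", "A2"]}
entries = {"A1": {"C1", "C2"}, "A2": {"C4", "C5"}, "B": {"C9"}}
found = search_similar({"C1", "C2", "C4"}, parent, children, entries)
print(found["A"])  # (100.0, 75.0)
```

In this hypothetical tree the walk stops at category A, which already covers the whole example, so the provider root DS1 is never evaluated.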

[0034] As described above, in this embodiment the input example information entries are looked up in the information entry table, the categories containing the corresponding information entries are found in the category record table, and the similarity index expressed by the inclusion rate and content rate is calculated while the category record table is traversed in the upward direction.

[0035] The example here searches for categories in fictitious directory services. In practice, on the Internet, the information entries are URLs (Uniform Resource Locators), and the directory service providers are, for example, hot lists published on personal home pages or collections of links on hobby topics.

[0036] FIG. 9 shows the stored results of collecting category information from the service providers (DS1, DS2, DS3). From these, categories containing the example (C1, C2, C4) are searched for.

[0037] First, the entries containing the example are checked in the information entry record table. In FIG. 10 the corresponding entries are shaded. Next, as shown in FIG. 11, attention is paid to the sub-subcategories containing the checked entries, and the similarity index is calculated for them. The calculation of the similarity index is then repeated while moving upward, in the order sub-subcategory, subcategory, category, until the inclusion rate stops increasing or the root category is reached.

[0038] After the above calculation is finished, a list of the categories with the highest inclusion rates and content rates is shown to the user together with those indexes (FIG. 12). Higher-level and lower-level concepts can also be shown at the user's request. Depending on the user's preference, the results can be presented with priority given to the content rate, to the inclusion rate, or to a weighted combination of the two.
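One possible way to order the results by user preference is a weighted sum of the two rates. The specification does not state how the weighting is performed, so the linear formula, the default weights, and the function name `rank` below are assumptions chosen only to illustrate the idea; the sample rates are hypothetical.

```python
def rank(results, w_inclusion=0.5, w_content=0.5, top=5):
    """Sort {category: (inclusion, content)} by a weighted score, best first.

    The linear weighting is an assumed scoring rule, not one taken from
    the specification.
    """
    def score(item):
        inclusion, content = item[1]
        return w_inclusion * inclusion + w_content * content

    return sorted(results.items(), key=score, reverse=True)[:top]


# Hypothetical rates for three categories.
results = {"A": (100.0, 75.0), "A1": (66.7, 100.0), "A2": (33.3, 50.0)}
print([name for name, _ in rank(results)])               # ['A', 'A1', 'A2']
print([name for name, _ in rank(results, 0.0, 1.0)][0])  # A1
```

Setting one weight to zero reproduces the pure "inclusion first" or "content first" orderings, and intermediate weights give the blended presentation.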

[0039] The above embodiment has been described taking application to directory services on the Internet and in database systems as an example, but the present invention is not limited to this. It can be applied as appropriate to other applications that search for similar categories, specifically in fields such as intranets, information management, index services, database management, information collection, and search engines.

[0040]

[Effects of the Invention] As described above, the present invention makes it possible, for information classified into categories, to retrieve similar categories efficiently by giving an example. On the Internet, for instance, a user can start from the URL information that the user has collected and efficiently find directory services that organize information into similar categories, together with those categories. As a result, new information can be discovered on the basis of categories, which improves efficiency, and because the gathered information is already categorized, it is easy to organize.

[Brief description of the drawings]

[FIG. 1] A block diagram showing the schematic configuration of one embodiment of an information search/management apparatus to which the information search/management method according to the present invention is applied.

[FIG. 2] A diagram showing the relationship between the inclusion rate and content rate and the categories.

[FIG. 3] A flowchart showing the information collection algorithm (collection process).

[FIG. 4] A flowchart showing the information collection algorithm (category collection process).

[FIG. 5] A diagram showing the structure of the category record table.

[FIG. 6] A diagram showing the structure of the information entry record table.

[FIG. 7] A flowchart showing the information search algorithm (similar category search process).

[FIG. 8] A flowchart showing the information search algorithm (optimal similar category search process).

[FIG. 9] A diagram showing the stored state of the collected category information.

[FIG. 10] A diagram for explaining the search, within the stored category information, for categories containing C1, C2, and C4.

[FIG. 11] A diagram showing the process of calculating the similarity index.

[FIG. 12] A diagram showing the category search results.

[FIG. 13] A block diagram showing the configuration of a conventional information search device.

[Explanation of symbols]

1: information search device
11: information collection unit
13: information storage unit
13a: index information database
15: information search unit

Claims

[Claims]

1. An information search/management method for presenting a predetermined directory service provider and category, wherein, given an example, a similar category is retrieved from among the categories provided by directory services.

2. An information search/management method for presenting an optimal directory service provider and category when directory services are provided by each of a plurality of providers, the method comprising: a step of collecting category information from each of the plurality of providers in advance and recording and accumulating it; and a step of, when an example is given, calculating a similarity from the example and the accumulated category information, and presenting the optimal directory service provider and category based on the similarity.

3. An information search/management apparatus for presenting a predetermined directory service provider and category when directory services are provided by each of a plurality of providers, the apparatus comprising: information collecting means for collecting category information from the plurality of providers; information storing means for recording and accumulating the category information collected by the information collecting means; and information search means for, when an example is given, calculating a similarity from the example and the category information accumulated in the information storing means, and presenting a predetermined directory service provider and category based on the similarity.

4. The information search/management apparatus according to claim 3, wherein the information storing means comprises a category record table and an information entry record table.

5. The information search/management apparatus according to claim 3, wherein the similarity is an index expressed by an inclusion rate and a content rate.