JP2006164086A - Online knowledge search support system and online knowledge search support method

Info

Publication number: JP2006164086A
Application number: JP2004357483A
Authority: JP
Inventors: Hisanobu Matsuoka; 寿延松岡; Tsuneko Kura; 恒子倉; Noriyasu Arakawa; 則泰荒川; Yasuhisa Kato; 泰久加藤
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2004-12-10
Filing date: 2004-12-10
Publication date: 2006-06-22

Abstract

PROBLEM TO BE SOLVED: To help an end user extract, from the category information that classifies entries and the acquaintance information held by a Weblog server 1, personal connections whose members value one another and place a certain trust in one another with respect to the target knowledge, and to acquire reliable knowledge from the content of the Weblogs belonging to those connections.

SOLUTION: A knowledge resource collection apparatus 2 collects Weblog content from the Weblog server 1 and extracts each set of Weblogs that are linked by acquaintanceship and share common category information as the personal connections familiar with that category. A knowledge resource search support server 3 searches the Weblog content belonging to the connections familiar with the search target category for content belonging to that category, synonyms included, and visualizes the search results on an end user terminal 4, classified by per-category connections, Weblog, time series, mutual comment relations, mutual trackback relations, or the like.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to online knowledge search technology based on acquaintance information and category information, and more particularly to technology that supports online knowledge search by extracting weblog communities that accumulate knowledge about a specific category on a network.

In conventional online knowledge search technology, methods for narrowing general search results based on keywords and page rank down to the desired knowledge include:
(1) narrowing search results based on topic extraction by natural language processing (see, for example, Patent Document 1); and
(2) narrowing search results by collaborative filtering (see, for example, Patent Document 2).

In narrowing method (1), content that mentions the desired knowledge is generally extracted by natural language processing, including morphological analysis and syntactic analysis.

In narrowing method (2), within a finite set of users participating in collaborative filtering, preference information is extracted by analyzing the preferences, bookmarks, and the like registered as part of each user profile. Based on this and on each participating user's search history, the search results are narrowed using the search histories of the finite number of participating users whose preferences resemble those of the searcher.

A weblog server typically stores weblog content consisting of the entries that a weblog author registers day by day, the comments that weblog readers attach to those entries, and trackbacks indicating that an entry related to a given entry has been registered on a different weblog. It also holds category information that classifies the entries by topic, and acquaintance information in which metadata such as the weblog URLs of the author's acquaintances on the Internet is described in a format such as FOAF (see, for example, Non-Patent Document 1). The present invention uses these to support online knowledge search.

[Patent Document 1]
JP 2001-325272 A
[Patent Document 2]
JP 2000-331020 A
[Non-Patent Document 1]
"FOAF Vocabulary Specification", Dan Brickley & Libby Miller, updated 2004/09/02, [retrieved 2004/10/19], Internet URL <http://xmlns.com/foaf/0.1/>

In the prior art described above, method (1) can extract content containing a description of the desired knowledge, but cannot guarantee the reliability of that description.

Method (2) has the problem that collaborative filtering among a limited number of end users makes it difficult to raise narrowing accuracy, and that increasing the number of participating end users is not easy either.

The present invention has been made in view of the above problems. Its object is to provide an online knowledge search support apparatus and method that use the category information classifying entries, which weblogs typically carry as metadata, and the acquaintance information described in a format such as FOAF to extract personal networks whose members evaluate one another and place a certain trust in one another with respect to the target knowledge, and that thereby help an end user acquire highly reliable knowledge from the content of the weblogs belonging to those networks.

To achieve the above object, the present invention as set forth in claim 1 and the other claims is characterized by the following apparatus and method.

(1) An online knowledge search support apparatus comprising a knowledge resource collection device and a knowledge search support server, wherein
the knowledge resource collection device comprises:
crawling means for collecting over the Internet the entries, comments, and trackbacks contained in general weblog content, the category information classifying the content of the entries, and acquaintance information including the weblog URLs and the like of the weblog author's acquaintances;
personal network information extraction means for extracting, from the acquaintance information contained in the weblog content collected by the crawling means, acquaintance relationships that share a common category, thereby extracting the network of acquaintances familiar with that category as per-category personal network information; and
knowledge resource extraction means for indexing the weblog content belonging to the personal network information in a form searchable by category and storing it as a knowledge resource; and
the knowledge search support server comprises knowledge resource search means that receives a search target category specified from browsing software, such as a web browser, on a general end user terminal, searches the knowledge resource for that category and its synonyms, and makes the weblogs, weblog authors, entries classified by time series and category information, comments, trackbacks, and per-category personal network information contained in the resulting weblog content displayable on the end user terminal after structural visualization.

(2) An online knowledge search support method that supports online knowledge search with a knowledge resource collection device and a knowledge search support server, wherein
the knowledge resource collection device performs:
a crawling step of collecting over the Internet the entries, comments, and trackbacks contained in general weblog content, the category information classifying the content of the entries, and acquaintance information including the weblog URLs and the like of the weblog author's acquaintances;
a personal network information extraction step of extracting, from the acquaintance information contained in the weblog content collected in the crawling step, acquaintance relationships that share a common category, thereby extracting the network of acquaintances familiar with that category as per-category personal network information; and
a knowledge resource extraction step of indexing the weblog content belonging to the personal network information in a form searchable by category and storing it as a knowledge resource; and
the knowledge resource search support server performs a knowledge resource search step of receiving a search target category specified from browsing software, such as a web browser, on a general end user terminal, searching the knowledge resource for that category and its synonyms, and making the weblogs, weblog authors, entries classified by time series and category information, comments, trackbacks, and per-category personal network information contained in the resulting weblog content displayable on the end user terminal after structural visualization.

As described above, in the present invention the knowledge resource collection device collects weblog content and extracts each set of weblogs that are related as acquaintances and share common category information as the personal network familiar with that category. The knowledge resource search support server searches the weblog content belonging to the personal network familiar with the search target category for the content belonging to that category, synonyms included, and can visualize and display the search results classified by per-category personal network, by weblog, by time series, by mutual comment relations, by mutual trackback relations, and so on. Consequently, using only the category information and acquaintance information that usually accompany weblogs as general metadata, the invention extracts personal networks whose members are well versed in the target knowledge and trust one another enough to maintain mutual acquaintance, regards the content of the weblogs belonging to such a network as highly reliable with respect to that knowledge, and can thus help the end user easily obtain the desired knowledge from that weblog content.

In addition, because the general metadata of weblogs, which are now becoming widespread, can be used as it is, existing weblog content can be used as a knowledge resource without modification.

This solves the conventional problem that content containing a description of the desired knowledge can be extracted but the reliability of that description cannot be guaranteed.

Furthermore, because existing weblog content can be used as a knowledge resource as it is, the invention also solves the problem that collaborative filtering among a limited number of end users makes it difficult to raise narrowing accuracy and that increasing the number of participating end users is not easy.

As explained above, according to the present invention, the knowledge resource collection device collects weblog content and extracts each set of weblogs that are related as acquaintances and share common category information as the personal network familiar with that category. The knowledge resource search support server searches the weblog content belonging to the personal network familiar with the search target category for the content belonging to that category, synonyms included, and makes the search results displayable on the end user terminal, visualized by classification such as per-category personal network, weblog, or time series. Consequently, using only the category information and acquaintance information that usually accompany weblogs as general metadata, the invention extracts personal networks whose members are well versed in the target knowledge and trust one another enough to maintain mutual acquaintance, regards the content of the weblogs belonging to such a network as highly reliable with respect to that knowledge, and can thus help the end user easily obtain the desired knowledge from that weblog content.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1 shows the configuration and processing flow of an online knowledge search support apparatus and method according to one embodiment of the present invention as set forth in claim 1 and the other claims.

The online knowledge search support apparatus and method shown in FIG. 1 use a weblog server 1 as the information source and consist of a knowledge resource collection device 2 and a knowledge resource search support server 3. The devices are connected to each other by a network, and are further connected via the Internet to general weblog servers 1 and general end user terminals 4.

The knowledge resource collection device 2 has: crawling means that collects, as cache information, weblog content registered on a general weblog server 1, including acquaintance information Foaf, category information CAT, entries ENT, comments CMT, and trackbacks TRBK; personal network information extraction means that extracts personal network information RSS for each category on the basis of the acquaintance information Foaf and the category information CAT; and knowledge resource extraction means that extracts from the weblog content the entries ENT, comments CMT, and trackbacks TRBK contained in the weblog of each weblog author belonging to the per-category personal network information RSS, and generates a knowledge resource by storing them in association with the per-category personal network information.

FIG. 2 shows an example of the structure of general weblog content. In general, an entry consists of text, image URLs, and the like classified by time series and category information, and corresponds to what is written in a diary. A comment is usually text that a weblog reader writes in response to an entry, and may include image URLs and the like. A trackback is a source entry related to a trackback target entry; it is referenced on the basis of the TrackBack ping received by the target weblog, and often resides on a weblog different from that of the target entry.
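The content structure described above can be pictured as a small data model. The following is only an illustrative sketch: the class and field names (`Weblog`, `Entry`, `Comment`, `Trackback`, `source_entry_url`, and so on) are assumptions chosen for this example and do not appear in the specification or in FIG. 2.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Comment:
    author: str  # weblog reader who wrote the comment
    text: str


@dataclass
class Trackback:
    source_entry_url: str  # related entry, usually on a different weblog


@dataclass
class Entry:
    url: str
    posted: date   # time-series classification
    category: str  # category information for the entry
    text: str
    comments: list[Comment] = field(default_factory=list)
    trackbacks: list[Trackback] = field(default_factory=list)


@dataclass
class Weblog:
    url: str
    author: str
    entries: list[Entry] = field(default_factory=list)

    def categories(self) -> set[str]:
        """Return every category used by at least one entry."""
        return {entry.category for entry in self.entries}


# Hypothetical sample data.
blog = Weblog(url="http://example.org/alice/", author="Alice")
blog.entries.append(
    Entry("http://example.org/alice/1", date(2004, 12, 1), "wine", "Tasting notes")
)
blog.entries[0].trackbacks.append(Trackback("http://example.org/bob/7"))
```

Keeping comments and trackbacks nested under their entry mirrors the description that both are attached to a specific entry.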

Category information is generally classification information indicating the category, genre, or topic to which an entry belongs. It is often set by the weblog author or the weblog server administrator, and on general weblog servers it often has no hierarchical structure.

Acquaintance information is usually set for each weblog in FOAF format. The weblog author's name and nickname, mail address or mail address hash value, weblog URL, the URL of the FOAF file itself, and so on are described in the prescribed XML under the <Person> tag, and references to the FOAF files of the author's acquaintances are described under the <Knows> tag. Where necessary, it also includes general metadata descriptions such as Dublin Core (see, for example, Non-Patent Document 2 below).
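As a rough illustration of how such acquaintance information might be read, the sketch below parses a small FOAF document with Python's standard `xml.etree.ElementTree`. The sample document, the example.org URLs, and the `parse_foaf` helper are assumptions made for this example. Note that the FOAF vocabulary spells the acquaintance property `foaf:knows`, and the use of `rdfs:seeAlso` to point at an acquaintance's FOAF file is a common convention assumed here, not something the specification states.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical FOAF file for one weblog author.
SAMPLE_FOAF = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Alice</foaf:name>
    <foaf:weblog rdf:resource="http://example.org/alice/"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Bob</foaf:name>
        <rdfs:seeAlso rdf:resource="http://example.org/bob/foaf.rdf"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""


def parse_foaf(xml_text):
    """Return the author's name, weblog URL, and acquaintances' FOAF URLs."""
    root = ET.fromstring(xml_text)
    person = root.find("foaf:Person", NS)
    weblog = person.find("foaf:weblog", NS)
    known = person.findall("foaf:knows/foaf:Person/rdfs:seeAlso", NS)
    return {
        "name": person.findtext("foaf:name", namespaces=NS),
        "weblog": weblog.get(RDF_RESOURCE) if weblog is not None else None,
        "knows": [k.get(RDF_RESOURCE) for k in known],
    }


profile = parse_foaf(SAMPLE_FOAF)
```

The `knows` list is what a crawler would follow to reach the acquaintances' weblogs.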

Here, the per-category personal network information RSS is the result of successively following the references to acquaintances' weblogs described with the <Knows> tag in each weblog's FOAF file and extracting, from the resulting group of weblogs, the set of weblogs that share common category information. As a simple example of implementation, the RSS 1.0 format (see, for example, Non-Patent Document 3 below) can be used: the common category information is described in the <description> tag of the <Channel> tag, and the reference URL of each FOAF file that shares the category information and is linked by <Knows> tags is described under an <item> tag. FIG. 3 shows an example of per-category personal network information.
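One possible reading of this extraction can be sketched as follows. The specification does not fix the exact grouping rule, so this example assumes that two weblogs join a category's network when one lists the other as an acquaintance and both use that category. The function name `extract_networks` and the sample data are hypothetical, and serialization to RSS 1.0 is left out.

```python
from collections import defaultdict


def extract_networks(knows, categories):
    """Group weblogs by each category shared with a direct acquaintance.

    knows:      {weblog URL: set of acquaintance weblog URLs}
    categories: {weblog URL: set of category names used by that weblog}
    """
    networks = defaultdict(set)
    for blog, friends in knows.items():
        for friend in friends:
            shared = categories.get(blog, set()) & categories.get(friend, set())
            for category in shared:
                networks[category].update((blog, friend))
    return dict(networks)


# Hypothetical acquaintance links and categories.
knows = {
    "http://example.org/alice/": {"http://example.org/bob/"},
    "http://example.org/bob/": {"http://example.org/carol/"},
    "http://example.org/carol/": set(),
}
categories = {
    "http://example.org/alice/": {"wine", "travel"},
    "http://example.org/bob/": {"wine"},
    "http://example.org/carol/": {"wine", "cooking"},
}
networks = extract_networks(knows, categories)
```

In this sample only "wine" yields a network, because "travel" and "cooking" are each used by a single weblog and so are not shared across any acquaintance link.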

Returning to FIG. 1, the knowledge resource is, for example, a database storing a cache of weblog content. It holds the interrelations among the entries ENT, comments CMT, and trackbacks TRBK that make up the weblog content, the category information CAT to which each entry ENT belongs, the weblog and weblog author to which each entry belongs, the per-category personal network information RSS, and so on, and can be searched using any of these as a key.
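A minimal in-memory sketch of such a multi-key store is shown below. It stands in for the database mentioned above; the `KnowledgeResource` class, its method names, and the record fields are assumptions made for illustration only.

```python
from collections import defaultdict


class KnowledgeResource:
    """Toy store that indexes each record under several lookup keys."""

    KEYS = ("category", "weblog", "author", "network")

    def __init__(self):
        self.records = []
        # index[key][value] -> positions of matching records
        self.index = defaultdict(lambda: defaultdict(list))

    def add(self, record):
        position = len(self.records)
        self.records.append(record)
        for key in self.KEYS:
            self.index[key][record[key]].append(position)

    def find(self, key, value):
        return [self.records[i] for i in self.index[key].get(value, [])]


# Hypothetical records.
resource = KnowledgeResource()
resource.add({"category": "wine", "weblog": "http://example.org/alice/",
              "author": "Alice", "network": "wine", "entry": "Tasting notes"})
resource.add({"category": "wine", "weblog": "http://example.org/bob/",
              "author": "Bob", "network": "wine", "entry": "Cellar tips"})
wine_hits = resource.find("category", "wine")
```

A real deployment would more likely use a relational or full-text database; the point here is only that the same record is reachable by category, weblog, author, or personal network.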

The knowledge resource search support server 3 has knowledge resource search means that resolves synonyms for the search target category requested from the end user terminal 4 using a synonym dictionary, searches the knowledge resource extracted by the knowledge resource extraction means for the set of entries ENT, comments CMT, and trackbacks TRBK indexed in association with that category, and returns the search results, classified in association with the personal network information RSS, in a format displayable on the end user terminal 4, for example HTML displayable in a WWW browser. The search target category is, for example, a combination of multiple keywords.
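The synonym resolution and category lookup can be sketched as below. The synonym dictionary contents, the `normalize` and `search` helpers, and the choice to match a record when any keyword matches its category are all assumptions for this example; the specification does not say how multiple keywords are combined.

```python
# Hypothetical synonym dictionary: variant -> canonical category name.
SYNONYMS = {"vino": "wine", "wines": "wine", "trip": "travel"}


def normalize(keyword):
    """Map a keyword to its canonical category via the synonym dictionary."""
    key = keyword.strip().lower()
    return SYNONYMS.get(key, key)


def search(records, keywords):
    """Return records whose category matches any synonym-resolved keyword."""
    wanted = {normalize(keyword) for keyword in keywords}
    return [record for record in records if record["category"] in wanted]


# Hypothetical knowledge resource contents.
KNOWLEDGE = [
    {"category": "wine", "entry": "Tasting notes"},
    {"category": "travel", "entry": "Kyoto in autumn"},
    {"category": "cooking", "entry": "Weeknight curry"},
]
hits = search(KNOWLEDGE, ["Vino", "trip"])
```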

Next, the operation of the online knowledge search support apparatus and method of this embodiment, configured as described above, is explained with reference to step numbers S1 to S4 shown in FIG. 1.

First, the knowledge resource collection device 2 collects weblog content, including FOAF-format acquaintance information Foaf, category information CAT, entries ENT, comments CMT, and trackbacks TRBK, from general weblog servers 1 that are connected to the Internet and publicly accessible. It likewise collects the weblog content of the weblogs written by each weblog author's acquaintances, as described in the <Knows> tags of each FOAF file, and stores all of it as cache 2A (step S1).
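Step S1 amounts to a graph traversal over acquaintance links. The sketch below uses a breadth-first walk with an injected `fetch` function so that it runs without network access; `crawl`, `fetch`, `max_blogs`, and the `FAKE_WEB` data are hypothetical names introduced for this example, and the traversal order is an assumption, since the specification does not prescribe one.

```python
from collections import deque


def crawl(seed_urls, fetch, max_blogs=100):
    """Collect weblog content by following acquaintance links breadth-first.

    fetch(url) must return a dict with a "knows" list, or None on failure.
    """
    cache = {}
    queue = deque(seed_urls)
    while queue and len(cache) < max_blogs:
        url = queue.popleft()
        if url in cache:
            continue  # already collected
        content = fetch(url)
        if content is None:
            continue  # unreachable weblog, skip it
        cache[url] = content
        queue.extend(k for k in content["knows"] if k not in cache)
    return cache


# Hypothetical in-memory stand-in for weblog servers.
FAKE_WEB = {
    "http://example.org/alice/": {"knows": ["http://example.org/bob/"]},
    "http://example.org/bob/": {
        "knows": ["http://example.org/alice/", "http://example.org/carol/"]
    },
    "http://example.org/carol/": {"knows": []},
    "http://example.org/dave/": {"knows": []},
}
cache_2a = crawl(["http://example.org/alice/"], FAKE_WEB.get)
```

The weblog at `http://example.org/dave/` is never collected because no acquaintance link leads to it from the seed.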

Next, within cache 2A, the knowledge resource collection device 2 extracts each set of weblogs that are linked by FOAF <Knows> references and contain the same category information as the personal network information RSS 2B for that category (step S2). It also stores the weblog content in cache 2A that belongs to that personal network information RSS in association with the per-category personal network information RSS, forming the per-category knowledge resource 2C (step S3).

Next, on receiving multiple keywords as the search target category from the end user terminal 4, the knowledge resource search support server 3 resolves synonyms for each keyword using the synonym dictionary and searches the knowledge resource 2C extracted by the knowledge resource extraction means for the weblog content associated with those keywords. It then visualizes, in HTML as a tree or graph structure, the weblogs, weblog authors, entries ENT classified by time series and category information, comments CMT, and trackbacks contained in the resulting weblog content, together with the per-category personal network information RSS described in the RSS format, and returns the result to the end user terminal 4 (step S4).
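The tree-structured HTML output of step S4 might look like the nested list produced by the sketch below. The grouping order (category, then weblog, then date), the `render_tree` function, and the record fields are assumptions for illustration; the specification only states that a tree or graph structure is rendered in HTML.

```python
from html import escape
from itertools import groupby


def render_tree(results):
    """Render results as nested <ul> lists: category > weblog > entries."""
    rows = sorted(results, key=lambda r: (r["category"], r["weblog"], r["posted"]))
    out = ["<ul>"]
    for category, by_category in groupby(rows, key=lambda r: r["category"]):
        out.append(f"<li>{escape(category)}<ul>")
        for weblog, by_weblog in groupby(by_category, key=lambda r: r["weblog"]):
            out.append(f"<li>{escape(weblog)}<ul>")
            for row in by_weblog:
                out.append(f"<li>{escape(row['posted'])}: {escape(row['title'])}</li>")
            out.append("</ul></li>")
        out.append("</ul></li>")
    out.append("</ul>")
    return "\n".join(out)


# Hypothetical search results.
results = [
    {"category": "wine", "weblog": "http://example.org/bob/",
     "posted": "2004-12-02", "title": "Cellar tips"},
    {"category": "wine", "weblog": "http://example.org/alice/",
     "posted": "2004-12-01", "title": "Tasting <b>notes</b>"},
]
page = render_tree(results)
```

Escaping every value with `html.escape` keeps markup inside weblog titles from being interpreted by the end user's browser.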

[Non-Patent Document 2]
"Dublin Core Metadata Element Set, Version 1.1: Reference Description", updated 2003/06/02, [retrieved 2004/10/28], Internet URL <http://dublincore.org/documents/dces/>
[Non-Patent Document 3]
"RDF Site Summary (RSS) 1.0", Gabe Beged-Dov & Dan Brickley et al., updated 2001/05/30, [retrieved 2004/10/29], Internet URL <http://web.resource.org/rss/1.0/spec>

FIG. 1 is a diagram showing the configuration and processing flow of an online knowledge search support apparatus and method according to one embodiment of the present invention.
FIG. 2 is a diagram showing an example of the structure of general weblog content.
FIG. 3 is a diagram showing an example of per-category personal network information in one embodiment of the present invention.

Explanation of symbols

1 Weblog server
2 Knowledge resource collection device
3 Knowledge resource search support server
4 End user terminal

Claims

An online knowledge search support apparatus comprising a knowledge resource collection device and a knowledge search support server, wherein
the knowledge resource collection device comprises:
crawling means for collecting over the Internet the entries, comments, and trackbacks contained in general weblog content, the category information classifying the content of the entries, and acquaintance information including the weblog URLs and the like of the weblog author's acquaintances;
personal network information extraction means for extracting, from the acquaintance information contained in the weblog content collected by the crawling means, acquaintance relationships that share a common category, thereby extracting the network of acquaintances familiar with that category as per-category personal network information; and
knowledge resource extraction means for indexing the weblog content belonging to the personal network information in a form searchable by category and storing it as a knowledge resource; and
the knowledge search support server comprises knowledge resource search means that receives a search target category specified from browsing software, such as a web browser, on a general end user terminal, searches the knowledge resource for that category and its synonyms, and makes the weblogs, weblog authors, entries classified by time series and category information, comments, trackbacks, and per-category personal network information contained in the resulting weblog content displayable on the end user terminal after structural visualization.

An online knowledge search support method that supports online knowledge search with a knowledge resource collection device and a knowledge search support server, wherein
the knowledge resource collection device performs:
a crawling step of collecting over the Internet the entries, comments, and trackbacks contained in general weblog content, the category information classifying the content of the entries, and acquaintance information including the weblog URLs and the like of the weblog author's acquaintances;
a personal network information extraction step of extracting, from the acquaintance information contained in the weblog content collected in the crawling step, acquaintance relationships that share a common category, thereby extracting the network of acquaintances familiar with that category as per-category personal network information; and
a knowledge resource extraction step of indexing the weblog content belonging to the personal network information in a form searchable by category and storing it as a knowledge resource; and
the knowledge resource search support server performs a knowledge resource search step of receiving a search target category specified from browsing software, such as a web browser, on a general end user terminal, searching the knowledge resource for that category and its synonyms, and making the weblogs, weblog authors, entries classified by time series and category information, comments, trackbacks, and per-category personal network information contained in the resulting weblog content displayable on the end user terminal after structural visualization.